-
Notifications
You must be signed in to change notification settings - Fork 291
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lazy-adr: Add Data Availability library #170
Conversation
- provide context - notes / todos
f661921
to
1e95a71
Compare
### A Note on IPFS/IPLD | ||
|
||
In IPFS all data is _content addressed_ which basically means the data is identified by its hash. | ||
Particularly, in the LazyLedger case, the root CID identifies the Namespaced Merkle tree including all its contents (inner and leaf nodes). | ||
This means that if a `GetLeafData` request succeeds, the retrieved leaf data is in fact the leaf data in the tree. | ||
We do not need to additionally verify Merkle proofs per leaf as this will essentially be done via IPFS on each layer while | ||
resolving and getting to the leaf data. | ||
|
||
> TODO: validate this assumption and link to code that shows how this is done internally |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or do we want to explicitly verify proofs either way? To not rely on the fact that ipfs in combination with our plugin handles this correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's safer and more idiot-proof if GetLeafData
only succeeds if the proof is valid. Anyway, I thought it would only succeed with IPFS is the proof is valid, with the custom hasher?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Anyway, I thought it would only succeed with IPFS is the proof is valid, with the custom hasher?
Yes, that is my understanding as well. For every retrieved leaf, the proof nodes should also be resolved and validated on its path down.
// The context can be used to provide a timeout. | ||
// TODO: Should there be a constant = lower bound for #samples | ||
func ValidateAvailability( | ||
ctx contex.Context, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider moving these Context
objects lower down the stack too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work!
````go | ||
// This constructs an IPFS node instance | ||
node, _ := core.NewNode(ctx, nodeOptions) | ||
// This attaches the Core API to the constructed node |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if you don't attach a core API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could also pass around the node object directly, or simply the DAG
field's ipld.DAGService
. In the former case it would just be less pluggable (as we are passing around a concrete object instead of an interface).
// to process the leaf data the moment it was validated. | ||
// The context can be used to provide a timeout. | ||
// TODO: Should there be a constant = lower bound for #samples | ||
func ValidateAvailability( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could block for a few minutes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is definitely something that should be done asynchronously.
// Specifically all steps of the the protocol described in section | ||
// _5.2 Random Sampling and Network Block Recovery_ are carried out. | ||
// | ||
// In more detail it will first create numSamples random unique coordinates. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: add that the domain for coordinates can excludes parts of the original data square (and extended rows!) based on the number of "real" shares in a block, i.e. the availableDataOriginalSharesUsed
field in the header https://github.com/lazyledger/lazyledger-specs/blob/10732d7a258a0b64dfccf96fd863830faca73ce3/specs/data_structures.md#header
// to process the leaf data the moment it was validated. | ||
// The context can be used to provide a timeout. | ||
// TODO: Should there be a constant = lower bound for #samples | ||
func ValidateAvailability( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, this is definitely something that should be done asynchronously.
// Note, that this method could also return the row and column roots. | ||
// Tha caller is responsible for making sure that the leaves are sorted by namespace ID. | ||
// The data will be pinned by default. | ||
func PutLeaves(ctx contex.Context, namespacedLeaves [][]byte) error |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're going to be passing the IPFS node object, then I think PutLeaves
will need a format.NodeAdder
argument, as it will not have access to the IPFS node object.
I'm not sure if defining the API perfectly is in the scope of this ADR, but I think the API will have to change to accommodate some form of access to the IPFS node object if we're passing it around. If we use one of the alternatives mentioned, then the API specified still might have to be modified slightly, but that depends on which alternative we end up going with. |
Co-authored-by: John Adler <[email protected]>
Co-authored-by: John Adler <[email protected]>
…yledger-core into ismail/da_lib_adr
I posted a draft for the writing portion of this ADR at #178. The PR incorporates |
I mostly created this ADR such that we have a basis on which to discuss API alternatives. If no major shortcomings in the ADR, we can use it as a blueprint to start the implementation. It is definitely not complete and it won't be until we wrapped up the implementation. I do not think it is realistic to define the APIs perfectly without drafting at least "spikes" / "tracer bullet" implementations. Actually, my suggestion is to merge this if it is sound and sane from a high-level pov. Because then, you and I can make the modifications on portions of the ADR while we work on the implementation in separate branches. The moment, we want to merge a PR that implements a part of this, we need to make sure the ADR matches the implementation API (and we changed the status from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, any changes needed during implementation can be made then. Two 👍 from me. 🚀
OK, let's merge this then! I've captured a bunch of smaller todos here: #179 |
Description
This PR will contain the "DA lib" accompanied with an ADR describing the design as well as providing some context.
Rendered ADR: https://github.com/lazyledger/lazyledger-core/blob/ismail/da_lib_adr/docs/lazy-adr/adr-002-ipds-da-sampling.md
related to: #85, #163