adr: mvp light client #311
Conversation
8f39ec5 to 9b18942 (Compare)
- clarify we need sequential verification
- details on LightBlock changes
Converted back to draft as I want to be a bit more precise on the IPFS integration for the DAS. Please start reviewing as this should not affect the existing sections.
Co-authored-by: John Adler <[email protected]>
A tracer implementation helping to inform this ADR happened here btw: https://github.com/lazyledger/lazyledger-core/compare/ismail/light-mvp

The "only" missing part there is to initialize and run an ipfs node to be able to pass in the CoreAPI object to sample for shares (i.e. to validate for availability). This kinda depends on #314. Other than that, it requires more testing. The tracer will become the actual implementation by adding the ipfs nodes and tests.
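For the record, a minimal sketch of that missing piece, assuming the embedded go-ipfs node APIs (repo initialization and plugin loading are omitted, and how the CoreAPI is handed to the sampling code is only illustrative):

```go
package main

import (
	"context"
	"log"

	"github.com/ipfs/go-ipfs/core"
	"github.com/ipfs/go-ipfs/core/coreapi"
)

func main() {
	ctx := context.Background()

	// Build an embedded IPFS node. A real setup would also initialize/open an
	// fsrepo and load the plugins (including our IPLD plugin); omitted here.
	node, err := core.NewNode(ctx, &core.BuildCfg{Online: true})
	if err != nil {
		log.Fatal(err)
	}

	// Wrap the node in the CoreAPI interface the sampling code expects.
	api, err := coreapi.NewCoreAPI(node)
	if err != nil {
		log.Fatal(err)
	}

	// api would then be passed into the light client so it can retrieve
	// individual shares (e.g. via api.Dag().Get(ctx, cid)) for DAS.
	_ = api
}
```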
Co-authored-by: John Adler <[email protected]>
Pls tell me if help is needed to accelerate implementation
#### RPC

No changes to the RPC endpoints are strictly required.
However, for convenience and ease of use, we should either add the `DAHeader` to the existing [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) endpoint, or introduce a new endpoint to retrieve the `DAHeader` on demand for a given height or block hash.
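For illustration only, a new endpoint could look roughly like the sketch below; the route name, result type, and the placeholder helper are assumptions, not part of the ADR:

```go
package core

import (
	rpctypes "github.com/lazyledger/lazyledger-core/rpc/jsonrpc/types"
	"github.com/lazyledger/lazyledger-core/types"
)

// ResultDataAvailabilityHeader is a hypothetical result type pairing the
// DAHeader with the height it belongs to.
type ResultDataAvailabilityHeader struct {
	Height   int64                         `json:"height"`
	DAHeader *types.DataAvailabilityHeader `json:"da_header"`
}

// DataAvailabilityHeader is a hypothetical handler that would be registered
// next to Commit in rpc/core/routes.go. If heightPtr is nil, the latest
// height would be used, mirroring the other endpoints.
func DataAvailabilityHeader(ctx *rpctypes.Context, heightPtr *int64) (*ResultDataAvailabilityHeader, error) {
	dah, height, err := loadOrRecomputeDAHeader(heightPtr)
	if err != nil {
		return nil, err
	}
	return &ResultDataAvailabilityHeader{Height: height, DAHeader: dah}, nil
}

// loadOrRecomputeDAHeader is a placeholder: for the MVP it would load the
// block from the block store and recompute the DAHeader from its data; once
// the DAHeader is stored (see the section below), it would be loaded directly.
func loadOrRecomputeDAHeader(heightPtr *int64) (*types.DataAvailabilityHeader, int64, error) {
	// ...resolve height, load block, return its DAHeader...
	return nil, 0, nil
}
```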
I think introducing a separate endpoint for DAHeader is unnecessary:
- It will be included in other data types soon.
- In the end, we will rely mostly on `Header.DataHash` instead.
- Redundantly requesting it separately is obviously not a good idea from a long-living subscriptions POV, though subscribing to the whole `EventNewBlock` is also a bloated approach in some use-cases, and `EventNewBlockHeader` does not contain the `DAHeader`.
> It will be included in other data types soon.

None of the existing RPC responses a tendermint light client requests will contain the DAHeader though (the BlockID in #312 is only temporary).

> In the end, we will rely mostly on Header.DataHash instead.

True! "The end" is near but still some weeks away at least. That is why this ADR doesn't really mention this.
We should write an ADR for going from DataHash -> DAHeader first.

> Redundantly requesting it separately is obviously not a good idea from long-living subscriptions POV, though subscribing for the whole EventNewBlock is also a bloated approach in some use-cases and EventNewBlockHeader does not contain DAHeader.

The light client itself doesn't use the event system. Are you talking about the future where the light clients will subscribe to headers as you described in #86 (comment)?
> Are you talking about the future where the light clients will subscribe to headers as you described in #86 (comment)?

Nope. I was talking about the EventBus within ll-core. The light client can subscribe to new things through RPC with the EventBus as a backbone.
From my understanding, EventNewBlock and EventNewBlockHeader are the most used RPC endpoints/subscriptions for light clients to stay on track with the network, and doing separate requests for the DAHeader to perform DAS on every subscription message does not make sense to me.
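For context, this is roughly what such a subscription looks like from the client side today (a sketch against the RPC client in this repo; the address and subscriber name are placeholders):

```go
package main

import (
	"context"
	"log"

	rpchttp "github.com/lazyledger/lazyledger-core/rpc/client/http"
	"github.com/lazyledger/lazyledger-core/types"
)

func main() {
	// Connect to a full node's RPC endpoint (address is an example).
	c, err := rpchttp.New("tcp://localhost:26657", "/websocket")
	if err != nil {
		log.Fatal(err)
	}
	if err := c.Start(); err != nil {
		log.Fatal(err)
	}
	defer c.Stop() //nolint:errcheck

	ctx := context.Background()

	// Subscribe to new block headers. Note: EventNewBlockHeader does not
	// carry the DAHeader, which is exactly the concern above.
	headers, err := c.Subscribe(ctx, "das-light-client", types.EventQueryNewBlockHeader.String())
	if err != nil {
		log.Fatal(err)
	}

	for ev := range headers {
		h := ev.Data.(types.EventDataNewBlockHeader).Header
		log.Printf("new header at height %d, DataHash=%X", h.Height, h.DataHash)
		// A DAS light client would need an extra request here to get the
		// DAHeader (or resolve it from DataHash) before sampling.
	}
}
```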
@liamsi, so you didn't rely on the DAHeader in BlockID because it is temporary, but you did rely on the DAHeader instead of DataHash because it is weeks away? 🤔 Kinda controversial 😄, but I understand that it was not clear before that DataHash can be dropped in instead of DAHeader, so I am ok with that.
> @liamsi, so you didn't rely on the DAHeader in BlockID because it is temporary, but you did rely on the DAHeader instead of DataHash because it is weeks away? 🤔 Kinda controversial 😄,

What exactly is controversial about it, and what are you suggesting to do about it? Please always include constructive suggestions when you criticize something. Otherwise, it is not clear what improvements could be made here.

Just to clarify: everything in this ADR is written in a way that a) does not depend on any feature that is not clearly defined or implemented yet, and b) does not introduce any technical debt.
I don't really see why it is harmful, but we can (very) easily remove the additional RPC endpoint in the near future if we wanted to.
It is clear that the light client could simply make do with the data hash, but: 1) how to do this via IPFS is not defined nor implemented yet, and 2) we wouldn't have been able to ship a DAS light client as quickly if we had waited on that.

> The light client can subscribe to new things through RPC with EventBus as a backbone.

That is a valuable and cool observation, but it's still not how the current default light client operates: basically, the RPC endpoint of the light client "drives" the light client (it gets updated on demand if you e.g. request anything with a newer height from the light client's RPC).
That said, for future light client implementations, we could use the event bus. If you want, you can write an ADR for that (it's not urgent, but we should write ADRs for the different variants of light clients we want; another one is #86 for instance).
Does this make sense?
#### Store DataAvailabilityHeader

For full nodes to be able to serve the `DataAvailabilityHeader` without having to recompute it each time, it needs to be stored somewhere.
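One possible shape for this, sketched as a small extension next to the block store; the method names, key scheme, and JSON encoding are assumptions made only to keep the sketch short:

```go
package store

import (
	"encoding/json"
	"fmt"

	dbm "github.com/tendermint/tm-db"

	"github.com/lazyledger/lazyledger-core/types"
)

// daHeaderKey is a hypothetical key scheme; real code would follow the
// existing block store key layout.
func daHeaderKey(height int64) []byte {
	return []byte(fmt.Sprintf("DAH:%v", height))
}

// SaveDAHeader persists the DAHeader next to the block it belongs to, so it
// can later be served without re-erasure-coding the block data. JSON is used
// only to keep the sketch short; a real implementation would use the protobuf
// representation like the rest of the store.
func SaveDAHeader(db dbm.DB, height int64, dah *types.DataAvailabilityHeader) error {
	bz, err := json.Marshal(dah)
	if err != nil {
		return err
	}
	return db.SetSync(daHeaderKey(height), bz)
}

// LoadDAHeader returns the stored DAHeader for the given height, or nil if it
// was never stored (e.g. for blocks produced before this change).
func LoadDAHeader(db dbm.DB, height int64) (*types.DataAvailabilityHeader, error) {
	bz, err := db.Get(daHeaderKey(height))
	if err != nil {
		return nil, err
	}
	if len(bz) == 0 {
		return nil, nil
	}
	dah := new(types.DataAvailabilityHeader)
	if err := json.Unmarshal(bz, dah); err != nil {
		return nil, err
	}
	return dah, nil
}
```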
Seems like we need to store the `DataAvailabilityHeader` not to avoid recomputing it, but to actually access the block's data. You can't access the block's data without the `DAHeader`, thus you can't recompute it.
Things would have been so much easier if we had just accessed the block's data by `Header.DataHash` in the first place. We then would not need to store the DAHeader in both full and light node stores, but only the Header, as planned.
> Seems like we need to store DataAvailabilityHeader not recompute it [...]

That is what the text in the ADR says 🤔 With the existing code we store the block data but not the DAHeader, and hence full nodes can recompute it on demand. That is far from optimal, but it works for the MVP.

> Things would have been so much easier if we just accessed block's data by Header.DataHash in the first place.

Well, we can still do that. It's a bit orthogonal to this ADR I guess?

> We then would not need to store DAHeader in both Full and Light node stores, but only Header, as planned.

Wouldn't it still make sense to store the DAHeader, especially for nodes that do not have the full block data?
> Well, we can still do that. It's a bit orthogonal to this ADR I guess?

Yep, that's just a retrospective look at how things could have been done better. Nothing against this PR.
> Wouldn't it still make sense to store the DAHeader, especially for nodes that do not have the full block data?

If you have the DataHash, you can access the full DAHeader and thus do anything you would do with the DAHeader, so storing/treating it specially is not needed, as it would be stored somewhere on the network.
> as it would be stored somewhere on the network.

That still means nodes will store the DAHeader in their local data stores. I guess the question here is: should we additionally store the DAHeader separately or not? This rather plays into ADR 001 and #182. I would argue that most nodes will need to get the DAHeader, and it could be useful to provide a shortcut by storing it as one blob instead of in IPLD only (even if that is redundant). But maybe that is not necessary.
We will see after we (you?) have written the ADR for "DataHash -> DAHeader" and after we have implemented this feature. Then we can measure latencies and decide.
##### Provider

The [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) should be changed to additionally provide the `DataAvailabilityHeader` to enable DAS light clients.
Implementations of the interface additionally need to retrieve the `DataAvailabilityHeader` for the [modified LightBlock](#lightblock).
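For reference, the modified `LightBlock` would then look roughly like this (the exact field name and JSON tag are illustrative):

```go
package types

// LightBlock as needed by DAS light clients: the existing SignedHeader and
// ValidatorSet, extended with the DataAvailabilityHeader required for
// sampling.
type LightBlock struct {
	*SignedHeader `json:"signed_header"`
	ValidatorSet  *ValidatorSet `json:"validator_set"`

	// DataAvailabilityHeader holds the row/column roots of the erasure-coded
	// block data; it can stay nil for light clients that do not perform DAS.
	DataAvailabilityHeader *DataAvailabilityHeader `json:"data_availability_header,omitempty"`
}
```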
Or they can get it from `ResultCommit.SignedHeader.Commit.BlockID.DAHeader` if #304 lands before the MVP, so as not to add additional request overhead.
> if #304 lands before MVP,

You mean #312?

Yeah, I intentionally didn't want to rely on #312. I can't think of a use-case where the DAHeader would be requested on its own, but I guess it certainly won't harm if it can be requested. We can remove the RPC endpoint / the additional method in the provider later if we wanted. It does not introduce any technical debt.
> You mean #312?

Yes
> I can't think of a use-case where the DAHeader would be requested on its own, but I guess it certainly won't harm if it can be requested. We can remove the RPC endpoint / the additional method in the provider later if we wanted. It does not introduce any technical debt.

Ok
We could either augment the `LightBlock` method with a flag, add a new method solely for providing the `DataAvailabilityHeader`, or introduce a new method for DAS light clients.

The latter is preferable because it is the most explicit and clear, and it keeps places where DAS is not used free of any code changes.
Why not make DA the one and only default case for LL? The argument that it is clear and explicit works only if there is a need to keep both ways to get a LightBlock, but why do we need both ways at the same time?
> but why do we need both ways at the same time?

Some light clients will still want to be able to run a tendermint light client as before. They do not need the DAHeader.
Good point about the default though.
Alternatively, with the exact same result, we could embed the existing `Provider` into a new interface, e.g. `DASProvider`, that adds this method.
This is completely equivalent to the above; which approach is better will become clearer when we spend more time on the implementation.

Regular light clients will call `LightBlock` and DAS light clients will call `DASLightBlock`.
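A sketch of the embedding variant, using the method name mentioned above (the exact signatures are still open):

```go
package provider

import (
	"context"

	"github.com/lazyledger/lazyledger-core/types"
)

// Provider mirrors the existing light client provider interface.
type Provider interface {
	// LightBlock returns the LightBlock at the given height (0 means latest).
	LightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
	// ReportEvidence reports misbehaviour observed by the light client.
	ReportEvidence(context.Context, types.Evidence) error
}

// DASProvider embeds Provider and adds the method DAS light clients use.
type DASProvider interface {
	Provider

	// DASLightBlock behaves like LightBlock but additionally populates the
	// DataAvailabilityHeader of the returned (modified) LightBlock.
	DASLightBlock(ctx context.Context, height int64) (*types.LightBlock, error)
}
```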
Why is a DAS light client not a regular light client if LL is based on DAS in the first place?
It depends on what the user wants: if they want to save some bandwidth and accept less security, then they can still run a light client that does not do DAS. Maybe "regular" isn't the right word. Does "honest-majority-assumption light clients" or "regular tendermint light clients" work better?
Co-authored-by: Hlib Kanunnikov <[email protected]>
Thanks @Wondertan! Also thanks a lot for the feedback btw! I left some comments. So, for the implementation the following is missing:

Also, I think after the first tests we might discover a few bugs. I might need help fixing those (if any). Another thing that might be interesting but I haven't fully thought through: the current

If any of this sounds interesting, let me know. I hope to have the first bullet-point ready by tomorrow morning.
I don't have anything to add, other than to say that this is great work! It makes a lot of sense, and I feel like all the necessary details are included 👍 LGreatTM #MVP 😎 🚀
In case DAS is enabled, the light client will need to:
1. retrieve the DAHeader corresponding to the data root in the Header
2. request a parameterizable number of random samples.
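As a rough illustration of these two steps, assuming the `DAHeader` exposes a `Hash()` that must equal `Header.DataHash` and that its row roots give the width of the extended square (the share-fetching function is a placeholder for retrieval via the IPFS CoreAPI):

```go
package light

import (
	"bytes"
	"context"
	"fmt"
	"math/rand"

	"github.com/lazyledger/lazyledger-core/types"
)

// sampleSquare is a placeholder for the DAS step of the light client:
// it checks that the DAHeader matches the verified header's DataHash and then
// requests numSamples random shares. fetchShare stands in for the actual
// retrieval through the IPFS CoreAPI.
func sampleSquare(
	ctx context.Context,
	h *types.Header,
	dah *types.DataAvailabilityHeader,
	numSamples int,
	fetchShare func(ctx context.Context, row, col int) ([]byte, error),
) error {
	// 1. The DAHeader must correspond to the data root committed in the header.
	if !bytes.Equal(dah.Hash(), h.DataHash) {
		return fmt.Errorf("DAHeader does not match DataHash at height %d", h.Height)
	}

	// 2. Request a parameterizable number of random samples from the extended
	// square. The width is taken from the number of row roots (an assumption
	// about the DAHeader type); a real implementation would also use a proper
	// source of randomness instead of math/rand.
	width := len(dah.RowsRoots)
	for i := 0; i < numSamples; i++ {
		row, col := rand.Intn(width), rand.Intn(width)
		if _, err := fetchShare(ctx, row, col); err != nil {
			return fmt.Errorf("share (%d,%d) unavailable: %w", row, col, err)
		}
	}
	return nil
}
```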
Are there any assumptions that change if a user sets a lower number?
Great question:
- From the POV of a light client: the probability that the block is actually available decreases (lower security guarantees).
- From the POV of the whole network: we assume a certain number of sampling light clients to exist (would need to check the paper for some concrete numbers); if the number of light clients is so small that they collectively do not sample enough data, that is indeed a problem (see the Security Analysis in that very paper).

@musalbas @adlerjohn should the light client dynamically determine the number of samples from the block size and a given probability target instead? Currently, I let the user choose a number and default to 15.
The number of samples required for a probability target doesn't change significantly with the size of the block; see p. 21 in the fraud proofs paper.
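For intuition (my paraphrase of the paper's analysis): if an adversary withholds just enough shares to make the block unrecoverable, i.e. at least $(k+1)^2$ of the $(2k)^2$ shares of the extended square, then each uniformly random sample hits a withheld share with probability slightly above $1/4$, so after $s$ samples (with replacement, which only lower-bounds the real, without-replacement probability) the chance of detecting unavailability is roughly

$$p(s) \;\ge\; 1 - \left(1 - \frac{(k+1)^2}{(2k)^2}\right)^{s} \;\approx\; 1 - \left(\frac{3}{4}\right)^{s},$$

which is already about $0.987$ for the default of $s = 15$, essentially independently of the block size $k$.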
@Wondertan from your point of view: is there anything that should be changed before merging this? Note that we do not yet have a well-established ADR process, i.e. we have not yet codified when something is really decided. So we can also amend things in the ADRs after merging. I think after the first complete implementation, we should mark the ADR as implemented, and bigger changes should then require future ADRs instead.
@liamsi, sorry if this PR was waiting for my approval. I should have approved it earlier, but I did not think that our discussions in my commenting review were blocking, as it's not a change-requesting (red) review.
No worries, no need to rush with merging the ADR :-) Thanks for all the valuable feedback and discussions!
Description
Rendered: https://github.com/lazyledger/lazyledger-core/blob/ismail/adr-mvp-rpc-light-client/docs/lazy-adr/adr-004-mvp-light-client.md#adr-004-data-availability-sampling-light-client
Part of #307