Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adr: mvp light client #311

Merged
merged 30 commits into from
May 14, 2021
Merged

adr: mvp light client #311

merged 30 commits into from
May 14, 2021

Conversation

liamsi
Copy link
Member

@liamsi liamsi commented May 3, 2021

@liamsi liamsi force-pushed the ismail/adr-mvp-rpc-light-client branch from 8f39ec5 to 9b18942 Compare May 3, 2021 23:02
@liamsi liamsi marked this pull request as ready for review May 6, 2021 11:00
@liamsi liamsi requested a review from evan-forbes May 6, 2021 11:00
@liamsi liamsi marked this pull request as draft May 6, 2021 16:11
@liamsi
Copy link
Member Author

liamsi commented May 6, 2021

Converted back to draft as I want to be a bit more precise on the IPFS integration for the DAS. Please start reviewing as this should not affect the existing sections.

docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Show resolved Hide resolved
docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved
@liamsi
Copy link
Member Author

liamsi commented May 9, 2021

A tracer implementation helping to inform this ADR happened here btw: https://github.com/lazyledger/lazyledger-core/compare/ismail/light-mvp

The "only" missing part there is to initialize and run an ipfs node to be able to pass in the CoreAPI object to sample for shares (i.e. to validate for availability). This kinda depends on #314. Other than that, it requires more testing. The tracer will become the actual implementation by adding the ipfs nodes and tests.

@liamsi liamsi marked this pull request as ready for review May 9, 2021 20:43
@liamsi liamsi requested review from Wondertan and removed request for musalbas May 10, 2021 10:05
Copy link
Member

@Wondertan Wondertan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pls tell me if help is needed to accelerate implementation

#### RPC

No changes to the RPC endpoints are absolutely required.
Although, for convenience and ease of use, we should either add the `DAHeader` to the existing [Commit](https://github.com/lazyledger/lazyledger-core/blob/cbf1f1a4a0472373289a9834b0d33e0918237b7f/rpc/core/routes.go#L25) endpoint, or, introduce a new endpoint to retrieve the `DAHeader` on demand and for a certain height or block hash.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think introducing a separate endpoint for DAHeader is unnecessary:

  • It will be included in other data types soon.
  • In the end, we will rely mostly on Header.DataHash instead.
  • Redundantly requesting it separately is obviously not a good idea from long-living subscriptions POV, though subscribing for the whole EventNewBlock is also a bloated approach in some use-cases and EventNewBlockHeader does not contain DAHeader.

Copy link
Member Author

@liamsi liamsi May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be included in other data types soon.

None of the existing RPC responses a tendermint light client requests will contain the DAHeader though (the BlockID in #312 is only temporary.

In the end, we will rely mostly on Header.DataHash instead.

True! "The end" is near but still some weeks away at least. That is why this adr doesn't really mention this.
We should write an ADR for going from DataHash -> DAHeader first.

Redundantly requesting it separately is obviously not a good idea from long-living subscriptions POV, though subscribing for the whole EventNewBlock is also a bloated approach in some use-cases and EventNewBlockHeader does not contain DAHeader.

The light client itself doesn't use the event system. Are you talking about the future where the light clients will subscribe to headers as you described in #86 (comment) ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you talking about the future where the light clients will subscribe to headers as you described in #86 (comment) ?

Nope. I was talking about EventBus within ll-core. The light client can subscribe to new things through RPC with EventBus as a backbone.

From my understanding, EventNewBlock and EventNewBlockHeader are the most used RPC endpoint/subscription for the light clients to be always on track with the network, and doing separate requests for DAHeader to perform DAS on every subscription message does not make sense for me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liamsi, so you didn't rely on DAHeader in BlockID as temporary, but you did rely on DAHeader instead of DataHash because it weeks away? 🤔 Kinda controversial 😄, but I understand that it was not clear before that DataHash can be dropped in instead of DAHeader so I am ok with that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liamsi, so you didn't rely on DAHeader in BlockID as temporary, but you did rely on DAHeader instead of DataHash because it weeks away? 🤔 Kinda controversial 😄,

What exactly is controversial about it and what are you suggesting to do about it? Please, always keep include constructive suggestions when you criticize something. Otherwise, it is not clear what improvements could be made here.

Just to clarify: Everything in this ADR is written in a way that a) does not depend on any feature that is not clearly defined or implemented yet and b) does not introduce any technical debt.

I don't really see why it is harmful but we can (very) easily remove the additional rpc end-point in the near future if we wanted to.

It is clear that the light client could simply do with the data hash but: 1) how to do this via IPFS is not defined nor implemented yet and 2) we wouldn't have been able to ship a DAS light client as quickly if we waited on that.

The light client can subscribe to new things through RPC with EventBus as a backbone.

That is a valuable and cool observation but it's still not how the current default light client operates: basically the rpc endpoint of the light client "drives" the light client (it gets updated on demand if you e.g. request anything with a newer height from the light client's RPC).

That said, for future light client implementations, we could use the event bus. If you want, you can write an ADR for that (it's not urgent but we should write an ADR for different variants for light clients we want, another one is #86 for instance).

Does this make sense?


#### Store DataAvailabilityHeader

For full nodes to be able to serve the `DataAvailabilityHeader` without having to recompute it each time, it needs to be stored somewhere.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we need to store DataAvailabilityHeader not recompute it, but to actually access block's data. You can't access block's data without DAHeader, thus you can't recompute it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Things would have been so much easier if we just accessed block's data by Header.DataHash in the first place. We then would not need to store DAHeader in both Full and Light node stores, but only Header, as planned.

Copy link
Member Author

@liamsi liamsi May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like we need to store DataAvailabilityHeader not recompute it [...]

That is what the text in the ADR says 🤔 With the existing code we store the Block Data but not the DAHeader and hence fullnodes can recompute it on demand. That is far from optimal but it works for the mvp.

Things would have been so much easier if we just accessed block's data by Header.DataHash in the first place.

Well, we can still do that. It's a bit orthogonal to this ADR I guess?

We then would not need to store DAHeader in both Full and Light node stores, but only Header, as planned.

Wouldn't it still make sense to store the DAHeader. Especially for nodes that do not have the full block data?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, we can still do that. It's a bit orthogonal to this ADR I guess?

Yep, that just a quick look retrospectively at how things could be done better. Nothing against this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it still make sense to store the DAHeader. Especially for nodes that do not have the full block data?

If you have DataHash you can access full DAHeader, thus you can do anything you do with DAHeader, thus storing/treating it specifically is not needed, as it would be stored somewhere on the network.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as it would be stored somewhere on the network.

That still means nodes will store the daheader in their local data stores. I guess the question here is: should we additionally store the daheader separately or not? This rather plays into adr 001 and #182. I would argue that most nodes will need the get the DAHeader and it could be useful to provide a shortcut via storing it as one blob instead of in ipld only (even if that is redundant). But maybe that is not necessary.

We will see after we (you?) have written the adr for "datahash -> DaHeader" and after we have implemented this feature. Then we can measure latencies and decide.

docs/lazy-adr/adr-004-mvp-light-client.md Show resolved Hide resolved
##### Provider

The [`Provider`](https://github.com/tendermint/tendermint/blob/7f30bc96f014b27fbe74a546ea912740eabdda74/light/provider/provider.go#L9-L26) should be changed to additionally provide the `DataAvailabilityHeader` to enable DAS light clients.
Implementations of the interface need to additionally to retrieve the `DataAvailabilityHeader` for the [modified LightBlock](#lightblock).

This comment was marked as resolved.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or they can get it from ResultCommit.SignedHeader.Commit.BlockID.DAHeader if #304 lands before MVP, not to additional request overhead

Copy link
Member Author

@liamsi liamsi May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if #304 lands before MVP,

You mean #312?

Yeah, I intentionally didn't want to rely on #312. I can't think of a use-case where the DAHeader would be requested in it's own but I guess it certainly won't harm if it can be requested. We can remove the rpc endpoint / the additional method in the provider later if we wanted. It does not introduce any technical debt.

Copy link
Member

@Wondertan Wondertan May 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean #312?

Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a use-case where the DAHeader would be requested in it's own but I guess it certainly won't harm if it can be requested. We can remove the rpc endpoint / the additional method in the provider later if we wanted. It does not introduce any technical debt.

Ok

docs/lazy-adr/adr-004-mvp-light-client.md Outdated Show resolved Hide resolved

We could either augment the `LightBlock` method with a flag, add a new method solely for providing the `DataAvailabilityHeader`, or, we could introduce a new method for DAS light clients.

The latter is preferable because it is the most explicit and clear, and it still keeps places where DAS is not used without any code changes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make DA one and only default case for LL? The argument that it is clear and explicit works only if there is a need to keep both ways to get LightBlock, but why do we need both ways at the same time?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but why do we need both ways at the same time?

Some light clients will still want to able to run a tendermint light client as before. They do not need the DAHeader.

Good point about the default though.

Alternatively, with the exact same result, we could embed the existing `Provider` into a new interface: e.g. `DASProvider`that add this method.
This is completely equivalent as above and which approach is better will become more clear when we spent more time on the implementation.

Regular light clients will call `LightBlock` and DAS light clients will call `DASLightBlock`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why DAS light client is not a regular light client if LL is based on DAS in the first place?

Copy link
Member Author

@liamsi liamsi May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends on what the user wants: if they want to save some bandwidth and less security then they can still run a light client that does not do DAS. Maybe "regular" isn't the right word. Does "honest majority assumption light clients" or "regular tendermint light clients" work better?

@liamsi
Copy link
Member Author

liamsi commented May 10, 2021

Pls tell me if help is needed to accelerate implementation

Thanks @Wondertan! Also thanks a lot for the feedback btw! I left some comments.

So, for the implementation the following is missing:

  • init and create an ipfs node and store it as a CoreAPI object in the Client
  • testing (manual and e2e in CI)
  • documentation

Also, I think after the first tests we might discover a few bugs. I might need help fixing those (if any).

Another thing that might be interesting but I haven't fully thought through: the current KVStore abci example app could be modified slightly s.t. it understands namespaces and a user could submit (nid, data) and it would also populate the Message part in the Block.Data (this probably requires almost no changes). With that, we could have a demo app that is closer to a real LL chain (minus Proof of stake, PayForMessage, and state fraud proofs).

If any of this sounds interesting, let me know. I hope to have the first bullet-point ready by tomorrow morning.

Copy link
Member

@evan-forbes evan-forbes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have anything add, other than to say that this is great work! It makes a lot of sense, and I feel like all the necessary details are included 👍 LGreatTM #MVP 😎 🚀


In case DAS is enabled, the light client will need to:
1. retrieve the DAHeader corresponding to the data root in the Header
2. request a parameterizable number of random samples.
Copy link
Collaborator

@tac0turtle tac0turtle May 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are there any assumptions that change if a user sets a lower number?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question:

  • from the POV of a light client: the probability that the block is actually available decreases (lower security guarantees)
  • from the POV of the whole network: we assume a certain amount of sampling light clients to exist (would need to check the paper for some concrete numbers); if the number of light clients is too small to the light clients collectively do not sample enough data, that is indeed a problem (see Security Analysis in that very paper)

@musalbas @adlerjohn should the light client dynamically determine the number of samples from the block size and a given probability target instead? Currently, I let the user choose a number and default to 15.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The number of samples required for a probability target doesn't change significantly with the size of the block. See p21 in the fraud proofs paper:

image

@liamsi liamsi mentioned this pull request May 11, 2021
@liamsi
Copy link
Member Author

liamsi commented May 12, 2021

@Wondertan from your point of view: is there anything that should be changed before merging this? Note that we do yet have a well-established ADR process, i.e. we did not codify yet when something is really decided. So we can also amend things to the ADRs after merging. I think after the first complete implementation, we should mark the ADR as implemented and bigger changes should then require future ADRs instead.

Copy link
Member

@Wondertan Wondertan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@liamsi, sorry if this PR was waiting for my approval. I should have approved it earlier, but I was not thinking that our discussions in my commenting review are blocking, as it's not a change-requesting review(red).

@liamsi
Copy link
Member Author

liamsi commented May 14, 2021

No worries, no need to rush with merging the ADR :-) Thanks for all the valuable feedback and discussions!

@liamsi liamsi merged commit 66dd732 into master May 14, 2021
@liamsi liamsi deleted the ismail/adr-mvp-rpc-light-client branch May 14, 2021 22:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants