
light-client with DAS (rpc) #307

Closed
3 of 5 tasks
liamsi opened this issue May 1, 2021 · 2 comments
Labels
C:data-availability (Component: Data Availability Proofs) · C:ipld (Access to the IPLD merkle dag) · C:light (Component: Light client)

Comments


liamsi commented May 1, 2021

Context

We discussed which node type in the MVP should do DA sampling. Options were:

  1. validators (infeasible for a quick turnaround: per the spec, validators need to download and replay the reserved-namespace Txs, which makes sense, but the API isn't implemented yet; we only just defined the ADR in #302 (adr: add API for retrieving data for a particular namespace only))
  2. full nodes (might be infeasible for similar reasons as above: they run essentially the same code as validators, minus signing and proposing)
  3. light clients (tendermint light clients currently only use the RPC and do not run the consensus reactor, so it feels a bit odd to add DA proofs here: it would require light clients to run an IPFS node, and the current light client doesn't even participate in the p2p network; see #86 (light-client: p2p reactor) and tendermint/tendermint#4508 (lite: add P2P provider and corresponding P2P reactor))
  4. implement a new node type that runs (most of) the consensus reactor, can't be used as a validator, and only does DA checks; essentially somewhere in between a light client and a full node (collecting votes / running the consensus reactor and checking for DA), a bit similar to tendermint/tendermint#4508 (lite: add P2P provider and corresponding P2P reactor)

Decision

We decided that the quickest turnaround for the MVP is Option 3 (light clients). This only requires us to specify the light-client UX and the additional "mode" in which it does DAS.
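To make that "mode" a bit more concrete, here is a minimal sketch (not the actual celestia-core API) of what a single sampling round could look like; the DAHeader/Share/Sampler types and the FetchDAHeader/FetchShare/VerifyShare helpers are hypothetical placeholders.

```go
// Hypothetical sketch of one DAS round a light client in "DAS mode" could run.
// All types and helpers below are illustrative assumptions, not real APIs.
package das

import (
	"context"
	"crypto/rand"
	"fmt"
	"math/big"
)

// DAHeader holds the row and column roots of the extended data square.
type DAHeader struct {
	RowRoots [][]byte
	ColRoots [][]byte
}

// Share is one chunk of the extended block data plus its Merkle proof
// against the corresponding row root.
type Share struct {
	Data  []byte
	Proof [][]byte
}

// Sampler abstracts network access (e.g. via IPFS/IPLD) for this sketch.
type Sampler interface {
	FetchDAHeader(ctx context.Context, height int64) (*DAHeader, error)
	FetchShare(ctx context.Context, height int64, row, col int) (*Share, error)
	VerifyShare(rowRoot []byte, row, col int, s *Share) bool
}

// SampleRound picks n random coordinates in the extended square and checks
// that each sampled share is retrievable and consistent with its row root.
func SampleRound(ctx context.Context, s Sampler, height int64, n int) error {
	dah, err := s.FetchDAHeader(ctx, height)
	if err != nil {
		return fmt.Errorf("fetch DAHeader: %w", err)
	}
	size := len(dah.RowRoots) // extended square is size x size
	for i := 0; i < n; i++ {
		row, col := randIndex(size), randIndex(size)
		share, err := s.FetchShare(ctx, height, row, col)
		if err != nil {
			return fmt.Errorf("share (%d,%d) unavailable: %w", row, col, err)
		}
		if !s.VerifyShare(dah.RowRoots[row], row, col, share) {
			return fmt.Errorf("share (%d,%d) failed inclusion check", row, col)
		}
	}
	return nil // all sampled shares were available and valid
}

func randIndex(n int) int {
	i, _ := rand.Int(rand.Reader, big.NewInt(int64(n)))
	return int(i.Int64())
}
```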

Action items:

  • @liamsi write ADR on how the (rpc) light-client gets augmented (protocol and UX)
  • @liamsi investigate whether it makes sense to cherry-pick tendermint/tendermint#6241 (node: implement tendermint modes) before making any further changes (result: not necessary, as the light client is run with a separate flag anyway)
  • @Wondertan write ADR that defines the artifact that allows peers to download the data availability header from the (single) data root in the header; while we still need this (!), we can do a first implementation without this feature
  • change code s.t. peers also store the merkelization of the row and column roots in IPLD (inner and leaf nodes); same as above
  • implement the DAS light-client mode

Note: the latter requires IPFS and that we additionally store the (regular binary) Merkle tree in IPLD - we currently only store the block data's row and column trees (the namespace Merkle trees) of the extended block. There are several ideas for this step (going from the single data root in the header -> DAHeader), which could be optimized in terms of networking latency.
One very simple idea is to store the row and column roots in a flat DAG (as IPFS does by default). This would be more efficient and wouldn't even require us to define any additional IPLD plugin. The downside is that if you download even a single particular row root, you also need to download all neighboring nodes (all row and column roots) to ensure the content you've downloaded actually matches the CID (or data root). Apparently, this is non-optimal for "super-light-clients". It would be good if their concrete requirements were written down more explicitly. I'm not sure we are making the right trade-offs here, as the DA header is still small anyway (cc @adlerjohn @musalbas).
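To make that downside concrete, here is a rough sketch: with a flat layout, trusting even one row root means fetching the entire list of row and column roots and re-hashing it against the committed digest. The function name and the sha256-over-concatenation scheme below are only stand-ins for however the flat IPFS object would actually be addressed.

```go
// Sketch of the "flat DAG" trade-off. Everything here is illustrative.
package daheader

import (
	"bytes"
	"crypto/sha256"
	"errors"
)

// VerifyRowRootFlat shows that, with a flat layout, the client must download
// *all* row and column roots and hash the full blob to compare against the
// digest committed in the block header before it can trust a single row root.
func VerifyRowRootFlat(committedDigest []byte, rowRoots, colRoots [][]byte, idx int) ([]byte, error) {
	if idx < 0 || idx >= len(rowRoots) {
		return nil, errors.New("row index out of range")
	}
	// Hash the whole flat list of roots (stand-in for the real CID scheme).
	h := sha256.New()
	for _, r := range rowRoots {
		h.Write(r)
	}
	for _, c := range colRoots {
		h.Write(c)
	}
	if !bytes.Equal(h.Sum(nil), committedDigest) {
		return nil, errors.New("downloaded roots do not match the committed digest")
	}
	// Only after hashing everything can the single root of interest be trusted.
	return rowRoots[idx], nil
}
```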

If I understand correctly, @Wondertan has some more sophisticated ideas for the data root -> DAHeader downloading step. We should describe those as soon as we have derisked shipping the MVP. Other optimizations include using IPLD selectors and graphsync.


Closing this and tracking the remaining tasks (see opening comment) in a smaller separate issue: #378

liamsi added the C:light, C:data-availability, and C:ipld labels on May 1, 2021

liamsi commented May 1, 2021

After a long discussion on Discord about whether we should use a vanilla ipfs DAG, an NMT, or a regular binary Merkle tree to compute the data root, we decided to go with the regular binary Merkle tree:

  • a DAG makes super-light-client proofs undesirably large
  • an NMT requires some hacks around the fact that rowRoot_0, ..., rowRoot_N, colRoot_0, ..., colRoot_N taken together aren't sorted by namespace
  • a binary Merkle tree still allows relatively efficient super-light-client proofs:

You just have to give the super light client the leaf before and after the relevant row leaf for its app, to prove to it that this is the only root for its namespace.
Alternatively, if its app namespace is e.g. 3, but the row root covers 2-4, then obviously that root contains all the messages for its namespace.

The exact proof format will be described in the specs, but it isn't too relevant for the implementation, since the IPLD data is content-addressed and proofs are checked when downloading.
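As a rough illustration of this decision (not the final format), the data root can be computed as a plain binary Merkle tree over rowRoot_0, ..., rowRoot_N, colRoot_0, ..., colRoot_N, here sketched with tendermint's crypto/merkle helpers; the neighbor-leaf argument from the quote above is only hinted at in the comments.

```go
// Sketch of the agreed layout: the data root is a plain binary Merkle tree
// over all row roots followed by all column roots. The real celestia-core
// wiring may differ; this only illustrates the commitment structure.
package daheader

import (
	"github.com/tendermint/tendermint/crypto/merkle"
)

// DataRoot commits to all row and column roots of the extended data square.
func DataRoot(rowRoots, colRoots [][]byte) []byte {
	leaves := make([][]byte, 0, len(rowRoots)+len(colRoots))
	leaves = append(leaves, rowRoots...)
	leaves = append(leaves, colRoots...)
	return merkle.HashFromByteSlices(leaves)
}

// RowRootProof returns the Merkle proof of the idx-th row root against the
// data root. A super-light client interested in namespace ns would request
// this proof for the row whose [minNs, maxNs] range covers ns, plus the
// proofs of the leaves directly before and after it, so it can see that no
// neighboring row could also contain ns.
func RowRootProof(rowRoots, colRoots [][]byte, idx int) ([]byte, *merkle.Proof) {
	leaves := make([][]byte, 0, len(rowRoots)+len(colRoots))
	leaves = append(leaves, rowRoots...)
	leaves = append(leaves, colRoots...)
	root, proofs := merkle.ProofsFromByteSlices(leaves)
	return root, proofs[idx]
}
```

A client receiving such a proof would check it with tendermint's merkle.Proof.Verify against the data root from the (trusted) header, and then inspect the namespace range encoded in the NMT row root itself; how exactly that range check is expressed is left to the specs mentioned above.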


liamsi commented May 31, 2021

Updated opening comment (closing in favour of #378)

@liamsi liamsi closed this as completed May 31, 2021