lazy-adr: Add Data Availability library #170

Merged 21 commits on Mar 5, 2021
283 changes: 283 additions & 0 deletions docs/lazy-adr/adr-002-ipds-da-sampling.md
@@ -0,0 +1,283 @@
# LAZY ADR 002: Sampling erasure coded Block chunks

## Changelog

- 26-2-2021: Created

## Context

In Tendermint's block gossiping, each peer gossips random parts of the block data to its peers.
For LazyLedger, we need nodes (from light-clients to validators) to be able to sample row-/column-chunks of the erasure coded
block (aka the extended data square) from the network.
This is necessary for Data Availability proofs.

![extended_square.png](img/extended_square.png)

A high-level, implementation-independent formalization of the above-mentioned sampling and Data Availability proofs can be found in:
[_Fraud and Data Availability Proofs: Detecting Invalid Blocks in Light Clients_](https://fc21.ifca.ai/papers/83.pdf).

For the time being, besides the academic paper, no other formalization or specification of the protocol exists.
Currently, the LazyLedger specification itself only describes the [erasure coding](https://github.com/lazyledger/lazyledger-specs/blob/master/specs/data_structures.md#erasure-coding)
and how to construct the extended data square from the block data.

This ADR:
- describes the high-level requirements
- defines the API and how it can be used by different components of LazyLedger (block gossiping, block sync, DA proofs)
- documents the decision on how to implement this.


The core data structures and the erasure coding of the block are already implemented in lazyledger-core ([#17], [#19], [#83]).
While there are no ADRs for these changes, we can refer to the LazyLedger specification in this case.
For this aspect, the existing implementation and specification should already be on par for the most part.
The exact arrangement of the data as described in this [rationale document](https://github.com/lazyledger/lazyledger-specs/blob/master/rationale/message_block_layout.md)
in the specification can happen on the app side of the ABCI boundary.
The latter was implemented in [lazyledger/lazyledger-app#21](https://github.com/lazyledger/lazyledger-app/pull/21)
leveraging a new ABCI method, added in [#110](https://github.com/lazyledger/lazyledger-core/pull/110).
This new method is a sub-set of the proposed ABCI changes aka [ABCI++](https://github.com/tendermint/spec/pull/254).

Mustafa Al-Bassam (@musalbas) implemented a [prototype](https://github.com/lazyledger/lazyledger-prototype)
whose main purpose is to realistically analyse the protocol.
Although the prototype does not make any network requests and only operates locally, it can partly serve as a reference implementation.
It uses the [rsmt2d] library.

The implementation will essentially use IPFS' APIs. For reading (and writing) chunks it
will use the IPLD [`DagService`](https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/merkledag.go#L54),
more precisely the [`NodeGetter`](https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/merkledag.go#L18-L27)
and [`NodeAdder`](https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/merkledag.go#L29-L39).
As an optimization, we can also use a [`Batch`](https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/batch.go#L29)
to batch adding and removing nodes.
This will be achieved by passing around a [CoreAPI](https://github.com/ipfs/interface-go-ipfs-core/blob/b935dfe5375eac7ea3c65b14b3f9a0242861d0b3/coreapi.go#L15)
object, which is derived from the IPFS node that is created along with a Tendermint node (see [#152]).
This code snippet does exactly that (see the [go-ipfs documentation] for more examples):
```go
// This constructs an IPFS node instance
node, _ := core.NewNode(ctx, nodeOptions)
// This attaches the Core API to the constructed node
// (both calls also return an error, which is ignored here for brevity)
coreApi, _ := coreapi.NewCoreAPI(node)
```

> Review comment (Member), on attaching the Core API: What happens if you don't attach a core API?
>
> Reply (Member Author): You could also pass around the node object directly, or simply the DAG field's `ipld.DAGService`. In the former case it would just be less pluggable (as we are passing around a concrete object instead of an interface).

The above-mentioned IPLD methods operate on so-called [ipld.Nodes].
When computing the data root, we can pass a [`NodeVisitor`](https://github.com/lazyledger/nmt/blob/b22170d6f23796a186c07e87e4ef9856282ffd1a/nmt.go#L22)
into the Namespaced Merkle Tree library to create these (each inner and leaf node in the tree becomes an IPLD node).
For a peer that requests such an IPLD node, the LazyLedger IPLD plugin provides the [function](https://github.com/lazyledger/lazyledger-core/blob/ceb881a177b6a4a7e456c7c4ab1dd0eb2b263066/p2p/ipld/plugin/nodes/nodes.go#L175)
`NmtNodeParser` to transform the retrieved raw data back into an `ipld.Node`.

A more high-level description of the changes required to rip out the current block gossiping routine,
including changes to the block storage and RPC layers and potential changes to reactors, is handled in [LAZY ADR 001](./adr-001-block-propagation.md)
and/or in a few smaller, separate follow-up ADRs.

> Review comment (Member): Broken link to ADR-001
>
> Reply (@liamsi, Member Author, Mar 3, 2021): will be valid once #163 gets merged (which should happen soon).

## Alternative Approaches

Instead of creating a full IPFS node object and passing it around as explained above, we could:
- use API (http)
- use ipld-light
- use alternative client

Also, for better performance
- use [graph-sync], [IPLD selectors], e.g. via [ipld-prime]

There is also the idea that nodes receive only the [Header] with the data root
and, in an additional step/request, download the DA header using the library, too.
While this feature is not considered here, and we assume each node that uses this library has the DA header, this assumption
is likely to change when we flesh out other parts of the system in more detail.
Note that this also means that light clients would still need to validate that merkleizing the DA header yields the data root.

## Decision

> This section records the decision that was made.
> It is best to record as much info as possible from the discussion that happened. This aids in not having to go back to the Pull Request to get the needed information.

> - TODO: briefly summarize github, discord, and slack discussions (?)
> - also mention Mustafa's prototype and compare both apis briefly (RequestSamples, RespondSamples, ProcessSamplesResponse)
> - mention [ipld experiments]



## Detailed Design

Add a package to the library that provides the following features:
1. sample a given number of random row/col indices of the extended data square given a DA header, and indicate whether sampling succeeded or a timeout/other error occurred
2. store the block in the network by adding it to the peer's local Merkle-DAG whose content is discoverable via a DHT
3. store the sampled chunks in the network
4. reconstruct the whole block from a given DA header
5. get all messages of a particular namespace ID.

We mention 5. here mostly for completeness. Its details will be described / implemented in a separate ADR / PR.

Apart from the above-mentioned features, we informally collect additional requirements:
- where randomness is needed, the randomness source should be configurable
- all replies by the network should be verified if this is not sufficiently covered by the used libraries already (IPFS)
- where possible, the requests to the network should happen in parallel (without DoSing the proposer for instance).

This library should be implemented as two new packages:

First, a sub-package should be added to the lazyledger-core [p2p] package
which does not know anything about the core data structures (Block, DA header etc).
It handles the actual network requests to the IPFS network and operates on IPFS/IPLD objects
directly and hence should live under [p2p/ipld].
To some extent this part of the stack already exists.

Second, a high-level API that can "live" closer to the actual types, e.g., in a sub-package in [lazyledger-core/types]
or in a new sub-package `da`.

We first describe the high-level library here and describe functions in
more detail inline with their godoc comments below.

### API that operates on lazyledger-core types

As mentioned above, this part of the library has knowledge of the core types (and hence depends on them).
It does not deal with IPFS internals.

```go
// ValidateAvailability implements the protocol described in https://fc21.ifca.ai/papers/83.pdf.
// Specifically all steps of the protocol described in section
// _5.2 Random Sampling and Network Block Recovery_ are carried out.
//
// In more detail it will first create numSamples random unique coordinates.
// Then, it will ask the network for the leaf data corresponding to these coordinates.
// In addition to the number of requests, the caller can pass in a callback,
// which will be called for each retrieved leaf with a verified Merkle proof.
//
// Among other use-cases, the callback can be useful for monitoring (progress), or
// to process the leaf data the moment it was validated.
// The context can be used to provide a timeout.
// TODO: Should there be a constant = lower bound for #samples
func ValidateAvailability(
	ctx context.Context,
	dah *DataAvailabilityHeader,
	numSamples int,
	leafSuccessCb func(namespacedleaf []byte),
) error { /* ... */ }

// RetrieveBlockData can be used to recover the block Data.
// It will carry out a similar protocol as described for ValidateAvailability.
// The key difference is that it will sample enough chunks until it can recover the
// full extended data square, including original data (e.g. by using rsmt2d.RepairExtendedDataSquare).
func RetrieveBlockData(ctx context.Context, dah *DataAvailabilityHeader) (types.Data, error) { /* ... */ }

// PutBlock operates directly on the Block.
// It first computes the erasure coding, aka the extended data square.
// Row by row it calls a lower level library which handles adding
// the row to the Merkle Dag, in our case a Namespaced Merkle Tree.
// Note, that this method could also fill the DA header.
// The data will be pinned by default.
func (b *Block) PutBlock(ctx context.Context) error
```

> Review comment (Member), on "create numSamples random unique coordinates": Note: add that the domain for coordinates can exclude parts of the original data square (and extended rows!) based on the number of "real" shares in a block, i.e. the `availableDataOriginalSharesUsed` field in the header https://github.com/lazyledger/lazyledger-specs/blob/10732d7a258a0b64dfccf96fd863830faca73ce3/specs/data_structures.md#header
>
> Review comment (Member), on `ValidateAvailability`: This could block for a few minutes.
>
> Reply (Member): Yeah, this is definitely something that should be done asynchronously.
>
> Review comment (@liamsi, Member Author, Mar 2, 2021), on lines +143 to +146: We should consider moving these Context objects lower down the stack too.

We now describe the lower-level library that will be used by the above methods.
Again, we provide more details inline in the godoc comments directly.

`PutBlock` is a method on `Block` as the erasure coding can then be cached, e.g. in a private field
in the block.

### Changes to the lower level API closer to IPFS (p2p/ipld)

```go
// GetLeafData takes in a Namespaced Merkle tree root transformed into a Cid
// and the leaf index to retrieve.
// Callers also need to pass in the total number of leaves of that tree.
// Internally, this will be translated to an IPLD path and corresponds to
// an ipfs dag get request, e.g. namespacedCID/0/1/0/0/1.
// The retrieved data should be pinned by default.
func GetLeafData(
	ctx context.Context,
	rootCid cid.Cid,
	leafIndex uint32,
	totalLeafs uint32, // this corresponds to the extended square width
) ([]byte, error)

// PutLeaves takes the namespaced leaves, a row of the extended data square,
// and calls nodes.DataSquareRowOrColumnRawInputParser of the ipld plugin.
// The resulting ipld nodes are passed to a Batch calling AddMany:
// https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/batch.go#L29
// Note, that this method could also return the row and column roots.
// The caller is responsible for making sure that the leaves are sorted by namespace ID.
// The data will be pinned by default.
func PutLeaves(ctx context.Context, namespacedLeaves [][]byte) error
```

> Review comment (Member), on `PutLeaves`: If we're going to be passing the IPFS node object, then I think `PutLeaves` will need a `format.NodeAdder` argument, as it will not have access to the IPFS node object.
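
Because `PutLeaves` leaves the ordering of leaves to the caller, a caller might want a cheap check before submitting a row. The following is a minimal sketch under the assumption that each leaf starts with a fixed-size namespace ID prefix; the constant `namespaceSize` (8 bytes here) and the helper name are illustrative assumptions, not taken from the code base.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// namespaceSize is an assumption for this sketch: each leaf is prefixed
// with a namespace ID of this many bytes.
const namespaceSize = 8

// leavesSortedByNamespace reports whether every leaf is long enough to carry
// a namespace ID and the leaves are in non-decreasing namespace order.
func leavesSortedByNamespace(leaves [][]byte) bool {
	for _, l := range leaves {
		if len(l) < namespaceSize {
			return false
		}
	}
	return sort.SliceIsSorted(leaves, func(i, j int) bool {
		return bytes.Compare(leaves[i][:namespaceSize], leaves[j][:namespaceSize]) < 0
	})
}

// leaf builds a toy namespaced leaf: the namespace byte repeated, then the data.
func leaf(ns byte, data string) []byte {
	return append(bytes.Repeat([]byte{ns}, namespaceSize), data...)
}

func main() {
	row := [][]byte{leaf(1, "tx"), leaf(1, "tx2"), leaf(7, "msg")}
	fmt.Println("row is sorted:", leavesSortedByNamespace(row))
}
```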

`GetLeafData` can be used by `ValidateAvailability` and `RetrieveBlockData` above, and
`PutLeaves` by `PutBlock`.
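
The godoc of `GetLeafData` mentions translating a leaf index into an IPLD path such as `0/1/0/0/1`. A minimal sketch of that translation follows, assuming the tree is a perfect binary tree (the number of leaves is a power of two) in which `0` selects the left child and `1` the right child. The function name `leafPath` is hypothetical.

```go
package main

import (
	"fmt"
	"math/bits"
	"strings"
)

// leafPath returns the path from the root to the leaf at index in a perfect
// binary tree with total leaves, using "0" for left and "1" for right.
func leafPath(index, total uint32) (string, error) {
	if total == 0 || total&(total-1) != 0 {
		return "", fmt.Errorf("total leaves must be a power of two, got %d", total)
	}
	if index >= total {
		return "", fmt.Errorf("leaf index %d out of range [0, %d)", index, total)
	}
	depth := bits.Len32(total) - 1 // log2(total)
	parts := make([]string, depth)
	for level := 0; level < depth; level++ {
		// Most significant bit first: it decides the branch taken at the root.
		bit := (index >> uint(depth-1-level)) & 1
		parts[level] = fmt.Sprint(bit)
	}
	return strings.Join(parts, "/"), nil
}

func main() {
	// Leaf 9 of 32 is 01001 in binary.
	path, err := leafPath(9, 32)
	if err != nil {
		panic(err)
	}
	fmt.Println(path) // prints "0/1/0/0/1"
}
```

With 32 leaves, index 9 yields the same path suffix as the example in the godoc comment.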

### A Note on IPFS/IPLD

In IPFS all data is _content addressed_, which basically means the data is identified by its hash.
In particular, in the LazyLedger case, the root CID identifies the Namespaced Merkle tree including all its contents (inner and leaf nodes).
This means that if a `GetLeafData` request succeeds, the retrieved leaf data is in fact the leaf data in the tree.
We do not need to additionally verify Merkle proofs per leaf as this will essentially be done via IPFS on each layer while
resolving and getting to the leaf data.

> TODO: validate this assumption and link to code that shows how this is done internally

> Review comment (Member Author), on lines +202 to +210: Or do we want to explicitly verify proofs either way? To not rely on the fact that ipfs in combination with our plugin handles this correctly? cc @musalbas @adlerjohn
>
> Reply (Member): I think it's safer and more idiot-proof if `GetLeafData` only succeeds if the proof is valid. Anyway, I thought it would only succeed with IPFS if the proof is valid, with the custom hasher?
>
> Reply (Member Author): Yes, that is my understanding as well. For every retrieved leaf, the proof nodes should also be resolved and validated on its path down.
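
To make the argument about content addressing concrete, here is a toy sketch of path resolution in which every fetched node is checked against the hash that linked to it. It uses plain SHA-256 and an invented node encoding instead of CIDs and the Namespaced Merkle Tree hasher, so it only illustrates the principle and says nothing about how IPFS or the plugin implement it. All names are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// Hash identifies a node by the SHA-256 of its bytes (a stand-in for a CID).
type Hash [32]byte

// Store maps content hashes to raw node bytes; think of it as the untrusted network.
type Store map[Hash][]byte

// Put stores data under its own hash and returns that hash.
func (s Store) Put(data []byte) Hash {
	h := Hash(sha256.Sum256(data))
	s[h] = data
	return h
}

// Get returns the bytes stored under h, but only if they really hash to h.
func (s Store) Get(h Hash) ([]byte, error) {
	data, ok := s[h]
	if !ok {
		return nil, errors.New("node not found")
	}
	if Hash(sha256.Sum256(data)) != h {
		return nil, errors.New("content does not match its hash")
	}
	return data, nil
}

// Toy encoding: leaf = 0x00 || data, inner = 0x01 || left hash || right hash.
func leafNode(data []byte) []byte { return append([]byte{0}, data...) }

func innerNode(l, r Hash) []byte {
	n := append([]byte{1}, l[:]...)
	return append(n, r[:]...)
}

// ResolveLeaf walks from root along path (0 = left, 1 = right). Every node on
// the way is verified against the hash its parent linked to.
func ResolveLeaf(s Store, root Hash, path []int) ([]byte, error) {
	cur := root
	for _, branch := range path {
		node, err := s.Get(cur)
		if err != nil {
			return nil, err
		}
		if len(node) != 65 || node[0] != 1 || (branch != 0 && branch != 1) {
			return nil, errors.New("invalid inner node or path element")
		}
		copy(cur[:], node[1+32*branch:33+32*branch])
	}
	leaf, err := s.Get(cur)
	if err != nil {
		return nil, err
	}
	if len(leaf) == 0 || leaf[0] != 0 {
		return nil, errors.New("expected a leaf node")
	}
	return leaf[1:], nil
}

func main() {
	s := Store{}
	l0 := s.Put(leafNode([]byte("share 0")))
	l1 := s.Put(leafNode([]byte("share 1")))
	root := s.Put(innerNode(l0, l1))

	data, err := ResolveLeaf(s, root, []int{1})
	fmt.Printf("%s %v\n", data, err)

	// A tampered leaf no longer hashes to the link stored in its parent.
	s[l1] = leafNode([]byte("forged"))
	_, err = ResolveLeaf(s, root, []int{1})
	fmt.Println("after tampering:", err)
}
```

In this toy model a successful resolution implies the same guarantee a per-leaf Merkle proof would give, which is the assumption the TODO above asks to validate for the real stack.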


### Implementation plan

As fully integrating Data Availability proofs into Tendermint is a rather large change, we break up the work into the
following packages (not mentioning the implementation work that was already done):

1. Flesh out the changes in the consensus messages ([lazyledger-specs#126], [lazyledger-specs#127])
2. Flesh out the changes that would be necessary to replace the current block gossiping ([LAZY ADR 001](./adr-001-block-propagation.md))
3. Add the possibility of storing and retrieving block data (samples or whole block) to lazyledger-core (this ADR and related PRs).
4. Integrate the above API (3.) as an addition into lazyledger-core without directly replacing the Tendermint counterparts (block gossip etc).
5. Rip out each component that will be redundant with the above integration in one or even several smaller PRs:
    - block gossiping (see LAZY ADR 001)
    - modify block store (see LAZY ADR 001)
    - make downloading full Blocks optional (flag/config)
    - route some RPC requests to IPFS (see LAZY ADR 001)


## Status

Proposed

## Consequences

### Positive

- simplicity & ease of implementation
- can re-use an existing networking and p2p stack (go-ipfs)
- potential support of large, cool, and helpful community
- high-level API definitions independent of the used stack

### Negative

- latency
- being connected to the public IPFS network might be overkill if peers should in fact only care about a subset that participates in the LazyLedger protocol
- dependency on a large code-base with lots of features and options of which we only need a small subset

### Neutral
- two different p2p layers exist in lazyledger-core

## References

- https://github.com/lazyledger/lazyledger-core/issues/85
- https://github.com/lazyledger/lazyledger-core/issues/167

- https://docs.ipld.io/#nodes
- https://arxiv.org/abs/1809.09044
- https://fc21.ifca.ai/papers/83.pdf
- https://github.com/tendermint/spec/pull/254


[#17]: https://github.com/lazyledger/lazyledger-core/pull/17
[#19]: https://github.com/lazyledger/lazyledger-core/pull/19
[#83]: https://github.com/lazyledger/lazyledger-core/pull/83

[#152]: https://github.com/lazyledger/lazyledger-core/pull/152

[lazyledger-specs#126]: https://github.com/lazyledger/lazyledger-specs/issues/126
[lazyledger-specs#127]: https://github.com/lazyledger/lazyledger-specs/pulls/127
[Header]: https://github.com/lazyledger/lazyledger-specs/blob/master/specs/data_structures.md#header

[go-ipfs documentation]: https://github.com/ipfs/go-ipfs/tree/master/docs/examples/go-ipfs-as-a-library#use-go-ipfs-as-a-library-to-spawn-a-node-and-add-a-file
[ipld experiments]: https://github.com/lazyledger/ipld-plugin-experiments
[ipld.Nodes]: https://github.com/ipfs/go-ipld-format/blob/d2e09424ddee0d7e696d01143318d32d0fb1ae63/format.go#L22-L45
[graph-sync]: https://github.com/ipld/specs/blob/master/block-layer/graphsync/graphsync.md
[IPLD selectors]: https://github.com/ipld/specs/blob/master/selectors/selectors.md
[ipld-prime]: https://github.com/ipld/go-ipld-prime

[rsmt2d]: https://github.com/lazyledger/rsmt2d


[p2p]: https://github.com/lazyledger/lazyledger-core/tree/0eccfb24e2aa1bb9c4428e20dd7828c93f300e60/p2p
[p2p/ipld]: https://github.com/lazyledger/lazyledger-core/tree/0eccfb24e2aa1bb9c4428e20dd7828c93f300e60/p2p/ipld
[lazyledger-core/types]: https://github.com/lazyledger/lazyledger-core/tree/0eccfb24e2aa1bb9c4428e20dd7828c93f300e60/types
Binary file added docs/lazy-adr/img/extended_square.png