diff --git a/neps/nep-0568.md b/neps/nep-0568.md
index cdf15490d..ca63287f3 100644
--- a/neps/nep-0568.md
+++ b/neps/nep-0568.md
@@ -163,6 +163,16 @@ supporting smooth transitions without altering storage structures directly.
 
 ### Stateless Validation
 
+### State Witness
+
+The resharding state transition becomes one of the `implicit_transitions` in `ChunkStateWitness`. It must be validated between processing the last chunk (potentially missing) in the old epoch and the first chunk (potentially missing) in the new epoch. The `ChunkStateTransition` fields map naturally onto the resharding state transition: `block_hash` stores the hash of the last block of the parent shard, `base_state` stores the resharding proof, and `post_state_root` stores the proposed state root.
+
+Note that this leads to **two** state transitions corresponding to the same block hash. On the chunk producer side, the first transition is stored for the `(block_hash, parent_shard_uid)` pair and the second one is stored for the `(block_hash, new_shard_uid)` pair.
+
+The chunk validator has all the blocks, so it can independently determine whether an implicit transition corresponds to applying a missing chunk or to resharding. This is implemented in `get_state_witness_block_range`, which iterates from `state_witness.chunk_header.prev_block_hash()` to the block which includes the last chunk for the (parent) shard, if it exists.
+
+Then, in `validate_chunk_state_witness`, if an implicit transition corresponds to resharding, the chunk validator calls `retain_split_shard` and proves the state transition from the parent shard to the child shard.
+
 ### State Sync
 
 Changes to the state sync protocol aren't typically conisdered protocol changes requiring a version bump, since it's concerned with downloading state that isn't present locally, rather than with the rules of execution of blocks and chunks. But it might still be helpful to outline some planned changes to state sync intended to make the resharding implementation easier to work with.
@@ -187,18 +197,39 @@ In this NEP, we propose updating the ShardId semantics to allow for arbitrary id
 
 ## Reference Implementation
 
-```text
-[This technical section is required for Protocol proposals but optional for other categories. A draft implementation should demonstrate a minimal implementation that assists in understanding or implementing this proposal. Explain the design in sufficient detail that:
+```
+should_split_shard(block):
+    shard_layout = epoch_manager.shard_layout(block.epoch_id())
+    next_shard_layout = epoch_manager.shard_layout(block.next_epoch_id())
+    return epoch_manager.is_next_block_epoch_start(block) && shard_layout != next_shard_layout
 
-* Its interaction with other features is clear.
-* Where possible, include a Minimum Viable Interface subsection expressing the required behavior and types in a target programming language. (ie. traits and structs for rust, interfaces and classes for javascript, function signatures and structs for c, etc.)
-* It is reasonably clear how the feature would be implemented.
-* Corner cases are dissected by example.
-* For protocol changes: A link to a draft PR on nearcore that shows how it can be integrated in the current code. It should at least solve the key technical challenges.
+on chain.postprocess_block(block):
+    next_shard_layout = epoch_manager.shard_layout(block.next_epoch_id())
+    if should_split_shard(block):
+        resharding_manager.split_shard(split_shard_event, next_shard_layout)
 
-The section should return to the examples given in the previous section, and explain more fully how the detailed proposal makes those examples work.]
+on resharding_manager.split_shard(split_shard_event, next_shard_layout):
+    set State mapping
+    start FlatState resharding
+    process MemTrie resharding:
+        freeze MemTrie, create HybridMemTries
+        for each child shard:
+            mem_tries[shard].retain_split_shard(boundary_account)
+
+mem_trie.retain_split_shard(boundary_account):
+    split shard by path as described above while generating the proof
+    save the proof as the state transition for the pair (block, new_shard_uid)
+
+then, the proof is sent as one of the implicit transitions in ChunkStateWitness
+
+then, on the chunk validation path, the chunk validator determines whether resharding is part of the state transition, using the same should_split_shard condition
+
+and then it calls Trie(state_transition_proof).retain_split_shard(boundary_account), which should succeed if the proof is sufficient and generates the new state root
+
+finally, it checks that the new state root matches the state root proposed in ChunkStateWitness. If the whole ChunkStateWitness is valid, the chunk validator sends an endorsement, which also endorses the resharding.
 ```
 
+
 ### State Storage - MemTrie
 
 The current implementation of MemTrie uses a pool of memory (`STArena`) to allocate and deallocate nodes and internal pointers in this pool to reference child nodes. MemTries, unlike the State representation of Trie, do not work with the hash of the nodes but internal memory pointers directly. Additionally, MemTries are not thread safe and one MemTrie exists per shard.
@@ -296,7 +327,7 @@ Elements inherited by both children:
 
 Elements inherited only be the lowest index child:
 
-* `BUFFERED_RECEIPT_INDICES `
+* `BUFFERED_RECEIPT_INDICES`
 * `BUFFERED_RECEIPT`
 
 #### Bring children shards up to date with the chain's head
@@ -410,15 +441,30 @@ The state sync algorithm defines a `sync_hash` that is used in many parts of the
 
 ## Security Implications
 
-```text
-[Explicitly outline any security concerns in relation to the NEP, and potential ways to resolve or mitigate them. At the very least, well-known relevant threats must be covered, e.g. person-in-the-middle, double-spend, XSS, CSRF, etc.]
-```
+### Fork Handling
+
+In theory, more than one candidate block may finish the last epoch with the old shard layout. For previous implementations this didn't matter, because the resharding decision was made at the beginning of the previous epoch. Now the decision is made at the epoch boundary, so the new implementation handles this case as well.
+
+### Proof Validation
+
+With single shard tracking, nodes can't independently validate new state roots after resharding, because they don't have the state of the shard being split. That's why we generate resharding proofs, whose generation and validation may be a new weak point. However, `retain_split_shard` is equivalent to a constant number of lookups in the trie, so its overhead is negligible. Even if a proof is invalid, this only means that `retain_split_shard` fails early, similarly to other state transitions.
 
 ## Alternatives
 
-```text
-[Explain any alternative designs that were considered and the rationale for not choosing them. Why your design is superior?]
-```
+Within the solution space that keeps the blockchain stateful, we also considered handling resharding through the `Receipts` mechanism. The workflow would be to:
+* create an empty `target_shard`,
+* require `source_shard` chunk producers to create a special `ReshardingReceipt(source_shard, target_shard, data)` where `data` would be an interval of key-value pairs in `source_shard` along with the proof,
+* then, `target_shard` trackers and validators would process that receipt, validate the proof and insert the key-value pairs into the new shard.
+
+However, `data` would occupy most of the state witness capacity and introduce the overhead of proving every single interval in `source_shard`. Moreover, syncing the target shard "dynamically" also requires some form of catchup, which makes this much less feasible than the chosen approach.
+
+Another question is whether we should tie resharding to epoch boundaries. Decoupling the two would allow going from resharding decision to completion much faster. But for that, we would need to:
+* agree on whether we should reshard in the middle of the epoch or allow "fast epoch completion", which has to be implemented,
+* keep chunk producers tracking "spare shards" ready to receive items from split shards,
+* on a resharding event, implement a specific form of state sync, in which source and target chunk producers would agree on new state roots offline,
+* then, new state roots would be validated by chunk validators in the same fashion.
+
+While this is much closer to Dynamic Resharding (below), it requires many more changes to the protocol. Moreover, the considered idea works very well as an intermediate step towards it, if needed.
 
 ## Future possibilities
 
@@ -428,27 +474,22 @@ The state sync algorithm defines a `sync_hash` that is used in many parts of the
 
 ## Consequences
 
-```text
-[This section describes the consequences, after applying the decision. All consequences should be summarized here, not just the "positive" ones. Record any concerns raised throughout the NEP discussion.]
-```
-
 ### Positive
 
-* p1
+* The protocol is able to execute resharding even while only a fraction of nodes track the split shard.
+* New resharding can happen in a matter of minutes instead of hours.
 
 ### Neutral
 
-* n1
+N/A
 
 ### Negative
 
-* n1
+* The storage components need to handle the additional complexity of controlling the shard layout change.
 
 ### Backwards Compatibility
 
-```text
-[All NEPs that introduce backwards incompatibilities must include a section describing these incompatibilities and their severity. Author must explain a proposes to deal with these incompatibilities. Submissions without a sufficient backwards compatibility treatise may be rejected outright.]
-```
+The approach is fully backwards compatible: it only adds a new protocol upgrade on top of the existing implementation. We were also able to completely remove the previous resharding logic, since it was already approved by validators, and to process chunks from any layout it is enough to take the state for that layout from an archival node.
 
 ## Unresolved Issues (Optional)