
### Stateless Validation

Because only a fraction of nodes track the split shard, the transition from the state root of the parent shard
to the new state roots of the children shards must be proven to the other validators.
Otherwise, the chunk producers for the split shard could collude and provide invalid state roots,
which could compromise the protocol, for example by minting tokens out of thin air.

The design allows this state transition to be generated and checked in time that is negligible compared to the time it takes to apply a chunk.
As shown in the [State Storage - MemTrie](#state-storage---memtrie) section above, the generation and verification logic consists of a constant number of trie lookups.
More specifically, we implement a `retain_split_shard(boundary_account, RetainMode::{Left, Right})` method for the trie, which keeps only the keys
that belong to the left or right child shard.
Internally, it calls a `retain_multi_range(intervals)` method, where `intervals` is a vector of trie key intervals to retain.
Each interval corresponds to a unique trie key type prefix byte (`Account`, `AccessKey`, etc.) and spans either from the empty key to the `boundary_account` key for the left shard, or from `boundary_account` to infinity for the right shard.
`retain_multi_range` is recursive. Based on the trie key prefix covered by the current node, it either:

* returns the node unchanged, if its subtree is fully contained within some interval;
* returns an "empty" node, if the subtree lies outside all intervals;
* otherwise, descends into all children and constructs a new node from the children returned by the recursive calls.

The implementation is agnostic to the trie storage used for retrieving nodes; it applies to both the memtrie and partial storage (state proof):

* calling it on the memtrie generates a proof and a new state root;
* calling it on partial storage generates a new state root. If the method does not fail with a "node not found in the proof" error, the proof was sufficient, and it only remains to compare the generated state root with the one proposed by the chunk producer.
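
To make the three cases concrete, here is a minimal, self-contained sketch of the recursion in Rust. The toy byte-per-edge trie, the `Node` and `Interval` types, and all helper names are assumptions for illustration only; they do not match nearcore's actual memtrie structures, and the real `retain_split_shard` additionally records every visited node to form the proof.

```rust
// Minimal sketch of the three-case recursion; `Node`, `Interval`, and all
// helper names are illustrative stand-ins, not nearcore's memtrie types.
// The real `retain_split_shard` also records every visited node for the proof.
use std::collections::BTreeMap;

#[derive(Clone, Debug, PartialEq)]
enum Node {
    Empty,
    Leaf(u64),                  // value stored at the key equal to the accumulated prefix
    Branch(BTreeMap<u8, Node>), // each edge extends the key prefix by one byte
}

/// Keys `k` retained by the interval: `start <= k`, and `k < end` when `end` is `Some`.
struct Interval {
    start: Vec<u8>,
    end: Option<Vec<u8>>, // `None` means "to infinity", as for the right child
}

impl Interval {
    fn contains(&self, key: &[u8]) -> bool {
        self.start.as_slice() <= key && self.end.as_deref().map_or(true, |e| key < e)
    }
    /// Every key extending `prefix` lies inside the interval.
    fn covers_subtree(&self, prefix: &[u8]) -> bool {
        let above_start = self.start.as_slice() <= prefix;
        let below_end = self.end.as_deref().map_or(true, |e| prefix < e && !e.starts_with(prefix));
        above_start && below_end
    }
    /// No key extending `prefix` lies inside the interval.
    fn disjoint_from_subtree(&self, prefix: &[u8]) -> bool {
        let below_start = prefix < self.start.as_slice() && !self.start.starts_with(prefix);
        let above_end = self.end.as_deref().map_or(false, |e| prefix >= e);
        below_start || above_end
    }
}

fn retain_multi_range(node: &Node, prefix: &[u8], intervals: &[Interval]) -> Node {
    // Case 1: the whole subtree lies inside some retained interval -> keep it as is.
    if intervals.iter().any(|i| i.covers_subtree(prefix)) {
        return node.clone();
    }
    // Case 2: the whole subtree lies outside every retained interval -> drop it.
    if intervals.iter().all(|i| i.disjoint_from_subtree(prefix)) {
        return Node::Empty;
    }
    // Case 3: partial overlap -> descend into children and rebuild the node.
    match node {
        Node::Empty => Node::Empty,
        Node::Leaf(value) => {
            if intervals.iter().any(|i| i.contains(prefix)) { Node::Leaf(*value) } else { Node::Empty }
        }
        Node::Branch(children) => {
            let mut retained = BTreeMap::new();
            for (byte, child) in children {
                let mut child_prefix = prefix.to_vec();
                child_prefix.push(*byte);
                let new_child = retain_multi_range(child, &child_prefix, intervals);
                if new_child != Node::Empty {
                    retained.insert(*byte, new_child);
                }
            }
            if retained.is_empty() { Node::Empty } else { Node::Branch(retained) }
        }
    }
}

fn main() {
    // Keys "aa", "ab", "ba"; retain the "left child": every key strictly below boundary "b".
    let trie = Node::Branch(BTreeMap::from([
        (b'a', Node::Branch(BTreeMap::from([(b'a', Node::Leaf(1)), (b'b', Node::Leaf(2))]))),
        (b'b', Node::Branch(BTreeMap::from([(b'a', Node::Leaf(3))]))),
    ]));
    let left = vec![Interval { start: vec![], end: Some(b"b".to_vec()) }];
    println!("{:?}", retain_multi_range(&trie, &[], &left)); // keeps "aa" and "ab", drops "ba"
}
```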

### State Witness

The resharding state transition becomes one of the `implicit_transitions` in `ChunkStateWitness`. It must be validated between processing the last chunk (potentially missing) of the old epoch and the first chunk (potentially missing) of the new epoch. The `ChunkStateTransition` fields also map nicely onto the resharding state transition: `block_hash` stores the hash of the last block of the parent shard, `base_state` stores the resharding proof, and `post_state_root` stores the proposed state root.
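
As a reduced sketch of the shapes involved (most fields of the real nearcore structs are omitted and the inner representations simplified):

```rust
// Reduced sketch of the structures described above; the real nearcore types
// carry additional fields and use different inner representations.
struct CryptoHash([u8; 32]);

/// Simplified stand-in for the recorded trie nodes that form the proof.
struct PartialState(Vec<Vec<u8>>);

struct ChunkStateTransition {
    /// For the resharding transition: hash of the last block of the parent shard.
    block_hash: CryptoHash,
    /// For the resharding transition: the proof generated by `retain_split_shard`.
    base_state: PartialState,
    /// The state root proposed by the chunk producer for the child shard.
    post_state_root: CryptoHash,
}

struct ChunkStateWitness {
    /// Transitions without an explicit chunk; the resharding transition is
    /// validated between the last (possibly missing) chunk of the old epoch
    /// and the first (possibly missing) chunk of the new epoch.
    implicit_transitions: Vec<ChunkStateTransition>,
    // ... many other fields omitted
}
```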

## Reference Implementation

### Overview
<!-- markdownlint-disable MD029 -->

1. Any node tracking a shard must determine whether it should split the shard in the last block before the epoch in which the resharding happens.

```pseudocode
should_split_shard(block, shard_id):
    shard_layout = epoch_manager.shard_layout(block.epoch_id())
    next_shard_layout = epoch_manager.shard_layout(block.next_epoch_id())
    if epoch_manager.is_next_block_epoch_start(block) &&
            shard_layout != next_shard_layout &&
            next_shard_layout.shard_split_map().contains(shard_id):
        return Some(next_shard_layout.split_shard_event(shard_id))
    return None
```

2. This logic is triggered during block postprocessing, which means the block is valid and is being persisted to disk.

```pseudocode
on chain.postprocess_block(block):
    next_shard_layout = epoch_manager.shard_layout(block.next_epoch_id())
    if let Some(split_shard_event) = should_split_shard(block, shard_id):
        resharding_manager.split_shard(split_shard_event, next_shard_layout)
```

3. The event triggers changes in all state storage components.

```pseudocode
on resharding_manager.split_shard(split_shard_event, next_shard_layout):
    set State mapping
    start FlatState resharding
    process MemTrie resharding:
        freeze MemTrie, create HybridMemTries
        for each child shard:
            mem_tries[parent_shard].retain_split_shard(boundary_account)
```
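
The "freeze MemTrie, create HybridMemTries" step can be pictured with the following illustrative sketch. The types and names below are assumptions for exposition, not nearcore's actual memtrie arena API; the real design is described in the [State Storage - MemTrie](#state-storage---memtrie) section.

```rust
// Illustrative-only sketch of the "frozen parent + hybrid children" idea:
// after the split, both children read the parent's nodes from a shared,
// immutable arena and write any new nodes into their own small overlay.
// These types are assumptions for exposition, not nearcore's memtrie API.
use std::collections::HashMap;
use std::sync::Arc;

type NodeId = u64;

/// Parent memtrie nodes, made immutable at the resharding boundary.
struct FrozenArena {
    nodes: HashMap<NodeId, Vec<u8>>,
}

/// A child shard's memtrie right after the split: mostly shared frozen nodes,
/// plus an overlay for nodes created by `retain_split_shard` and later chunks.
struct HybridMemTrie {
    frozen: Arc<FrozenArena>,          // shared with the other child
    overlay: HashMap<NodeId, Vec<u8>>, // nodes allocated after the split
    root: NodeId,
}

impl HybridMemTrie {
    fn get_node(&self, id: NodeId) -> Option<&Vec<u8>> {
        // Look up the overlay first, then fall back to the shared frozen arena.
        self.overlay.get(&id).or_else(|| self.frozen.nodes.get(&id))
    }
}
```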

4. `retain_split_shard` keeps only the trie keys that belong to the left or right child shard.
It retains the trie key intervals for the left or right child as described above, generating the proof at the same time.
The result is a new state root, a hybrid memtrie corresponding to the child shard, and the proof.
The proof is saved as the state transition for the pair `(block, new_shard_uid)`.

5. The proof is sent as one of the implicit transitions in `ChunkStateWitness`.

6. On the chunk validation path, the chunk validator determines whether resharding is
part of the state transition, using the same `should_split_shard` condition.

7. It calls `Trie(state_transition_proof).retain_split_shard(boundary_account)`,
which succeeds if the proof is sufficient and produces a new state root.

8. Finally, it checks that the new state root matches the state root proposed in `ChunkStateWitness`.
If the whole `ChunkStateWitness` is valid, the chunk validator sends an endorsement, which also endorses the resharding; a hedged sketch of this validation path follows.
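
Steps 6-8 can be summarized with the following sketch. Every type and function name here is an illustrative stand-in for the logic described above, not nearcore's actual API; the stub bodies exist only so the sketch compiles.

```rust
// Hypothetical sketch of the validation path (steps 6-8). All names are
// illustrative stand-ins; the stub bodies exist only so the sketch compiles.
#[derive(PartialEq)]
struct CryptoHash([u8; 32]);
struct PartialState(Vec<Vec<u8>>);
enum RetainMode { Left, Right }

#[derive(Debug)]
enum ValidationError {
    /// A node required by the retain operation was not found in the proof.
    MissingProofNode,
    /// The recomputed child state root differs from the proposed one.
    StateRootMismatch,
}

/// Stub: a trie view backed only by the nodes recorded in the proof.
struct ProofTrie;
fn trie_from_partial_state(_proof: &PartialState, _parent_root: &CryptoHash) -> ProofTrie {
    ProofTrie
}
/// Stub: the same retain logic the chunk producer ran, now over the proof.
fn retain_split_shard(_trie: &ProofTrie, _boundary: &[u8], _mode: RetainMode) -> Result<CryptoHash, ()> {
    Ok(CryptoHash([0; 32]))
}

fn validate_resharding_transition(
    proof: &PartialState,
    parent_state_root: &CryptoHash,
    boundary_account: &[u8],
    mode: RetainMode,
    proposed_root: &CryptoHash,
) -> Result<(), ValidationError> {
    let trie = trie_from_partial_state(proof, parent_state_root);
    // Re-run the retain logic; a missing node means the proof was insufficient.
    let recomputed_root = retain_split_shard(&trie, boundary_account, mode)
        .map_err(|_| ValidationError::MissingProofNode)?;
    if &recomputed_root == proposed_root {
        Ok(()) // endorsing the chunk then also endorses the resharding
    } else {
        Err(ValidationError::StateRootMismatch)
    }
}
```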

### State Storage - MemTrie

## Alternatives

In the solution space that keeps the blockchain stateful, we also considered an alternative that handles resharding through the mechanism of `Receipts`. The workflow would be to:

* create an empty `target_shard`,
* require `source_shard` chunk producers to create a special `ReshardingReceipt(source_shard, target_shard, data)`, where `data` is an interval of key-value pairs in `source_shard` along with its proof,
* have `target_shard` trackers and validators process that receipt, validate the proof, and insert the key-value pairs into the new shard.

However, `data` would occupy most of the state witness capacity and introduce the overhead of proving every single interval in `source_shard`. Moreover, syncing the target shard "dynamically" would also require some form of catchup, which makes this much less feasible than the chosen approach.

Another question is whether we should tie resharding to epoch boundaries. Decoupling the two would allow moving from the resharding decision to its completion much faster. But for that, we would need to:

* agree on whether we should reshard in the middle of an epoch or allow "fast epoch completion", which would have to be implemented,
* keep chunk producers tracking "spare shards" that are ready to receive items from split shards,
* on a resharding event, implement a specific form of state sync, by which source and target chunk producers would agree on the new state roots offline,
### Positive

* The protocol is able to execute resharding even while only a fraction of nodes track the split shard.
* State for the new shard layout is computed in a matter of minutes instead of hours, which greatly increases ecosystem stability during resharding. As before, from the point of view of NEAR users, resharding is instantaneous.

### Neutral

Expand All @@ -489,7 +533,7 @@ N/A

### Backwards Compatibility

Resharding is backwards compatible with existing protocol logic.

## Unresolved Issues (Optional)
