reference implementation description and minor nits
wacban authored Oct 27, 2023
1 parent 051e5aa commit c0ae86d
Showing 1 changed file with 52 additions and 10 deletions: neps/nep-0508.md

## Summary

In essence, this NEP is an extension of [NEP-40](https://github.com/near/NEPs/blob/master/specs/Proposals/0040-split-states.md), which was focused on splitting one shard into multiple shards. This proposal introduces a new implementation for resharding and a new shard layout for the production networks.

We are introducing resharding v2, which supports one shard splitting into two within one epoch at a pre-determined split boundary. The NEP includes performance improvements to make resharding feasible under the current state, as well as an actual resharding in mainnet and testnet (specifically, splitting shard 3 into two).

While the new approach addresses critical limitations left unsolved in NEP-40 and is expected to remain valid for the foreseeable future, it does not serve all use cases, such as dynamic resharding.

## Motivation

Currently, the NEAR protocol has four shards. With more partners onboarding, we have started to see some shards occasionally becoming overcrowded. In addition, with state sync and stateless validation, validators will not need to track all shards, and validator hardware requirements can be greatly reduced with smaller shard sizes.

## Specification

### High level assumptions

* Flat state is enabled.
* The shard split boundary is predetermined. In other words, the necessity of shard splitting is decided manually.
* The Merkle Patricia Trie is the underlying data structure for the protocol state.
* The minimal epoch gap between two resharding events is X.
* Some form of State Sync (centralized or decentralized) is enabled.

### High level requirements

* ~~Resharding should not require additional hardware from nodes.~~
  * This needs to be assessed during testing.
* Resharding should be fault tolerant.
  * Chain must not stall in case of resharding failure. TODO - this seems impossible under current assumptions because the shard layout for an epoch is committed to the chain before resharding is finished.
  * A validator should be able to recover in case they go offline during resharding.
    * For now, our aim is at least allowing a validator to join back after resharding is finished.
* No transaction or receipt should be lost during resharding.
* Resharding should work regardless of the number of existing shards.
  * There should be no more place (in any apps or tools) where the number of shards is hardcoded.

### Out of scope

* merging shards
* shard reshuffling
* shard boundary adjustment
* Shard Layout determination logic (shard boundaries are still determined offline and hardcoded)
* Advanced failure handling
* If a validator goes offline during resharding, it can join back immediately and move forward as long as enough time is left to redo the resharding.
* TBD

### Required protocol changes

A new protocol version will be introduced specifying the new shard layout.
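
As a rough sketch (with hypothetical type names and made-up version numbers and boundary accounts, not the actual neard code), tying the layout to the protocol version could look like this:

```rust
// Sketch only: hypothetical types and illustrative constants. The point
// is that the shard layout is a pure function of the protocol version,
// so voting for the upgrade implicitly schedules the new layout.

#[derive(Debug, Clone, PartialEq)]
struct ShardLayout {
    /// N boundary accounts define N + 1 contiguous account-range shards.
    boundary_accounts: Vec<String>,
    /// Bumped on every resharding.
    version: u32,
}

const RESHARDING_V2_PROTOCOL_VERSION: u32 = 64; // illustrative value

fn shard_layout(protocol_version: u32) -> ShardLayout {
    // All boundary accounts below are placeholders for illustration.
    if protocol_version >= RESHARDING_V2_PROTOCOL_VERSION {
        ShardLayout {
            boundary_accounts: vec![
                "boundary0.near".into(),
                "boundary1.near".into(),
                "boundary2.near".into(),
                "new-boundary.near".into(), // splits the old shard 3
            ],
            version: 2,
        }
    } else {
        ShardLayout {
            boundary_accounts: vec![
                "boundary0.near".into(),
                "boundary1.near".into(),
                "boundary2.near".into(),
            ],
            version: 1,
        }
    }
}

fn main() {
    let old = shard_layout(RESHARDING_V2_PROTOCOL_VERSION - 1);
    let new = shard_layout(RESHARDING_V2_PROTOCOL_VERSION);
    // The new layout has exactly one more boundary, i.e. one more shard.
    assert_eq!(new.boundary_accounts.len(), old.boundary_accounts.len() + 1);
}
```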

### Required state changes

* For the duration of the resharding the node will need to maintain a snapshot of the flat state and related columns.
* For the duration of the epoch before the new shard layout takes effect, the node will need to maintain the state and flat state of shards in the old and new layout at the same time.

### Resharding flow

* The new shard layout will be agreed on offline by the protocol team and hardcoded in the neard reference implementation.
* In epoch T, the protocol version upgrade date will pass and nodes will vote to switch to the new protocol version. The new protocol version will contain the new shard layout.
* In epoch T, in the last block of the epoch, the EpochConfig for epoch T + 2 will be set. The EpochConfig for epoch T + 2 will have the new shard layout.
* In epoch T + 1, all nodes will perform the state split. The child shards will be kept up to date with the blockchain until the end of the epoch.
* In epoch T + 2, the chain will switch to the new shard layout; see the sketch after this list.
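
A minimal sketch of this timeline, assuming hypothetical names that are not part of neard:

```rust
// Sketch of the resharding timeline relative to the epoch T in which
// the new protocol version is adopted.

#[derive(Debug, PartialEq)]
enum ReshardingPhase {
    OldLayout,         // before epoch T
    ScheduleNewLayout, // epoch T: EpochConfig for T + 2 is fixed
    SplitState,        // epoch T + 1: split state, keep children in sync
    NewLayout,         // epoch T + 2 onwards: new shard layout is live
}

fn phase(epoch: u64, upgrade_epoch: u64) -> ReshardingPhase {
    if epoch < upgrade_epoch {
        return ReshardingPhase::OldLayout;
    }
    match epoch - upgrade_epoch {
        0 => ReshardingPhase::ScheduleNewLayout,
        1 => ReshardingPhase::SplitState,
        _ => ReshardingPhase::NewLayout,
    }
}

fn main() {
    assert_eq!(phase(99, 100), ReshardingPhase::OldLayout);
    assert_eq!(phase(100, 100), ReshardingPhase::ScheduleNewLayout);
    assert_eq!(phase(101, 100), ReshardingPhase::SplitState);
    assert_eq!(phase(102, 100), ReshardingPhase::NewLayout);
}
```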

## Reference Implementation

This implementation heavily re-uses the implementation from [NEP-40](https://github.com/near/NEPs/blob/master/specs/Proposals/0040-split-states.md). Only the major differences and additions are listed below.

### Flat Storage

The old implementation of resharding relied on iterating over the full state of the parent shard in order to build the state for the children shards. This implementation was suitable at the time, but since then the state has grown considerably and it is now too slow to fit within a single epoch. The new implementation relies on flat storage in order to build the children shards quicker. Based on benchmarks, splitting one shard by using flat storage can take up to 15 minutes.

The new implementation will also propagate the flat storage for the children shards and keep it up to date with the chain until the switch to the new shard layout. The old implementation didn't handle this case because flat storage didn't exist back then.

In order to ensure a consistent view of the flat storage while splitting the state, the node will maintain a snapshot of the flat state and related columns. The existing implementation of flat state snapshots used in State Sync will be adjusted for this purpose.
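
For illustration, a heavily simplified sketch of the split is shown below. It assumes flat state is keyed directly by account id; the real flat storage keys are trie keys, and the real implementation works against the snapshot columns:

```rust
// Simplified sketch: split a parent shard's flat state snapshot into
// two children at a boundary account. Account ids order
// lexicographically, so one boundary cleanly partitions the range.

use std::collections::BTreeMap;

type AccountId = String;
type FlatState = BTreeMap<AccountId, Vec<u8>>; // account id -> raw value

fn split_flat_state(parent: &FlatState, boundary: &str) -> (FlatState, FlatState) {
    let mut left = FlatState::new();
    let mut right = FlatState::new();
    // Single pass over a consistent snapshot of the parent shard.
    for (account, value) in parent {
        if account.as_str() < boundary {
            left.insert(account.clone(), value.clone());
        } else {
            right.insert(account.clone(), value.clone());
        }
    }
    (left, right)
}

fn main() {
    let mut parent = FlatState::new();
    parent.insert("alice.near".into(), vec![1]);
    parent.insert("zed.near".into(), vec![2]);
    let (left, right) = split_flat_state(&parent, "m.near");
    assert!(left.contains_key("alice.near") && right.contains_key("zed.near"));
}
```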

### Handling receipts, gas burnt and balance burnt

When resharding, extra care should be taken when handling receipts in order to ensure that no receipts are lost or duplicated. The gas burnt and balance burnt also need to be correctly handled. The old resharding implementation for handling receipts, gas burnt and balance burnt relied on the fact that in the first resharding there was only a single parent shard to begin with. The new implementation will provide a more generic and robust way of reassigning the receipts, gas burnt and balance burnt that works for arbitrary splitting of shards, regardless of the previous shard layout.
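
As a hedged sketch, the generic reassignment could route each receipt by its receiver account under the new layout; the names below are illustrative, not the actual neard functions. Gas burnt and balance burnt would need an equally deterministic rule, elided here:

```rust
// Illustrative only: reassign receipts to the children shards by the
// receiver account. The same rule works for any number of boundaries,
// which is what makes it independent of the previous shard layout.

type AccountId = String;

struct Receipt {
    receiver_id: AccountId,
    // other fields elided
}

/// Index of the shard owning `account`, given the layout's sorted
/// boundary accounts (N boundaries -> N + 1 shards).
fn account_to_shard(account: &str, boundaries: &[AccountId]) -> usize {
    boundaries
        .iter()
        .position(|b| account < b.as_str())
        .unwrap_or(boundaries.len())
}

fn reassign_receipts(receipts: Vec<Receipt>, boundaries: &[AccountId]) -> Vec<Vec<Receipt>> {
    let mut per_shard: Vec<Vec<Receipt>> =
        (0..=boundaries.len()).map(|_| Vec::new()).collect();
    for receipt in receipts {
        per_shard[account_to_shard(&receipt.receiver_id, boundaries)].push(receipt);
    }
    per_shard
}

fn main() {
    let boundaries: Vec<AccountId> = vec!["m.near".into()];
    let receipts = vec![
        Receipt { receiver_id: "alice.near".into() },
        Receipt { receiver_id: "zed.near".into() },
    ];
    let per_shard = reassign_receipts(receipts, &boundaries);
    assert_eq!((per_shard[0].len(), per_shard[1].len()), (1, 1));
}
```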

### New shard layout

A new shard layout will be determined, scheduled, and executed in the production networks. The new shard layout will maintain the same boundaries for shards 0, 1 and 2. The heaviest shard today, shard 3, will be split by introducing a new boundary account. The new boundary account will be determined by analyzing the storage and gas usage within the shard and selecting a point that divides the shard roughly in half according to those metrics. Other metrics can also be used.
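
As an illustration of the selection, the boundary could be the first account at which the cumulative weight reaches half of the shard total; the accounts and weights below are made up:

```rust
// Made-up example of picking a split boundary: walk the accounts of the
// shard in order and stop once the cumulative weight (e.g. storage
// bytes or gas used) reaches half of the total.

fn pick_boundary<'a>(accounts_sorted: &'a [(&'a str, u64)]) -> &'a str {
    let total: u64 = accounts_sorted.iter().map(|&(_, w)| w).sum();
    let mut cumulative = 0u64;
    for &(account, weight) in accounts_sorted {
        cumulative += weight;
        if cumulative * 2 >= total {
            return account;
        }
    }
    accounts_sorted.last().expect("shard must not be empty").0
}

fn main() {
    let accounts = [("a.near", 40), ("m.near", 25), ("z.near", 35)];
    // The cumulative weight first reaches half of the total (100) at m.near.
    assert_eq!(pick_boundary(&accounts), "m.near");
}
```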

### Fixed shards

Fixed shards is a feature of the protocol that allows for assigning specific accounts, and all of their recursive sub accounts, to a predetermined shard. This feature was only used for testing; it was never used in production and there is no need for it in production. Unfortunately, this feature breaks the contiguity of shards: a sub account of a fixed shard account can fall in the middle of the account range that belongs to a different shard. This property of fixed shards makes it particularly hard to reason about and implement efficient resharding. In order to simplify the code and the new resharding implementation, the fixed shards feature was removed ahead of this NEP.
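
A tiny example of the contiguity problem, assuming standard lexicographic ordering of account ids (sub accounts in NEAR extend the name on the left):

```rust
// Why fixed shards break contiguity: "a.fixed.near" is a sub account of
// "fixed.near", but it sorts before "alice.near", far away from its
// parent in the account id order.

fn main() {
    let mut accounts = vec!["fixed.near", "alice.near", "a.fixed.near"];
    accounts.sort();
    assert_eq!(accounts, vec!["a.fixed.near", "alice.near", "fixed.near"]);
    // With fixed shards, "a.fixed.near" would follow "fixed.near" to its
    // predetermined shard, punching a hole in the middle of the
    // contiguous range that contains "alice.near".
}
```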

### Transaction pool

The transaction pool is sharded, i.e. it groups transactions by the shard where each should be converted to a receipt. The transaction pool was previously sharded by ShardId. Unfortunately, ShardId is insufficient to correctly identify a shard across a resharding event, as ShardIds change domain between shard layouts. The transaction pool was migrated to group transactions by ShardUId instead, and a transaction pool resharding was implemented to reassign transactions from the parent shard to the children shards right before the new shard layout takes effect. This was implemented ahead of this NEP.
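
A sketch of why ShardUId disambiguates shards across a resharding; the field names are assumed for illustration:

```rust
// Sketch: ShardId alone is ambiguous across reshardings (the new layout
// also has a shard 3, covering a different account range), while the
// (version, shard_id) pair in ShardUId stays unique.

use std::collections::HashMap;

#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ShardUId {
    version: u32,  // shard layout version, bumped on resharding
    shard_id: u32, // shard index within that layout
}

struct Transaction {
    signer_id: String,
    // other fields elided
}

// Pool keyed by ShardUId: transactions queued under the old layout's
// shard 3 never collide with the new layout's shard 3.
type ShardedTransactionPool = HashMap<ShardUId, Vec<Transaction>>;

fn main() {
    let mut pool: ShardedTransactionPool = HashMap::new();
    let old = ShardUId { version: 1, shard_id: 3 };
    let new = ShardUId { version: 2, shard_id: 3 };
    pool.entry(old).or_default().push(Transaction { signer_id: "alice.near".into() });
    pool.entry(new).or_default().push(Transaction { signer_id: "bob.near".into() });
    assert_eq!(pool.len(), 2); // same shard_id, two distinct shards
}
```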

## Security Implications

TBD
* What is the impact of not doing this?
* TBD

## Integration with State Sync

TBD

## Integration with Stateless Validation

TBD

## Future possibilities

As noted above, dynamic resharding is out of scope for this NEP and should be implemented in the future. Dynamic resharding includes, but is not limited to, the following:
