docs(adr-18): remove hardcoded upgrade height (#3482)
rootulp authored May 15, 2024
1 parent 06b4c96 commit a251b91
Showing 1 changed file with 2 additions and 1 deletion.
3 changes: 2 additions & 1 deletion docs/architecture/adr-018-network-upgrades.md
@@ -9,6 +9,7 @@ Proposed
- 2023/03/29: Initial draft
- 2023/08/29: Modified to reflect the proposed decision and the detailed design
- 2023/10/14: Update ADR to reflect changes in implementation
+- 2024/05/14: Update ADR to reflect the lack of hardcoded upgrade heights

## Context

@@ -44,7 +45,7 @@ Given this, a node can at any time spin up a v2 binary which will immediately be

### Configured Upgrade Height

-The height of the upgrades will initially be hard coded into the binary. This will consist of a mapping from chain ID to app version to a range of heights that will be loaded by the application into working memory whenever the node begins and supplied directly to the `upgrades` module which will be responsible for scheduling. The chainID is required as we expect the same binary to be used across testnets and mainnet. There are a few considerations that shape how this system will work:
+The height of the v1 -> v2 upgrade will initially be supplied via a CLI flag (i.e. `--v2-upgrade-height`). There are a few considerations that shape how this system will work:

- Upgrading needs to support state migrations. These must happen on all nodes at the same moment, between heights. Ideally, all migrations that affect state would take place at the height of the new app version, i.e. after `Commit` and before the transactions at that height are processed. `BeginBlock` seems like an ideal place to perform these upgrades; however, they might affect how `PrepareProposal` and `ProcessProposal` are conducted, so they must be performed before these ABCI calls. A simpler implementation would have been for the proposer to immediately propose a block with the next version, i.e. v2. However, that would require the proposer to first migrate state (taking an unknown length of time) and the validators receiving that proposal to migrate before validating. Given that the upgrade is not certain, there would also need to be a mechanism to migrate back to v1 (NOTE: this remains the case if we wish to support downgrading, which is discussed later). To avoid these requirements, the proposer must signal in the prior height the intention to upgrade to a new version. This is done with a new message type, `MsgVersionChange`, which must be the first transaction in the block. Validators read this and, if they agree to support the version change, vote on the block accordingly. If the block reaches consensus, all validators update the app version at `EndBlock`. CometBFT will then propose the next block using that version. Nodes that have not upgraded and do not support that version will error and exit. Given that the previous block was approved by more than 2/3 of the network, we have a strong guarantee that this block will be accepted by the network.
However, given a security model that must withstand 1/3 Byzantine nodes, even a single Byzantine node that voted for the upgrade yet doesn't vote for the following block can stall the network until more than 2/3 of nodes upgrade and vote on the following block.
- Given uncertainty in scheduling, the system must be able to handle changes to the upgrade height, most commonly in the form of delays. Embedding the upgrade schedule in the binary is convenient for node operators and avoids the possibility of user error. However, binaries are static. If the community wished to push back the upgrade by two weeks, some nodes might not switch to the new binary, and we'd get a split between nodes running the old schedule and nodes running the new schedule. To overcome this, proposers will only propose a version change in the first round of each height, thus allowing transactions to still be committed even when there is no consensus on upgrading. Secondly, we define a range of heights in which nodes will attempt to upgrade the app version; failing this, they will continue to run the current version. Lastly, the binary will have the ability to manually specify the app version to height mapping and override the built-in values, either through a flag or in the `app.toml` config. This is expected to be used in testing and in emergency situations only. Another example to keep in mind is a quorum outright rejecting an upgrade. If some of the validators are for the change, they should have some way to continue participating in the network. Therefore, we employ a range in which nodes will attempt to upgrade; afterwards, they will continue normally with the new binary while running the older version.
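
The proposer-side and validator-side rules described in the considerations above can be sketched roughly as follows. This is a minimal illustration only, not the actual `celestia-app` implementation: the `UpgradeRange` type and both function names are hypothetical, and the sketch assumes the upgrade range is an inclusive pair of heights.

```go
package main

// UpgradeRange is a hypothetical type for this sketch: the inclusive range of
// heights in which nodes attempt to move to Version.
type UpgradeRange struct {
	Start   int64
	End     int64
	Version uint64
}

// ShouldProposeVersionChange reports whether a proposer would place a
// version-change message first in its block. Following the text above, this
// happens only in the first round (round 0) of a height inside the range, and
// only while the app is still running an older version.
func ShouldProposeVersionChange(r UpgradeRange, height int64, round int32, current uint64) bool {
	if current >= r.Version {
		return false // already upgraded; nothing to signal
	}
	if round != 0 {
		return false // later rounds carry no version change, so transactions can still commit
	}
	return height >= r.Start && height <= r.End
}

// SupportsVersion reports whether a validator's binary supports the proposed
// app version, which is the precondition for voting on a block that carries a
// version change.
func SupportsVersion(supported []uint64, proposed uint64) bool {
	for _, v := range supported {
		if v == proposed {
			return true
		}
	}
	return false
}
```

Under these assumptions, a node past the end of the range never proposes a version change and simply keeps running its current app version, which matches the fallback behaviour described for a rejected or missed upgrade.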