Fork Handling

Design Notes

Fork Detection

We plan to scan the entire chain (~10^6 blocks) with a binary search, which should take ~20 steps in the worst case. This allows us to find the first block level at which the locally indexed hash no longer matches the chain, i.e. where a fork eventually happened.

We define a type BlockIdSearch = Level => BlockHash and therefore depend on a pair of such search services:

  • Indexer BlockIdSearch
  • Node BlockIdSearch

Trigger: at each cycle start, we run such a search to verify whether the node and Conseil agree on the block hash at the level of the locally indexed (conseil) head block.

A ForkDetector is built from those same search functions to run the binary scan algorithm that finds the lowest Level where local and remote searches yield different block-id values.
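
To make the scan concrete, here is a minimal sketch of the binary search in Scala, with effects left out for clarity and purely illustrative names (the real detector returns its results through an effect layer, as described in the Implementation Notes below). It assumes the two searches agree at the lower bound (e.g. genesis) and disagree at the upper bound (the indexer head), which is exactly what the trigger above establishes.

object ForkDetectionSketch {
  type Level         = Long
  type BlockHash     = String
  type BlockIdSearch = Level => BlockHash

  // Finds the lowest level where the indexed and remote hashes differ, assuming
  // they match at `low` (e.g. genesis) and differ at `high` (the indexer head).
  def searchForkLevel(indexed: BlockIdSearch, remote: BlockIdSearch)(low: Level, high: Level): Level = {
    @annotation.tailrec
    def loop(lo: Level, hi: Level): Level =
      if (lo + 1 >= hi) hi
      else {
        val mid = lo + (hi - lo) / 2
        if (indexed(mid) == remote(mid)) loop(mid, hi) else loop(lo, mid)
      }
    loop(low, high)
  }
}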

Fork audit log

We add a table (forks) that will log the detection results with the following data:

  • fork_id - a unique identifier for this specific fork being detected
  • fork_level - at which the fork was detected
  • fork_hash - the block hash which was previously stored at the forked level
  • head_level - reached by conseil when the fork was detected
  • timestamp - when the record is added (corresponds to the local clock time when detection happened)
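
For illustration only, a row of the forks table could be modelled in Scala with a shape like the following; the field names simply mirror the columns above and are not the actual Conseil schema classes.

import java.time.Instant

final case class ForkRecord(
  forkId:    String,  // unique identifier for this specific detected fork
  forkLevel: Long,    // level at which the fork was detected
  forkHash:  String,  // block hash previously stored at the forked level
  headLevel: Long,    // head level reached by conseil when the fork was detected
  timestamp: Instant  // local clock time when the record is added
)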

Fork Correction

We introduce an additional column, named “invalidated_asof”, in each table that could be impacted by the fork, i.e. every table referring to invalidated blocks/levels. The column will contain a timestamp of the local clock time when the detection happened, corresponding to the newly added forks table record. By default the column holds null.

This alone would not avoid duplication of entries if the chain should “re-validate” any invalidated block at the next fetch cycle. That is, if the chain is bi-stable, the resulting data values might “oscillate” between two alternatives, along two corresponding lines of evolution for the chain.

That would mean that, on each fetch cycle, previously invalidated blocks (for example) might be downloaded again while the same primary key is already stored in the table, failing the transaction. The same goes for any block-derived entity which was invalidated and then re-stored.

Therefore, as a safety precaution, we make the fork-id of the detected fork, as stored in the forks audit table, part of the unique primary key, adding this extra column to every table that allows invalidation. The column should be primed by default with a fork identifier which is always available, corresponding to an “official fork”; this is necessary for data integrity, as a primary-key column can’t be null. We use leader as such identifier for the officially running "branch".

When the detection happens, a process will identify all records referencing the fork level and above and mark their invalidated_asof column with the current time. Additionally, it will update the fork-id FK to the newly detected fork id.
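
As a sketch of what this correction could look like, assuming Slick's plain-SQL support and an illustrative blocks table name (the same pattern would apply to every table carrying the invalidated_asof and fork_id columns):

import java.sql.Timestamp
import slick.jdbc.PostgresProfile.api._

object InvalidationSketch {
  // Mark every still-valid record at or above the fork level as invalidated and
  // re-assign it to the newly detected fork. Column names follow the design above;
  // the table name is illustrative.
  def invalidateBlocks(forkLevel: Long, forkId: String, detectionTime: Timestamp): DBIO[Int] =
    sqlu"""UPDATE blocks
           SET invalidated_asof = $detectionTime, fork_id = $forkId
           WHERE level >= $forkLevel AND invalidated_asof IS NULL"""
}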

Fork Handling

The overall handling will include

  1. Trigger the fork detection on the latest conseil block
  2. Detect the fork level [possibly]
  3. Add an entry to the forks table and
  4. Update all records referring to the fork level and up, setting the invalidated_asof and the new fork_id

After that, Lorre indexing will resume as usual, simply ignoring the invalidated records. This will consistently rebuild any missing data from the fork onward.

Additional corrections

The handling process will perform extra custom compensating actions to "reset" some of the indexer state, as in:

  • removing global events to be processed (like a chain-wide update of accounts)
  • resetting the in-memory state for those events

Further compensating actions are yet to be defined for getting big-maps in a consistent state.

Fork Querying

All the code that creates queries for the impacted tables now includes an optional parameter that determines whether the results include records with a non-null “invalidated_asof” column value, that is, entries that are not part of the currently official branch.

Any “regular” conseil query would usually run with the default flag, thus ignoring any forked data.
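
A minimal sketch of how such a flag could be honored, assuming a Slick mapping and a hypothetical blocks table reduced to the columns of interest:

import java.sql.Timestamp
import slick.jdbc.PostgresProfile.api._

object ForkAwareQueries {
  // Hypothetical minimal mapping of the blocks table, limited to what's needed here.
  class Blocks(tag: Tag) extends Table[(Long, String, Option[Timestamp])](tag, "blocks") {
    def level           = column[Long]("level")
    def hash            = column[String]("hash")
    def invalidatedAsof = column[Option[Timestamp]]("invalidated_asof")
    def * = (level, hash, invalidatedAsof)
  }
  val blocks = TableQuery[Blocks]

  // By default only the official branch is visible; the flag re-includes forked data.
  def blocksQuery(includeInvalidated: Boolean = false): Query[Blocks, (Long, String, Option[Timestamp]), Seq] =
    if (includeInvalidated) blocks else blocks.filter(_.invalidatedAsof.isEmpty)
}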

We would provide additional user-accessible endpoints exposing the invalidation information, to make the situation “inspectable” by the API clients.

This would include a view of the forks table itself, which will enable verification and corrective actions to be taken after a fork event.

Implementation Notes

Here we outline the implementation of the design solution, optimizing for simplicity and, when needed, performance.

The first thing to consider is that the abstract design doesn’t realistically account for the possible failures and the non-deterministic nature of the operations at hand.
We now add an Eff[T] layer for results that might exhibit such behaviour, and generally assume that said effects will be handled with common-sense handlers, especially taking into consideration failures, be they system-related or other contingencies like timeouts or the temporary unavailability of a service.

This will change the signatures as shown here

BlockIdSearch = Level => Eff[BlockHash]

// from a verification range to the forked block's level
// the starting range bounds are the genesis level and the current indexer tip block
ForkDetector = LevelRange => Eff[Level]

// from forked level and block-id to the state updates
ForkAmender = (Level, BlockHash) => Eff[Amendments]
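
As an illustration of how these pieces compose into one handling round, here is a sketch using Future as a stand-in for Eff, with the LevelRange collapsed to a pair of bounds and Amendments reduced to a placeholder; the names are illustrative, not the actual Conseil API.

import scala.concurrent.{ExecutionContext, Future}

object ForkHandlingSketch {
  type Level      = Long
  type BlockHash  = String
  type Amendments = Int // e.g. the number of invalidated records, purely illustrative

  type BlockIdSearch = Level => Future[BlockHash]
  type ForkDetector  = (Level, Level) => Future[Level]
  type ForkAmender   = (Level, BlockHash) => Future[Amendments]

  // One handling round: compare heads, detect the fork level, amend the data.
  def handleFork(
      indexedSearch: BlockIdSearch,
      nodeSearch: BlockIdSearch,
      detector: ForkDetector,
      amender: ForkAmender,
      indexerHead: Level
  )(implicit ec: ExecutionContext): Future[Option[Amendments]] =
    for {
      local  <- indexedSearch(indexerHead)
      remote <- nodeSearch(indexerHead)
      result <- if (local == remote) Future.successful(Option.empty[Amendments])
                else
                  for {
                    forkLevel <- detector(0L, indexerHead) // scan genesis..head
                    forkHash  <- indexedSearch(forkLevel)  // hash previously stored locally
                    amended   <- amender(forkLevel, forkHash)
                  } yield Some(amended)
    } yield result
}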

Testing the process

We define abstract interfaces to describe the steps, gaining a simple way to mock the external services and providing only the minimal set of functionality and data strictly required by the individual tests. The goal is to simplify the test scenarios, which don’t need the fully exploded information at each step of the process under consideration.

How to store the data and perform the actions

The search and index functions are implemented either as internal queries against the local database or as remote queries to the available tezos node. In consideration of this, the actual implementation (for Tezos) uses a temporary in-memory storage, of fairly basic complexity, whose goal is to keep the remote node responses that might be re-used across the different steps and within the same step.

E.g. we search remote hashes by level and get back the whole block data json payload from the node endpoint. We store the extracted data of interest and re-use it, first for hash comparison and later in the amendment step, to obtain the deserialized data for any level that is considered forked.
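
A minimal sketch of such a cache, with a simplified BlockData placeholder and effects left out for brevity; the real storage wraps the node calls in the effect layer described above.

import scala.collection.concurrent.TrieMap

// BlockData stands in for the deserialized block payload of interest.
final case class BlockData(hash: String, json: String)

// Caches remote node responses keyed by level, so the hash lookups of the detection
// phase and the block data needed by the amendment phase reuse the same node call.
final class RemoteBlockCache(fetchFromNode: Long => BlockData) {
  private val store = TrieMap.empty[Long, BlockData]

  def blockAt(level: Long): BlockData = store.getOrElseUpdate(level, fetchFromNode(level))
  def hashAt(level: Long): String     = blockAt(level).hash
  def clear(): Unit                   = store.clear()
}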

As mentioned, we couldn’t address here data that is computed progressively during the regular chain synchronization steps.
Some of this data might require a form of later correction or re-evaluation, based on the invalidation/validation of new data. We need to evaluate such secondary impact on the indexed data in more detail.

Example use case: custom chain-wide event handling

When handling the babylon airdrop to all accounts, we implemented a chain-wide event handler system that triggers custom code upon reaching a given chain block-level. One use is that, at the levels shortly after the babylon upgrade, we reload the status of every account stored in the chain from the node itself, to correct the now-incorrect balances.

This event system is impacted by any fork happening (not for babylon specifically, but for future events). A way to handle it is to remove the records keeping track of past processed events (in their dedicated table) for levels higher than the fork point. Having done that, the main indexing loop, which keeps track of those in an in-memory map, will simply need to restore the event-processed records from the database before the next loop, to have a consistent view of the events not yet processed.
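
A possible shape for this reset, again assuming Slick's plain-SQL support; the table and column names here are illustrative, not the actual Conseil schema.

import slick.jdbc.PostgresProfile.api._

object EventResetSketch {
  // Drop the record of already-processed chain-wide events above the fork point,
  // so the next indexing loop reloads the remaining ones from the database.
  def resetProcessedEvents(forkLevel: Long): DBIO[Int] =
    sqlu"DELETE FROM processed_chain_events WHERE event_level > $forkLevel"
}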

Arising Issues

[In progress]

  • tezos_names: the table doesn’t reference the block, yet its values might be invalidated nonetheless
  • BigMap-related tables (big_maps, big_map_contents, originated_account_maps): they have no reference to the level, but would be invalidated nonetheless

Detector shortcomings for the simplest implementation

A reasonable question is, what happens if the fork detector algorithm tries to locate a moving target?

Suppose the node we connect to undergoes multiple fork jumps even during the detection: every time the detector compares a remote hash at a given level with the locally stored hash, the remote might actually be on a different fork.

Expected Scenario

genesis                           fork     local head
|                                  |           |
------------------------------------------------------------> lvls
                        |       | |^|  |        
                        1       3 5 4  2         

With binary search we move more or less as shown in the diagram.

Every position marked on the levels axis requires asking the remote node for a hash to compare with the local one, to decide whether we need to search higher/lower or whether we found the match.

This assumes the remote to be in a state of equilibrium or to change so slowly as to have no impact during the detection process. If the node is not in equilibrium the values returned might represent different states (especially during short-lived seesaw patterns).

We might consider:

  1. evaluating the time-scale of fork oscillation on the node (based on the underlying block baking scale and consensus propagation) and comparing it to the mean time of the detection calls (a network IO request over HTTP essentially can’t be guaranteed to be bounded). This could produce an estimated probability of having a stable detection during forks, which doesn’t really solve the problem but manages the risk.
  2. adding an extra verification step once the fork level is detected. By this we mean simply verifying that the predecessor of the newly detected forked block matches the locally stored hash at the corresponding level. We only need this to ensure that we keep the local chain "consistent" throughout its length. Even if the chain should jump to a new fork in the following fetch cycles, the process would simply rerun and fix the current situation.

We eventually went for option 2.

As long as we make sure that the chain is locally consistent with no interruption, we can always track back a fork point in future detection/correction rounds.

We need to avoid having chunks of inconsistent blocks before a future fork point, because our detection process won't go as far as verifying the whole length: it assumes that, once the fork level is found, everything before it is ok and everything after it is invalidated.

Consider this scenario, based on the previous diagram:

  • We'll call the local block-chain branch fork#base
  • The node suddenly experiences a fork and we identify that fork as fork#alt, whose fork level is marked in the diagram.
  • The local head differs from the remote hash, the binary search is triggered to find the fork level
  • point#1 on the diagram will give matching local and remote hashes
  • The remote node changes branch once again, back to fork#base
  • When checking point#2 the detection will find matching blocks and will not go to point#3 but to point#3b marked below
  • The remote node once more jumps to fork#alt
  • Since the fork is now before point#2, the binary search will always move further towards it, eventually (incorrectly) identifying the level immediately following point#2 as the fork point.

This is how it would look:

Failing Scenario

genesis                           fork     local head
|                                  |           |
------------------------------------------------------------> lvls
                        |       - -^-  ||| |    
                        1       3 5 4  2   3b

Given this situation, if the correction proceeded from here, we would lose all corrections between the fork level and point#2; most probably they would never be evaluated again, or they would create issues for any successive fork detection checking that chain range. Instead, an additional check on the predecessor hash of the remote block at level point#2 + 1 would show that it doesn't match the locally stored hash at point#2, hence invalidating the detection.

We can then rerun the detection process until it finds a fork level whose predecessor matches locally, too. This at least guarantees that no invalid block is present up to that fork point, i.e. that the chain sequence is consistent along its predecessors down to genesis.

We use this kind of double-check in the ConsistentForkDetector implementation.
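
A minimal sketch of that double-check, with illustrative names (the actual ConsistentForkDetector wraps these lookups in the effect layer described above):

object ConsistentDetectionSketch {
  type Level     = Long
  type BlockHash = String

  // The candidate fork level is accepted only if the remote block right after the
  // last matching local level points back to the hash we have stored, i.e. the two
  // chains agree on everything strictly before the candidate fork level.
  def isConsistent(
      localHashAt: Level => BlockHash,
      remotePredecessorOf: Level => BlockHash,
      candidateForkLevel: Level
  ): Boolean =
    candidateForkLevel == 0 ||
      remotePredecessorOf(candidateForkLevel) == localHashAt(candidateForkLevel - 1)
}

If the check fails, the detection is simply rerun, as described above.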