# Reorgs (#162)
This is a proposal for how Ponder will handle chain reorganizations.
## TL;DR

**Summarized approach**

- Use the `removed` logs returned by `eth_getFilterChanges` to determine if a reorg that affects logs we care about has occurred.
- Maintain a finalization cutoff that trails the chain head, periodically shift it forward by `finalizationBlockCount`, and convert data from unfinalized -> finalized during this process.

**Key questions**

- Can we rely on the `removed` logs from `eth_getFilterChanges` to detect reorgs? Do all Ethereum clients behave the same way? The documentation around removed logs is spotty.

## Background
Before diving in, here's some necessary context.
### Blockchain data service

Ponder has a number of internal services. The blockchain data service is responsible for fetching and caching blockchain data for the contracts and filters specified in `ponder.config.ts`.

During the backfill, the service uses `eth_getLogs` and `eth_getBlockByHash` to fetch every event log and its associated block & transaction, then writes this data to the `BlockchainDataStore`.

The "frontfill" (real-time data) process is slightly different. On startup, Ponder gets the latest block for each network, and this is used as the cutoff block number separating the backfill and the frontfill. Then, an event filter is created using `eth_newFilter`, and the service starts polling for new logs matching the filter using `eth_getFilterChanges`. Whenever new logs are returned, the service calls `eth_getBlockByHash` to fetch the block and tx for each log (just like in the backfill), and writes all of this data to the `BlockchainDataStore`.
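The frontfill call sequence described above can be sketched roughly as follows. This is an illustration of the RPC calls only, not Ponder's actual implementation; `rpc` stands in for a generic JSON-RPC transport.

```typescript
// A generic JSON-RPC transport (assumption for illustration).
type Rpc = (method: string, params: unknown[]) => Promise<any>;

// Create the filter that the frontfill will poll. fromBlock is the cutoff
// block separating the backfill and the frontfill.
async function startFrontfill(rpc: Rpc, cutoffBlock: string): Promise<string> {
  return rpc("eth_newFilter", [{ fromBlock: cutoffBlock }]);
}

// One poll: fetch new logs, then fetch each log's block (with full
// transaction objects) by hash, as the backfill does.
async function pollOnce(rpc: Rpc, filterId: string): Promise<string[]> {
  const logs: { blockHash: string }[] = await rpc("eth_getFilterChanges", [filterId]);
  // Deduplicate block hashes so each block is fetched only once.
  const blockHashes = Array.from(new Set(logs.map((log) => log.blockHash)));
  for (const hash of blockHashes) {
    // `true` asks the node to include full transaction objects in the block.
    await rpc("eth_getBlockByHash", [hash, true]);
  }
  return blockHashes;
}
```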
(Note: Ponder's other major service is the event handler service. It queries log, block, and transaction data from the `BlockchainDataStore`, constructs `event` objects, and passes them to user-defined handler functions. The handler functions then insert data into the `EntityStore`, which in turn powers the GraphQL API.)

### Cached ranges
The `BlockchainDataStore` has a table called `LogFilterCachedRanges`. Whenever the backfill or frontfill writes a log (and its associated block & transaction) to the `BlockchainDataStore`, this table gets updated to reflect the range of blocks that are now "cached" for this log filter.

## Problems with the current approach
- The frontfill assumes that logs returned by `eth_getFilterChanges` are finalized, which is not true. This means that whenever there is a reorg, the `BlockchainDataStore` ends up out of sync with the network.
- Using `eth_getBlockByNumber(blockTag: "latest")` as the cutoff block number between the backfill and the frontfill is unsafe, because it assumes all blocks up to the latest block are finalized.

## Proposal

### New concepts

#### Finalized vs unfinalized blockchain data
Today, all of the tables in the `BlockchainDataStore` treat blockchain data as finalized. The proposed approach introduces a new set of tables for unfinalized data, using the naming scheme `Unfinalized{TableName}`. The finalized tables are just called `{TableName}`.

#### `eth_getFilterChanges` behavior

A quick note on the behavior of `eth_getFilterChanges`. Log objects returned by this method have a boolean `removed` field. Most of the time, `removed` is false, which means this is a new log. If `removed` is true, it means that this log was returned in a previous `eth_getFilterChanges` poll for this filter ID, but has since been reorged out. For example, if an earlier poll returned a log from block `1` and that block is then reorged out, a later poll's response contains the removed logs from block `1`, and also includes any new logs from the accepted replacement block.

### Procedure
The proposed reorg handling approach uses `removed` logs to detect reorgs. Consider the following approach for a single network:
1. Determine the latest finalized block number. Use this as the cutoff between the backfill and the frontfill (the "finalization cutoff"). Also determine the `finalizationBlockCount` for the network (64 on mainnet).
2. Run the backfill for the requested block range up to (and including) the finalization cutoff. Write data to the finalized tables.
3. For the frontfill, create a filter (`eth_newFilter`) for each log filter, with `fromBlock` equal to the finalization cutoff.
4. Poll for changes using `eth_getFilterChanges`. If we receive a batch of logs without any removed logs, process them normally:
   - Create an `UnfinalizedLog` record for each log.
   - Fetch each log's block via `eth_getBlockByHash` (include transactions). Create an `UnfinalizedBlock` record and an `UnfinalizedTransaction` record for each transaction in the block that emitted a matched log. Also update the `blockTimestamp` field of any `UnfinalizedLog` records created.
   - Update the `UnfinalizedLogFilterCachedRange` record accordingly.
5. If we receive a batch of logs with removed logs:
   - Delete the `UnfinalizedLog` record for each removed log. Also delete the `UnfinalizedBlock` and `UnfinalizedTransaction` records associated with each removed log (by hash).
   - Emit a `newReorganization` event.
6. Once a batch of logs is handled, if `UnfinalizedLogFilterCachedRange.toBlock > finalizationCutoff + (2 * finalizationBlockCount)`, shift the finalization cutoff: `finalizationCutoff = previousFinalizationCutoff + finalizationBlockCount`. Then:
   - Move `UnfinalizedLog`, `UnfinalizedTransaction`, and `UnfinalizedBlock` records with `blockNumber < finalizationCutoff` over to their corresponding finalized tables.
   - Update `LogFilterCachedRange.toBlock` to `finalizationCutoff`.
## Misc notes
### Finding the current finalized block number
Some EVM networks support the `"finalized"` and `"safe"` commitment levels. If these are available, use `eth_getBlockByNumber("finalized")` to get the current finalized block number. If they are not available, this function could check if the chain ID is present in a manual list of known networks with slow finality (e.g. Polygon). Otherwise, we can assume instant / single-slot finality and use `eth_getBlockByNumber("latest")`.
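The fallback chain above might look something like this. The set of slow-finality chain IDs and the strategy names are assumptions for illustration:

```typescript
// Manual list of known slow-finality networks (illustrative; e.g. Polygon).
const SLOW_FINALITY_CHAIN_IDS = new Set<number>([137]);

type FinalityStrategy = "finalized-tag" | "manual-depth" | "latest";

// Decide how to find the finalized block number for a network.
function finalityStrategy(chainId: number, supportsFinalizedTag: boolean): FinalityStrategy {
  if (supportsFinalizedTag) return "finalized-tag"; // eth_getBlockByNumber("finalized")
  if (SLOW_FINALITY_CHAIN_IDS.has(chainId)) return "manual-depth"; // known slow-finality network
  return "latest"; // assume instant / single-slot finality
}
```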
### Entity store snapshots
Today, the event handler service has no mechanism for reverting the execution of logs that are later reorged out. There are two potential approaches for resolving this: 1) entity table snapshots, or 2) store all entity versions (aka time-travel). This is out of scope for this proposal, but worth mentioning because it is a source of significant complexity elsewhere in the design.
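For a sense of what option 2 (time-travel) could involve, here is a sketch of versioned entity rows and a revert step. The row shape and function are assumptions for illustration, not Ponder's schema:

```typescript
// Each write creates a new version and closes the previous one; a version
// with effectiveToBlock === null is the current one.
interface EntityVersion {
  id: string;
  data: Record<string, unknown>;
  effectiveFromBlock: number;      // block whose handler wrote this version
  effectiveToBlock: number | null; // null = still current
}

// Undo the effects of all blocks after `ancestorBlock` (the last block still
// on the canonical chain): drop versions written after it, and reopen the
// versions that those dropped writes had superseded.
function revertToBlock(versions: EntityVersion[], ancestorBlock: number): EntityVersion[] {
  return versions
    .filter((v) => v.effectiveFromBlock <= ancestorBlock)
    .map((v) =>
      v.effectiveToBlock !== null && v.effectiveToBlock > ancestorBlock
        ? { ...v, effectiveToBlock: null } // superseded by a reverted write; reopen
        : v,
    );
}
```

The appeal of this approach over table snapshots is that reverting is a cheap delete-and-reopen rather than a full table restore, at the cost of storing every historical version.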