[Feature] Multi chain rewind #2620

yoozo · 2024-12-05T02:57:57Z

Description

In a multi-chain project with historical mode enabled, a rewind on one chain must be coordinated with the processes indexing the other chains so that they roll back as well. The tasks we need to accomplish are listed below.

Scenarios

A fork in the chain
Project migration
A node only needs to roll back if it has indexed past the rollback timestamp.

TODO

A cold start must execute any pending rollback.
A pause mechanism is needed where all fetch processes stop in case of a rollback.
A logger notification is needed when the API fetch is paused.
Before performing a write, check whether _global.rewind_lock exists. If it does, perform a rollback.
When starting the process, check whether a rewind_timestamp record exists. If it does, perform a rewind.
Check whether _global contains any rewind_timestamp records. If none exist, release _global.rewind_lock.
Retrieve the block height to roll back to from the timestamp. This may require a binary search.
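The timestamp-to-height lookup in the last item could be sketched as a binary search. This is only an illustration: `findRewindHeight` and `getTimestamp` are hypothetical names, the lookup is written synchronously for brevity (a real RPC call would be asynchronous), and it assumes block timestamps never decrease as height increases.

```typescript
// Hypothetical lookup: returns the timestamp of the block at `height`.
type GetBlockTimestamp = (height: number) => number;

// Returns the highest block height in [lowHeight, highHeight] whose timestamp
// is <= targetTimestamp, or undefined if every block in range is newer.
// Assumes timestamps are non-decreasing with height.
function findRewindHeight(
  getTimestamp: GetBlockTimestamp,
  targetTimestamp: number,
  lowHeight: number,
  highHeight: number,
): number | undefined {
  if (lowHeight > highHeight || getTimestamp(lowHeight) > targetTimestamp) {
    return undefined;
  }
  let lo = lowHeight;
  let hi = highHeight;
  // Invariant: timestamp(lo) <= targetTimestamp and the answer lies in [lo, hi].
  while (lo < hi) {
    const mid = lo + Math.ceil((hi - lo) / 2); // upper midpoint guarantees progress
    if (getTimestamp(mid) <= targetTimestamp) {
      lo = mid;
    } else {
      hi = mid - 1;
    }
  }
  return lo;
}
```

Each probe costs one block lookup, so a range of N blocks needs roughly log2(N) lookups rather than a linear scan.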

Pending matters

The RPC does not support querying block height by timestamp, but our current POI and dynamicDs rely on block height for rollback.
Only one node needs to rewind the data, but each node needs to rewind its own metadata (to find the block for that timestamp).

Solution

The pause mechanism can be implemented using _global.rewind_lock, which takes priority during a rewind. At the same time, a rewind_timestamp record is written to _global for each chain, indicating the timestamp that chain needs to roll back to.

The _global table ensures that each chain executes the rewind.
The rewind_timestamp records ensure that the API fetch stays paused until all chains have completed the rewind.

CREATE TABLE "_global" (
  "key" varchar(255) NOT NULL,
  "value" jsonb,
  "createdAt" timestamptz(6) NOT NULL,
  "updatedAt" timestamptz(6) NOT NULL,
  CONSTRAINT "_global_pkey" PRIMARY KEY ("key")
);

When a rollback is required, the data in the _global table looks like this:

Key	Value
rewind_lock	$timestamp
rewind_timestamp_ethereum	$timestamp
rewind_timestamp_base	$timestamp

Issues

How does the rewind lock get lifted?

After each process completes its reindex, it checks whether any other rewind tasks remain in _global. If none exist, it releases _global.rewind_lock.
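A minimal sketch of this release step, using an in-memory `Map` as a stand-in for the _global table. The function name `completeRewind` is hypothetical, and a real implementation would presumably do the delete and the check inside one database transaction.

```typescript
const LOCK_KEY = "rewind_lock";
const TIMESTAMP_PREFIX = "rewind_timestamp_";

// Called by a chain's process once its own reindex has finished.
// Removes that chain's rewind record and releases the lock if no other
// chain still has a pending rewind. Returns true if the lock was released.
function completeRewind(globalRows: Map<string, number>, chain: string): boolean {
  globalRows.delete(`${TIMESTAMP_PREFIX}${chain}`);
  const othersPending = Array.from(globalRows.keys()).some((key) =>
    key.startsWith(TIMESTAMP_PREFIX),
  );
  if (othersPending) {
    return false; // another chain is still rewinding, keep the lock
  }
  globalRows.delete(LOCK_KEY);
  return true;
}
```

With rows for both `ethereum` and `base` present, the first chain to finish leaves the lock in place and the last one removes it.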

Is there polling to check the metadata tables?

No. However, before fetching through the API, the process must check whether _global.rewind_lock exists. If it does, it sets an interval to check whether the rewind has completed, and logs messages telling the user to wait for the other chains to complete the rewind.
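One way this check could look is sketched below. `RewindLockWaiter` and `waitForRewind` are hypothetical names, `lockExists` stands in for a query against _global.rewind_lock, and the log wording is illustrative only.

```typescript
class RewindLockWaiter {
  private paused = false;
  private lockExists: () => boolean;
  private log: (message: string) => void;

  constructor(lockExists: () => boolean, log: (message: string) => void) {
    this.lockExists = lockExists;
    this.log = log;
  }

  // Returns true when fetching may proceed. While the lock exists, each
  // call logs a message so the user knows why fetching is paused.
  check(): boolean {
    if (this.lockExists()) {
      this.paused = true;
      this.log("Rewind in progress, waiting for other chains to complete the rewind");
      return false;
    }
    if (this.paused) {
      this.paused = false;
      this.log("Rewind completed, resuming fetch");
    }
    return true;
  }
}

// Checks once immediately, then on an interval until the lock is gone.
function waitForRewind(waiter: RewindLockWaiter, intervalMs: number, onReady: () => void): void {
  if (waiter.check()) {
    onReady();
    return;
  }
  const timer = setInterval(() => {
    if (waiter.check()) {
      clearInterval(timer);
      onReady();
    }
  }, intervalMs);
}
```

The interval only runs while the lock is held, so there is no background polling during normal indexing.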

What if the node that requested the rewind restarts within that time?

There is no impact. After restarting, the rewind is executed again because the rewind_timestamp record is still there.
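The restart behaviour can be illustrated with a small sketch. It assumes the chain's record is removed only after its rewind succeeds, which is not stated explicitly in this issue; `resumeRewindOnStartup` and `rewindTo` are hypothetical names, and the `Map` stands in for the _global table.

```typescript
// Runs at process start. If a rewind record for this chain is still present,
// the rewind is executed (again) before normal indexing begins.
// Returns true if a rewind was performed.
function resumeRewindOnStartup(
  globalRows: Map<string, number>,
  chain: string,
  rewindTo: (timestamp: number) => void,
): boolean {
  const key = `rewind_timestamp_${chain}`;
  const timestamp = globalRows.get(key);
  if (timestamp === undefined) {
    return false; // nothing pending
  }
  rewindTo(timestamp); // if this throws or the process dies, the record survives
  globalRows.delete(key); // assumption: cleared only after a successful rewind
  return true;
}
```

Because the record outlives a crash, an interrupted rewind is simply retried on the next start, so the rewind itself should be safe to run more than once.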

What happens if multiple nodes create an advisory lock simultaneously?

The key field in the _global table is the primary key and therefore unique, which ensures that only one node can write successfully.
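A rough model of that behaviour, with an in-memory class standing in for the _global table. `GlobalTable`, `insertIfAbsent` and `requestRewind` are hypothetical names; in the real table the conflict would be raised by the primary-key constraint rather than checked in application code.

```typescript
// In-memory stand-in for _global: the Map key plays the role of the primary key.
class GlobalTable {
  private rows = new Map<string, number>();

  // Mimics an INSERT that fails on a primary-key conflict.
  insertIfAbsent(key: string, value: number): boolean {
    if (this.rows.has(key)) {
      return false;
    }
    this.rows.set(key, value);
    return true;
  }

  get(key: string): number | undefined {
    return this.rows.get(key);
  }
}

// Tries to start a rewind. Only the caller that manages to insert
// rewind_lock goes on to write the per-chain rewind_timestamp records.
function requestRewind(table: GlobalTable, chains: string[], timestamp: number): boolean {
  if (!table.insertIfAbsent("rewind_lock", timestamp)) {
    return false; // another node already holds the lock
  }
  for (const chain of chains) {
    table.insertIfAbsent(`rewind_timestamp_${chain}`, timestamp);
  }
  return true;
}
```

If two nodes call `requestRewind` for the same project, the second call returns `false` and the stored timestamps remain those of the first caller.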

yoozo mentioned this issue Dec 12, 2024

[Draft] multi chain rewind #2627