fix: block.timestamp is not accurate #3398

Open · wants to merge 5 commits into base: main
Changes from 1 commit
41 changes: 25 additions & 16 deletions core/node/state_keeper/src/keeper.rs
@@ -72,6 +72,7 @@ pub struct ZkSyncStateKeeper {
sealer: Arc<dyn ConditionalSealer>,
storage_factory: Arc<dyn ReadStorageFactory>,
health_updater: HealthUpdater,
should_create_l2_block: bool,
}

impl ZkSyncStateKeeper {
@@ -89,6 +90,7 @@ impl ZkSyncStateKeeper {
sealer,
storage_factory,
health_updater: ReactiveHealthCheck::new("state_keeper").1,
should_create_l2_block: false,
@thomas-nguy (Member, Author) commented on Dec 18, 2024:

Not sure if we should persist this in RocksDB to prevent issues at restart?

}
}

@@ -187,7 +189,10 @@ impl ZkSyncStateKeeper {

// Finish current batch.
if !updates_manager.l2_block.executed_transactions.is_empty() {
self.seal_l2_block(&updates_manager).await?;
if !self.should_create_l2_block {
// The L2 block has already been sealed.
self.seal_l2_block(&updates_manager).await?;
}
A Contributor commented on lines +192 to +195:

This place is confusing with the proposed changes. The check above checks whether the latest block contains any transactions. AFAIU, if should_create_l2_block is true, then the check doesn't concern the latest block, but rather the previous one; the latest block isn't really started yet. So, a fictive block must be started in any case. IIUC, the current approach technically works because the previous block exists and is non-empty, but it looks hacky.
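To make the concern concrete, here is a minimal, self-contained sketch of a batch-closing path where the previous block is sealed only if it is still open and the fictive block is started unconditionally. `Keeper` and its methods are invented stand-ins for illustration only, not the real state keeper API:

```rust
// Hypothetical model: the batch-closing path never relies on the previous
// block being non-empty; it only checks whether that block is still open.
struct Keeper {
    open_block: bool,    // is the latest L2 block still unsealed?
    sealed_blocks: usize,
}

impl Keeper {
    fn seal_l2_block(&mut self) {
        assert!(self.open_block, "must not seal the same block twice");
        self.open_block = false;
        self.sealed_blocks += 1;
    }

    fn start_l2_block(&mut self) {
        self.open_block = true;
    }

    fn close_batch(&mut self) {
        // Seal the last "real" block only if it has not been sealed already.
        if self.open_block {
            self.seal_l2_block();
        }
        // The fictive block is started in any case.
        self.start_l2_block();
        self.seal_l2_block();
    }
}

fn main() {
    // The last block was already sealed on a timeout: closing the batch must
    // not seal it again, but must still create the fictive block.
    let mut keeper = Keeper { open_block: false, sealed_blocks: 1 };
    keeper.close_batch();
    assert_eq!(keeper.sealed_blocks, 2);
    println!("fictive block handled without double-sealing");
}
```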

@thomas-nguy (Member, Author):

See the comment below.

// We've sealed the L2 block that we had, but we still need to set up the timestamp
// for the fictive L2 block.
let new_l2_block_params = self
@@ -199,6 +204,7 @@ impl ZkSyncStateKeeper {
&mut *batch_executor,
)
.await?;
self.should_create_l2_block = false;
}

let (finished_batch, _) = batch_executor.finish_batch().await?;
@@ -585,14 +591,30 @@ impl ZkSyncStateKeeper {
return Ok(());
}

if self.io.should_seal_l2_block(updates_manager) {
if !self.should_create_l2_block && self.io.should_seal_l2_block(updates_manager) {
tracing::debug!(
"L2 block #{} (L1 batch #{}) should be sealed as per sealing rules",
updates_manager.l2_block.number,
updates_manager.l1_batch.number
);
self.seal_l2_block(updates_manager).await?;
self.should_create_l2_block = true;
A Contributor commented:
I don't quite understand the purpose of this variable. AFAIU, the logic here conceptually should change as follows:

  • After sealing the block, do not start the next block immediately; instead, set a local var whether to start it.
  • Wait for the next transaction.
  • After receiving a transaction, if the flag is set, start a new block and unset the flag.

The logic here almost follows this flow, but the flag is non-local, which complicates reasoning.
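
As a rough illustration of this proposed flow, here is a self-contained sketch with the flag kept local to the transaction loop. `StateKeeperSketch` and all of its methods are hypothetical stand-ins for the real keeper API (`seal_l2_block`, `wait_for_next_tx`, `start_next_l2_block`, ...); only the control flow is meant to match the description above:

```rust
// Minimal model of "seal, then wait for a tx, then start the next block".
struct StateKeeperSketch {
    pending_txs: Vec<&'static str>,
    sealed_blocks: usize,
    open_block_txs: usize,
}

impl StateKeeperSketch {
    fn should_seal_l2_block(&self) -> bool {
        // Stand-in for the real sealing criteria (tx count, timeouts, ...).
        self.open_block_txs >= 2
    }

    fn seal_l2_block(&mut self) {
        self.sealed_blocks += 1;
        self.open_block_txs = 0;
    }

    fn start_next_l2_block(&mut self) {
        // In the real keeper this is the point where the new block's
        // timestamp would be chosen, i.e. only once a transaction for it
        // has actually arrived.
        self.open_block_txs = 0;
    }

    fn wait_for_next_tx(&mut self) -> Option<&'static str> {
        self.pending_txs.pop()
    }

    fn process_one_tx(&mut self, _tx: &str) {
        self.open_block_txs += 1;
    }
}

fn main() {
    let mut keeper = StateKeeperSketch {
        pending_txs: vec!["tx5", "tx4", "tx3", "tx2", "tx1"],
        sealed_blocks: 0,
        open_block_txs: 0,
    };

    // Local flag, as proposed in the review comment above.
    let mut start_new_block = false;

    loop {
        if !start_new_block && keeper.should_seal_l2_block() {
            keeper.seal_l2_block();
            start_new_block = true;
        }

        // Wait for a transaction *before* opening the next block.
        let Some(tx) = keeper.wait_for_next_tx() else {
            break; // the real keeper keeps polling and checks stop signals
        };

        if start_new_block {
            keeper.start_next_l2_block();
            start_new_block = false;
        }

        keeper.process_one_tx(tx);
    }

    println!("sealed {} blocks", keeper.sealed_blocks);
}
```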

@thomas-nguy (Member, Author) commented on Jan 10, 2025:
You are right, this flag could totally be local!

The reason I have it global is that I need to know the state of the last fictive block when closing the batch in the parent loop (parent function).

  • Case 1: The last block has not been sealed. This is the original behavior before this PR's change, because we always create a new unsealed block right after sealing one, whether we receive a new transaction or not. This is why we "always" seal the last block in the parent loop before closing the batch.

  • Case 2: The last block has been sealed, but no transaction has been received for some period of time and we are ultimately "forced" to close the batch. In that case we are in a weird state where the last block has been sealed but no new block has been started, so we should not seal the last block again in the parent loop.

It does seem a bit hacky, but it was the best way I found to avoid introducing too much change in this PR.

Perhaps I can completely remove the sealing logic from the parent loop so that we won't need a global flag and can turn this into a local one? That would be much easier to understand, and yes, the flow you are describing is exactly what this PR is trying to do.
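
For reference, a minimal sketch of the two cases above with the flag-based guard, roughly as in the current diff. `BatchCloser` and its fields are hypothetical; only the decision in `close_batch` mirrors the logic described here:

```rust
// Hypothetical model of the parent-loop guard around batch closing.
struct BatchCloser {
    last_block_sealed: bool,      // Case 2 when true, Case 1 when false
    should_create_l2_block: bool, // mirrors the flag added by this PR
    sealed_blocks: usize,
}

impl BatchCloser {
    fn close_batch(&mut self) {
        if !self.should_create_l2_block {
            // Case 1: the last block is still open; seal it as before the PR.
            assert!(!self.last_block_sealed);
            self.sealed_blocks += 1;
        } else {
            // Case 2: the last block was sealed on a timeout and no new block
            // was started, so it must not be sealed a second time.
            assert!(self.last_block_sealed);
        }
        // The fictive block is created and sealed in both cases.
        self.sealed_blocks += 1;
        self.should_create_l2_block = false;
    }
}

fn main() {
    for (sealed, flag) in [(false, false), (true, true)] {
        let mut closer = BatchCloser {
            last_block_sealed: sealed,
            should_create_l2_block: flag,
            sealed_blocks: 0,
        };
        closer.close_batch();
        println!("case with flag={flag}: sealed {} blocks here", closer.sealed_blocks);
    }
}
```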

}
let waiting_latency = KEEPER_METRICS.waiting_for_tx.start();
let Some(tx) = self
.io
.wait_for_next_tx(POLL_WAIT_DURATION, updates_manager.l2_block.timestamp)
.instrument(info_span!("wait_for_next_tx"))
.await
.context("error waiting for next transaction")?
else {
waiting_latency.observe();
continue;
};
waiting_latency.observe();
let tx_hash = tx.hash();

if self.should_create_l2_block {
let new_l2_block_params = self
.wait_for_new_l2_block_params(updates_manager, stop_receiver)
.await
Expand All @@ -605,22 +627,9 @@ impl ZkSyncStateKeeper {
);
Self::start_next_l2_block(new_l2_block_params, updates_manager, batch_executor)
.await?;
self.should_create_l2_block = false;
}
let waiting_latency = KEEPER_METRICS.waiting_for_tx.start();
let Some(tx) = self
.io
.wait_for_next_tx(POLL_WAIT_DURATION, updates_manager.l2_block.timestamp)
.instrument(info_span!("wait_for_next_tx"))
.await
.context("error waiting for next transaction")?
else {
waiting_latency.observe();
tracing::trace!("No new transactions. Waiting!");
continue;
};
waiting_latency.observe();

let tx_hash = tx.hash();
let (seal_resolution, exec_result) = self
.process_one_tx(batch_executor, updates_manager, tx.clone())
.await?;