Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[execution window] blocks support a window, along with block storage and sync #14076

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

bchocho
Copy link
Contributor

@bchocho bchocho commented Jul 22, 2024

Description

The main purpose of this PR is to enhance block retrieval and pipeline execution by adding a window of blocks (OrderedBlockWindow) to the PipelinedBlock. It also refines the block retrieval mechanism by introducing target_epoch_and_round fields for better compatibility and logging improvements.

The recent changes in this PR are:

  1. consensus/consensus-types/src/block_retrieval.rs:

    • Added target_epoch_and_round field and corresponding methods to BlockRetrievalRequest.
      Updated BlockRetrievalResponse to use target_epoch_and_round instead of target_block_id.
  2. consensus/consensus-types/src/pipelined_block.rs:

    • Introduced OrderedBlockWindow struct.
    • Added block_window field to PipelinedBlock.
    • Updated serialization and deserialization methods to include block_window.
    • Added logging for block creation.

Test Plan

  • Includes Unit Tests for testing the behavior of the BlockStore and block_window during pruning and other operations.
  • Includes swarm smoke tests for execution window for OnChainConsensusConfig

Note: For testing, BlockTree exposes a prune_tree functionality which mimics some, but not all, of the behavior in BlockTree::commit_callback. To further illustrate the difference, I've provided a visual demonstrating the difference between the two in a happy path scenario. Excalidraw available here

commit_callback

Untitled-2024-02-01-1426

Type of Change

  • New feature
  • Bug fix
  • Breaking change
  • Performance improvement
  • Refactoring
  • Dependency update
  • Documentation update
  • Tests

Which Components or Systems Does This Change Impact?

  • Validator Node
  • Full Node (API, Indexer, etc.)
  • Move/Aptos Virtual Machine
  • Aptos Framework
  • Aptos CLI/SDK
  • Developer Infrastructure
  • Other (specify)

How Has This Been Tested?

Key Areas to Review

Checklist

  • I have read and followed the CONTRIBUTING doc
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I identified and added all stakeholders and component owners affected by this change as reviewers
  • I tested both happy and unhappy path of the functionality
  • I have made corresponding changes to the documentation

Copy link

trunk-io bot commented Jul 22, 2024

⏱️ 13m total CI duration on this PR

Job Cumulative Duration Recent Runs
forge-e2e-test / forge 13m 🟥

settingsfeedbackdocs ⋅ learn more about trunk.io

@bchocho bchocho force-pushed the brian/exec-window-base branch from 1aa6615 to f4d9e6a Compare July 22, 2024 22:15
@bchocho bchocho changed the title in progress [execution window] block storage and sync support a window Jul 23, 2024
@bchocho bchocho force-pushed the brian/exec-window-base branch from f4d9e6a to 900d4d3 Compare July 24, 2024 07:51
@bchocho bchocho force-pushed the brian/exec-window-base branch from 900d4d3 to 8cf4526 Compare August 2, 2024 17:30
@bchocho bchocho changed the title [execution window] block storage and sync support a window [execution window] blocks support a window, along with block storage and sync Aug 2, 2024
@bchocho bchocho force-pushed the brian/exec-window-base branch from 8cf4526 to 4301028 Compare August 2, 2024 21:58
@bchocho bchocho added the CICD:run-forge-e2e-perf Run the e2e perf forge only label Aug 2, 2024

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@bchocho bchocho force-pushed the brian/exec-window-base branch from 3bf2da1 to bd733dd Compare August 19, 2024 21:55

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

This comment has been minimized.

@bchocho bchocho force-pushed the brian/exec-window-base branch from a17b775 to be71569 Compare September 25, 2024 17:28

This comment has been minimized.

This comment has been minimized.

@bchocho bchocho force-pushed the brian/exec-window-base branch from be71569 to 7a3647b Compare October 15, 2024 18:54

This comment has been minimized.

This comment has been minimized.

@bchocho bchocho force-pushed the brian/exec-window-base branch from 7a3647b to c1891f5 Compare October 30, 2024 19:44
@hariria hariria force-pushed the brian/exec-window-base branch from 6c62b13 to d9fa122 Compare December 5, 2024 19:26

This comment has been minimized.

This comment has been minimized.

@hariria hariria marked this pull request as ready for review December 6, 2024 00:17
Comment on lines 257 to 284
BlockRetrievalRequest:
ENUM:
0:
V1:
NEWTYPE:
TYPENAME: BlockRetrievalRequestV1
1:
V2:
NEWTYPE:
TYPENAME: BlockRetrievalRequestV2
BlockRetrievalRequestV1:
STRUCT:
- block_id:
TYPENAME: HashValue
- num_blocks: U64
- target_block_id:
OPTION:
TYPENAME: HashValue
BlockRetrievalRequestV2:
STRUCT:
- block_id:
TYPENAME: HashValue
- num_blocks: U64
- target_epoch_and_round:
OPTION:
TUPLEARRAY:
CONTENT: U64
SIZE: 2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO Double check if this is OK

@hariria hariria force-pushed the brian/exec-window-base branch from d9fa122 to 5579de6 Compare December 6, 2024 22:14

This comment has been minimized.

This comment has been minimized.

Copy link
Contributor

@ibalajiarun ibalajiarun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a first pass. I will take another pass.

alg: ConsensusAlgorithmConfig,
vtxn: ValidatorTxnConfig,
// Execution pool block window
window_size: usize,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be part of OnchainExecutionConfig?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose there's a semantic argument for whether it should be in the OnChainConsensusConfig or the OnchainExecutionConfig. Ultimately the BlockStore needs to get access to the window_size, and the BlockStore is used in Consensus. So putting it in OnChainConsensusConfig makes sense.

Cc @bchocho

let root_block = blocks.remove(root_idx);
.position(|block| block.id() == latest_commit_id)
.ok_or_else(|| format_err!("unable to find root: {}", latest_commit_id))?;
let commit_block = blocks[latest_commit_idx].clone();
let root_quorum_cert = quorum_certs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let root_quorum_cert = quorum_certs
let commit_blk_quorum_cert = quorum_certs

Comment on lines +176 to +179
let mut id_to_blocks = HashMap::new();
blocks.iter().for_each(|block| {
id_to_blocks.insert(block.id(), block);
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let mut id_to_blocks = HashMap::new();
blocks.iter().for_each(|block| {
id_to_blocks.insert(block.id(), block);
});
let id_to_blocks: HashMap<_, _> = blocks.iter().map(|block| (block.id(), block)).collect();

root_commit_cert,
))

let window_start_round = blocks[latest_commit_idx]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This round may not necessarily exist right? Probably timed out. Also, some blocks between can be missing too. So, this may not be the actual start round?

Do we calculate the window based on consecutive round numbers? or do we calculate based on successful blocks?

@@ -576,22 +629,23 @@ impl BlockRetriever {
.boxed(),
)
}
let request = BlockRetrievalRequest::new_with_target_block_id(
let request = BlockRetrievalRequest::new_with_target_round(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot make this change until we have rolled out the new BlockRetievalRequestV2 handling logic in a release. This should go in a future release after that one.

bchocho and others added 7 commits January 30, 2025 10:16
…store and persistent_liveness_storage.

Also adds OrderedBlockWindow into PipelinedBlock.
Actual execution of the block will be added to a separate PR.
## Description
This PR includes a set of unit tests and test utilities for the execution window changes covered in [`brian/exec-window-base`](#14076)
…date consensus.yaml for target_block_and_round
@hariria hariria force-pushed the brian/exec-window-base branch from 5579de6 to 375c1d4 Compare January 31, 2025 00:18

This comment has been minimized.

This comment has been minimized.

@hariria hariria force-pushed the brian/exec-window-base branch from 375c1d4 to 5132e5c Compare January 31, 2025 21:00

This comment has been minimized.

This comment has been minimized.

@hariria hariria force-pushed the brian/exec-window-base branch from 5132e5c to aad1f51 Compare February 2, 2025 02:05

This comment has been minimized.

Copy link
Contributor

github-actions bot commented Feb 2, 2025

✅ Forge suite realistic_env_max_load success on aad1f510fef11d9221125286fbe3d2548c419ac5

two traffics test: inner traffic : committed: 14740.68 txn/s, latency: 2687.58 ms, (p50: 2700 ms, p70: 2700, p90: 3000 ms, p99: 3300 ms), latency samples: 5604780
two traffics test : committed: 100.00 txn/s, latency: 1451.29 ms, (p50: 1400 ms, p70: 1500, p90: 1600 ms, p99: 2100 ms), latency samples: 1780
Latency breakdown for phase 0: ["MempoolToBlockCreation: max: 1.427, avg: 1.396", "ConsensusProposalToOrdered: max: 0.292, avg: 0.290", "ConsensusOrderedToCommit: max: 0.432, avg: 0.413", "ConsensusProposalToCommit: max: 0.723, avg: 0.704"]
Max non-epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.85s no progress at version 28264 (avg 0.19s) [limit 15].
Max epoch-change gap was: 0 rounds at version 0 (avg 0.00) [limit 4], 0.65s no progress at version 2617958 (avg 0.65s) [limit 16].
Test Ok

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CICD:run-forge-e2e-perf Run the e2e perf forge only
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants