-
Notifications
You must be signed in to change notification settings - Fork 39
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simplified block storage #53
Conversation
pompon0
commented
Dec 19, 2023
•
edited
Loading
edited
- simplified the storage traits, by moving the inmem caching implementation to era-consensus
- extracted PayloadManager trait responsible for proposing/verifying payloads.
- clarified the expected persistence guarantees of the PersistentBlockStore trait
- simplified the block caching to maintain a continuous range of blocks
- simplified the sync_blocks implementation as a result of previous simplifications
- moved rocksdb storage implementation to tools crate, given that the executor binary is there anyway.
- Removed genesis block from config to simplify configuration
- Removed redundant header from FinalBlock.
- Made bft actor wait for past blocks to be stored before calling propose/verify to more accurately reflect the expected implementation of a stateful blockchain (and to simplify implementation of the PayloadManager trait)
node/libs/storage/src/block_store.rs
Outdated
/// Since cache is a continuous range of blocks, `queue_block()` | ||
/// synchronously waits until all intermediate blocks are added | ||
/// to cache AND cache is not full. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, these constraints make it difficult to reason about store usage correctness; e.g., it's moderately easy to deadlock it by queuing blocks out of order. A somewhat artificial example: in SyncBlocks
actor, if the number of concurrently downloaded blocks is more than cache capacity and the corresponding permits were (incorrectly) dropped too late, after queuing the received block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you clarify your example? As I understand it:
- N = download permits in SyncBlocks actor
- M = max cache size
- N > M
- first N blocks are being downloaded in parallel
- (what happens here?)
I can remove the block cache size by replacing queue_block with store_block (i.e. make fetching task wait until the block is stored persistently), which actually simplify the api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
|
||
/// In-memory block store. | ||
#[derive(Debug, Default)] | ||
pub struct BlockStore(Mutex<BTreeMap<validator::BlockNumber, validator::FinalBlock>>); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Ditto as for
crate::BlockStore
; using aVecDeque
could help clarifying an invariant that only continuous ranges of blocks can be stored. - Bikeshedding: Subjectively, having multiple structs named
BlockStore
in the same crate is slightly confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- done
- ack, that's why we don't use fully qualified imports for these. We would need to prefix the struct names with the module name, which would cause unnecessary stuttering
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
Co-authored-by: Bruno França <[email protected]>
node/libs/storage/src/block_store.rs
Outdated
/// Stored block with the lowest number. | ||
pub first: validator::CommitQC, | ||
/// Stored block with the highest number. | ||
pub last: validator::CommitQC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean using 2 terms to describe the same entity. It's a good rule to have a ubiquitous dictionary of terms and apply them uniformly (so that each concept has a single term); makes code easier to maintain.
node/libs/storage/src/block_store.rs
Outdated
struct Inner { | ||
queued: sync::watch::Sender<BlockStoreState>, | ||
persisted: BlockStoreState, | ||
queue: VecDeque<validator::FinalBlock>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Is queue capacity unconstrained now? Would we want to constraint it in the future, or was constrained capacity in the previous iteration a temporary decision anyway?
- Could you add a metric for the queue length (to mimic removed
GossipFetcherMetrics
in the core repo)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As per #53 (comment) I've moved the responsibility of keeping cache bounded to the user, so that both cache size limit and inflight request limit are enforced in the same place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added metrics
…g' into gprusak-tools-config
async fn state(&self, ctx: &ctx::Ctx) -> ctx::Result<BlockStoreState>; | ||
|
||
/// Gets a block by its number. | ||
/// Returns error if block is missing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify: Is this expected that the caller will know the state()
of the store beforehand, so that it won't call block()
for (probably) missing block numbers? In the general case, it could make sense to distinguish between a missing block and other errors (i.e., return ctx::Result<Option<FinalBlock>>
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, currently PersistentBlockStore is expected to never lose blocks and consensus code only requests blocks contained in the state().
* Simplified configs to not require genesis block. * Removed consensus fields from jsonrpc sync api, to prevent having gaps in miniblock quorum certificates. * Moved consensus fields to a separate table to make first/last entry lookup fast and reliable. * Migrated consensus integration to new era-consensus API: matter-labs/era-consensus#53 * Deduplicated code between EN consensus integration and main node consensus integration * Implemented support for cloning a db in tests
* Simplified configs to not require genesis block. * Removed consensus fields from jsonrpc sync api, to prevent having gaps in miniblock quorum certificates. * Moved consensus fields to a separate table to make first/last entry lookup fast and reliable. * Migrated consensus integration to new era-consensus API: matter-labs/era-consensus#53 * Deduplicated code between EN consensus integration and main node consensus integration * Implemented support for cloning a db in tests