feat(sync): Sync refactor #495

Open
wants to merge 33 commits into main

Conversation

@cchudant (Member) commented Feb 14, 2025

Sync is the service that is responsible for importing blocks into the node. It is currently only run in full-node mode, as we do not need to sync any blocks when running in sequencer (block production) mode.

Motivation

Peer-to-peer

Peer-to-peer block sync is a feature that will be required for all starknet full nodes to implement. We are already very late to the party, as pathfinder, juno and papyrus have had p2p sync capabilities for more than 6 months now.
This is also of great interest to us at KasarLabs for our upcoming launch of Quaza, where we want to showcase peer-to-peer sync.

Our peer-to-peer implementation is still experimental, which is why we are not yet considering getting it merged upstream into Madara. Since it touches a big part of the core, we need to merge its architectural changes in a separate PR to keep the P2P feature merge-ready for the Madara Core codebase at any time. With that in mind, the Madara alliance has agreed on a flow for upstreaming stuff into the main branch: if there are any core modifications, we’ll try to merge them upstream. But this is a huge job because we want to minimize breakage, which means the new refactored code should work just like the old one and keep or even improve all the test coverage.

The details of that flow are still being debated, but this pull request could be the experiment that tells us how well that goes :)

As such, this pull request contains:

  • Removal of the mc-block-import crate and of mc-sync in its current form.
  • A new mc-sync crate.
  • Some changes around the database to allow storing partially downloaded blocks.
  • Light changes to the plumbing around block-production, plus new types in primitives to make the code cleaner.

Moreover, the other node teams (including us!) are currently also working on consensus (tendermint) integration, which has a lot of overlap with this work. In madara's case at least, we are integrating Informal Systems' Malachite implementation.

Reorg handling

This refactor is a very big step forward in terms of reorg handling, and has been designed for that from the start. The biggest win here is that the pipeline is now centrally controllable from a single point. This means that reorg-handling can be implemented on top of this pull request without madara becoming a big spaghetti mess. More on that in the Details section of this pull request description.

Performance

The current sync service has okay performance, but it can be improved a lot. One of the biggest slowdowns during sync is that class compilation stalls the whole pipeline. Class definitions don't happen at every block, but when they do, the pipeline has to wait for compilation to finish, and the compilation of a single class cannot be parallelized. This results in big spikes in block import time and a lot of CPU under-utilization.

This new sync implementation does a much better job of amortizing the cost of class compilation over many more blocks: it splits class sync into a separate pipeline, which is completely free to lag behind the other sync pipelines (up to a point).
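
To make "up to a point" concrete: the idea is simply a bounded distance between the class pipeline's tip and the blocks the other pipelines are allowed to process. Here is a tiny sketch of that check; the constant and the function name are made up for illustration and are not taken from this PR.

    // Illustrative sketch only: the bound and the names are hypothetical.
    const MAX_CLASS_LAG: u64 = 128;

    /// The other pipelines may run ahead of class compilation, but only by a
    /// bounded number of blocks, so compilation cost is amortized over many
    /// blocks instead of stalling the whole pipeline at every class declaration.
    fn other_pipelines_may_process(next_block_n: u64, classes_tip: u64) -> bool {
        next_block_n.saturating_sub(classes_tip) <= MAX_CLASS_LAG
    }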

Measured on my Apple M4 Pro: syncing to 60k blocks on Sepolia now takes 15m35s, i.e. ~64.17 blocks/s. That is an immense speedup. We are now limited by the rate-limiting of the starkware gateway (we have a gateway key, but we still hit some sort of limit, as syncing from a local gateway reaches much higher blocks/s).
We are also limited on some ranges of Sepolia blocks by the bonsai-trie global trie insertion, which I really want to optimize a lot more at some point. We are still not hitting 100% CPU on every core all of the time.

In detail

The current sync service works by having 3 tasks:
(diagram: the 3 tasks of the current sync service)

You can see this in action in mc-sync's l2.rs.

This simplistic approach has decent performance, but it is not suitable for peer-to-peer sync. P2p sync has a number of unique challenges:

  • Any peer can be lying to you. You need to check everything, maintain peer scores, and punish/disconnect the peers accordingly.
  • The p2p specs have a single-request multi-response design, where you send a request, and you get a stream of responses for a range of blocks.
  • The p2p request/stream response "endpoints" (let's call them that) are separated into: headers, transactions and receipts, state diffs, classes, and events.

Moreover, you cannot verify that, e.g., an event is correct unless you have already received the header for that block. Classes are different: they cannot be checked using the class commitment in the header, as there is no such field. Instead, you have to check them using the fields in the block's state diff, which in turn you can check using the state diff commitment in the header.

The dependency diagram for verification in the p2p specs looks like this:

(diagram: verification dependency graph in the p2p specs)

Note: backward sync is needed because, per the specs, the signature field is not guaranteed to be populated for old blocks that are not yet on L1. This has to do with the fact that the starkware feeder-gateway does not have a static pubkey.

Second note: when tendermint consensus is integrated, this signature verification step (forward sync) will check consensus signatures instead.

As you can see, our requirements for p2p sync are quite a bit more complex than the current madara sync, and they cannot fit into the existing structure. Thus, we decided to refactor all of it into nice abstractions that support both peer-to-peer sync as well as gateway sync.

In this refactor, a Pipeline refers to the combination of a parallel step (run over N blocks at a time) and a sequential step (run one block at a time).
Our peer-to-peer forward sync implementation (not present in this pull request) has a pipeline for each of the "p2p stream endpoints" mentioned earlier, plus an Apply State pipeline that only has a sequential step, which is where we update the global tries.

This PipelineController abstraction supports retries in the sequential step. While retries are not used by the Gateway sync implementation present in this pull request, they are very important for peer-to-peer, as there are cases (particularly in the headers pipeline) where we cannot know whether a peer has returned wrong data until the sequential step. In that case, we want to reschedule a parallel step for that block and start over with a new peer. This is the main reason this abstraction looks the way it does.
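
To make the shape of this abstraction concrete, here is a minimal sketch of a pipeline step interface with a retry signal coming out of the sequential step. The trait, enum and method names below are illustrative only and are not the actual PipelineController API in this PR.

    // Illustrative sketch only: names and signatures are hypothetical.
    use std::ops::Range;

    /// Outcome of the sequential (one-block-at-a-time) step.
    enum ApplyOutcome<Out> {
        /// The block was verified and applied; continue with the next block.
        Applied(Out),
        /// The data turned out to be invalid (e.g. a lying peer): reschedule the
        /// parallel step for this block and retry, typically with another peer.
        Retry,
    }

    trait PipelineSteps {
        type ParallelOutput: Send;
        type SequentialOutput: Send;

        /// Runs in parallel over a range of N blocks (fetching, hashing, ...).
        fn parallel_step(&self, block_range: Range<u64>) -> Self::ParallelOutput;

        /// Runs strictly in order, one block at a time (verification against the
        /// previous block, committing to the database, ...).
        fn sequential_step(
            &self,
            block_n: u64,
            input: Self::ParallelOutput,
        ) -> ApplyOutcome<Self::SequentialOutput>;
    }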

On top of that, we have a ForwardSync abstraction, which is responsible for following L1 and for running a "probe" - a small future that returns the highest known block on the network/gateway.

This is how ForwardSync is used by the new Gateway sync. As you can see, the abstractions fit together neatly, and it becomes easy to drive all of the pipelines. I believe this is a big deal, as it makes reorg support easy to add in the future: we have a single centralized place with direct control over all of the pipelines, and we can cleanly abort any fetch or task there at will.
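
As a rough illustration of what "driving all of the pipelines from a single place" means, here is a hedged sketch of such a loop. ForwardSync, its pipeline fields, run_until and probe_highest_block are all stand-ins invented for this example, not the actual types or methods in this PR.

    // Illustrative sketch only: stand-in types, not the PR's real ForwardSync.
    struct Pipeline;

    impl Pipeline {
        /// Drive this pipeline until it has imported blocks up to `_target`.
        async fn run_until(&mut self, _target: u64) -> anyhow::Result<()> {
            Ok(())
        }
    }

    struct ForwardSync {
        headers: Pipeline,
        transactions_receipts: Pipeline,
        state_diffs: Pipeline,
        classes: Pipeline,
        apply_state: Pipeline,
    }

    impl ForwardSync {
        /// The "probe": a small future returning the highest block known to the
        /// network/gateway, so the pipelines know how far they may fetch.
        async fn probe_highest_block(&self) -> anyhow::Result<u64> {
            Ok(0)
        }

        /// Everything is driven from this one place, which is what will make it
        /// easy to pause, abort, or (later on) reorg all the pipelines at once.
        async fn run(&mut self) -> anyhow::Result<()> {
            loop {
                let target = self.probe_highest_block().await?;
                tokio::try_join!(
                    self.headers.run_until(target),
                    self.transactions_receipts.run_until(target),
                    self.state_diffs.run_until(target),
                    self.classes.run_until(target),
                    self.apply_state.run_until(target),
                )?;
            }
        }
    }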

Other details include:

  • The backend now has a different sync tip (latest block_n counter) for every pipeline (see the sketch below).
  • The deletion of block-import implies that the global tries are updated elsewhere now. They are updated in mc-db - the code is cleaner now, as more of the logic is shared with block-production.
  • In addition, closing a block (going from a pending to a closed block) is now implemented in primitives. This makes block-production a lot cleaner as well. Moreover, we use PendingBlocks a lot more as a general construct for a "non-closed block".
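
To illustrate the first point, here is a minimal sketch of what per-pipeline sync tips can look like; the struct and field names are illustrative (the real head-status type in mc-db may differ), but the idea is that a block only counts as fully imported once every pipeline has reached it.

    // Illustrative sketch only: the real head-status type in mc-db may differ.
    #[derive(Default)]
    struct HeadStatus {
        headers: Option<u64>,
        transactions: Option<u64>,
        state_diffs: Option<u64>,
        classes: Option<u64>,
        global_trie: Option<u64>,
    }

    impl HeadStatus {
        /// The overall sync tip is the minimum of the per-pipeline tips.
        /// (`None` sorts below `Some(_)`, so a pipeline that has imported
        /// nothing yet makes the whole result `None`.)
        fn latest_full_block_n(&self) -> Option<u64> {
            [
                self.headers,
                self.transactions,
                self.state_diffs,
                self.classes,
                self.global_trie,
            ]
            .into_iter()
            .min()
            .flatten()
        }
    }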

Here is the implementation of ForwardSync in our p2p sync implementation (not in this PR), if you are curious.

TODO

A few things are still missing before this pull request can be merged, but it can already be reviewed.

  • More testing: re-add the feeder gateway mocking test.
  • Fix pre-v0.12.3 block hash verification.
  • Add the --n-blocks-to-sync -> --sync-stop-at breaking change in the CHANGELOG.md file

@Trantorian1 added the feature (Request for new feature or enhancement) label Feb 14, 2025
@Trantorian1 added the node (Related to the full node implementation) and research (Research and exploration required before implementation) labels Feb 14, 2025
@cchudant (Member Author) commented Feb 14, 2025

Breaking changes:

  • The DB schema is different.
  • I have removed the --n-blocks-to-sync <number of blocks> cli arg, because I believe it is better expressed as --sync-stop-at <block_n>, which is the objectively better version of that argument (see the sketch after this list). What do you think? I do not see any use-case for n-blocks-to-sync, but if anyone really wants it I can add it back and we can have both.
  • I have removed --no-sync-polling because it doesn't quite make sense in this new architecture.
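
As promised in the first bullet above, here is what the replacement flag could look like declared with clap. This is a hedged sketch assuming a clap-based CLI; the struct and its doc text are illustrative, not the actual Madara argument definitions.

    // Illustrative sketch only, assuming a clap-based CLI.
    use clap::Parser;

    #[derive(Parser)]
    struct SyncParams {
        /// Stop syncing once this block number has been imported (inclusive).
        /// Replaces the old `--n-blocks-to-sync <count>`: stopping at block 19
        /// means importing 20 blocks (0..=19), since block 0 is imported too.
        #[arg(long, value_name = "block_n")]
        sync_stop_at: Option<u64>,
    }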

@antiyro changed the title from "Sync refactor" to "feat(sync): Sync refactor" Feb 17, 2025
@cchudant marked this pull request as ready for review February 17, 2025 17:02
@Trantorian1 self-requested a review February 18, 2025 15:52
@@ -123,7 +110,8 @@ impl MadaraBackend {

     #[tracing::instrument(skip(self), fields(module = "BlockDB"))]
     pub fn get_latest_block_n(&self) -> Result<Option<u64>> {
-        get_latest_block_n(&self.db)
+        Ok(self.head_status().latest_full_block_n())
+        // get_latest_block_n(&self.db)
Collaborator

Suggested change: remove the leftover commented-out line
-        // get_latest_block_n(&self.db)

Comment on lines +10 to +13
pub enum RawDbBlockId {
    Pending,
    Number(u64),
}
Collaborator

Why not implement this as a wrapper or alias around DbBlockId?

Member Author

So that you can write both RawDbBlockId::Pending and DbBlockId::Pending.

Ok(Arc::new(backend))
}

pub async fn on_block(&self, block_n: u64) -> anyhow::Result<()> {
Collaborator

Maybe add a doc comment explaining what this function does in more detail? Seems pretty vague atm.

Comment on lines 73 to 80
impl fmt::Display for ClassType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Self::Sierra => write!(f, "Sierra"),
            Self::Legacy => write!(f, "Legacy"),
        }
    }
}
Collaborator

Why not just rely on Debug?

Felt::from_bytes_checked(&unpacked)
}

fn from_bytes_checked(b: &[u8; 32]) -> Result<Felt, MalformatedFelt> {
Collaborator

Suggested change: rename the function
-fn from_bytes_checked(b: &[u8; 32]) -> Result<Felt, MalformatedFelt> {
+fn from_bytes_be_checked(b: &[u8; 32]) -> Result<Felt, MalformatedFelt> {

Collaborator

We should add tests for the slice and bytes conversions.

Member Author

Ah, this is only going to be used for p2p, I'll remove it from this PR I think.
But I am pretty sure I had tests for that - if they're not in that PR, I don't know where they went :)

    transaction_hash,
    actual_fee,
    messages_sent,
    events,
    execution_resources,
    execution_result,
    message_hash: message_hash.unwrap(), // it's a safe unwrap because it would've panicked earlier if it was Err
    // This should not panic unless blockifier gives a garbage receipt.
    // TODO: we should have a soft error here just in case.
Collaborator

@Trantorian1 commented Feb 19, 2025

We should probably make this a soft error as part of this PR then no?

Member Author

let's not, or it's going to get painful when charpa updates blockifier.

@@ -22,7 +22,7 @@ impl Default for RayonPool {
 impl RayonPool {
     pub fn new() -> Self {
         let n_cores = thread::available_parallelism().expect("Getting the number of cores").get();
-        let max_tasks = n_cores * 2;
+        let max_tasks = n_cores * 128;
Collaborator

Why the sudden change to this constant? It feels like there should be more info around this magic number.

Member Author

@cchudant commented Feb 19, 2025

mostly just vibes 🤷
i do not know why I changed that, to be perfectly honest. I think I wanted to test the change and did not find any evidence that it's faster or slower, and forgot to remove it
the only important thing here is that it's bounded, i don't think the actual number of the bound matters that much

Collaborator

fair enough!

Collaborator

I mean, then again, with that high a bound, performance might not have changed simply because no more than 2 * cores tasks were ever being processed at once, so we were still under the previous limit anyway.

It feels like 128 * cores leaves more room for DoS though. Especially since this didn't result in any noticeable change in performance, maybe we should revert this change. Wdyt?

Member Author

true

Collaborator

@Trantorian1 left a comment

I apologize in advance, most of this review is about documentation X) (among a few clarifying questions and some nitpicks).

My worry is that, since this PR introduces some pretty large and sweeping architectural changes, it will be easy for new and existing contributors alike to get lost in the codebase unless they have done a full review of this PR themselves. This is a lot of context to require, and it feels like we can make this easier by documenting the new sync architecture as part of the code. IMO this PR's opening comment should be added to the codebase, or at least linked, as it is an excellent overview of these changes.

The code seems good to me. It is hard to review the logic at a very granular level due to the size of this PR, but the changes seem consistent with what was already present in the previous sync model.

I definitely feel like this makes tests all the more important, and there are some very important areas of the codebase which I think need more of those (though I think we are testing all the sync pipelines? Still feels like some db methods around verify and update in particular should be tested in isolation + maybe test more failing cases?). Please let me know what you think.

Overall, this is definitely an amazing improvement to the current sync pipeline and some very impressive work :)

"--n-blocks-to-sync",
"20",
"--sync-stop-at",
"19",
Collaborator

Does this mean we still sync block 20 with --sync-stop-at 19 or that the previous version would actually stop at block 19?

Member Author

The previous version stops at 19 as well, since there are 20 blocks in 0..=19 - we have to import block 0 too.

Collaborator

Okay, this is much clearer then

@Trantorian1 (Collaborator)

Well, for some reason github running at 1fps decided to split my review into two, strange 🤷‍♂️
