Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(zebra-scan): Immediately scan newly verified blocks #7905

Closed
3 tasks
Tracked by #7728
oxarbitrage opened this issue Nov 3, 2023 · 11 comments
Closed
3 tasks
Tracked by #7728

feat(zebra-scan): Immediately scan newly verified blocks #7905

oxarbitrage opened this issue Nov 3, 2023 · 11 comments
Labels
A-blockchain-scanner Area: Blockchain scanner of shielded transactions C-feature Category: New features S-needs-triage Status: A bug report needs triage

Comments

@oxarbitrage
Copy link
Contributor

oxarbitrage commented Nov 3, 2023

Motivation

This ticket is a performance improvement to the scanner task in zebra-scan. The scanner task's primary responsibility is to receive a block and scan it for keys stored in the database.

In #7953 we wait 10 seconds for the next block. If we await a change notification, we can scan that block as soon as it arrives.

Possible Design

To complete this task, we could:

  • wait for ChainTipChange.tip_block_change(), ask the state for that block by hash, and scan it
  • multiple blocks can arrive at the same time, so we also need to scan any blocks between the tip and the last scanned block
  • test this using a fake chain tip

Optional / Out-of-Scope

Pseudo-code brainstorm:
let (new_viewer_tx, new_viewer_rx) = chan();

connection task: {
    while let Some((addr, conn)) = listen().await {
        // TODO: await a message for the query range and viewing key
        //       with a timeout and in another task
        let queries: Vec<(SaplingIvk, RangeBounds<Height>)>;
        _ = new_viewer_tx.send((Client::new(addr, conn), queries));
    }
}

let (new_tail_viewers_tx, new_tail_viewers_rx) = chan();

scan task manager: {
    let mut initial_scans: FuturesUnordered<Result<()>>;

    // TODO: Avoid duplicate scanning of block ranges that are already scheduled
    //       to be scanned for the query of another viewer.
    while let Some((client, queries)) = new_viewer_rx.recv() {
        // min start bound and max end bound of all new viewer query ranges
        let query_range = query_range_max_end_bound(queries.map(|(_, range)| range), tip_height());

        // TODO: create a `blocks` fn that returns an iterator over heights in the query range
        initial_scans.push(scan_range(new_viewers, get_blocks(query_range)));

        // send any viewers that are querying past the tip height to the scan tail task
        let contains_tail_height = |(_, (_, query_range))| query_range.contains(tip_height().next());
        let new_tail_viewers = new_viewers.filter(contains_tail_height);
        let _ = new_tail_viewers_tx.send(new_tail_viewers);
    }
}

scan tail task: {
    let mut viewers: HashMap<Client, Vec<(SaplingIvk, RangeBounds<Height>)>>;
    let mut scans: FuturesUnordered<Result<()>>;

    while let Some(tip_action) = chain_tip_change.wait_for_tip_change().await {
        if let Some(new_viewers) = new_tail_viewers_rx.recv() {
            viewers.extend(new_viewers);
        }

        viewers.retain(|(_, (_, query_range))| query_range.contains(block.height));

        let blocks = if tip_action.is_reset() {
            // TODO: scan all blocks after common ancestor block if there's a reset
            get_blocks(finalized_tip_height().next()..)
        } else {
            // TODO: get blocks from the tip action in this task
            get_blocks(tip_action.best_tip_height())
        };

        scans.push(scan_block(viewers, blocks));
    }
}
@mpguerra mpguerra added this to Zebra Nov 3, 2023
@github-project-automation github-project-automation bot moved this to 🆕 New in Zebra Nov 3, 2023
@oxarbitrage oxarbitrage added the A-blockchain-scanner Area: Blockchain scanner of shielded transactions label Nov 3, 2023
@mpguerra mpguerra added P-Medium ⚡ C-feature Category: New features S-needs-triage Status: A bug report needs triage labels Nov 6, 2023
@mpguerra
Copy link
Contributor

mpguerra commented Nov 7, 2023

Hey team! Please add your planning poker estimate with Zenhub @arya2 @oxarbitrage @teor2345 @upbqdn

@oxarbitrage
Copy link
Contributor Author

This comment might be useful here:

#7904 (comment)

@teor2345
Copy link
Contributor

teor2345 commented Nov 7, 2023

I have some suggestions for the scope of this ticket:

Block Range Scanning & Scanning Strategy

Scanning arbitrary supplied block ranges should be split into another ticket, because it depends on the scanning strategy and scanner design. I think it would be helpful to write up a draft scanning strategy with the team and with ECC before we design and implement it.

For now let's just scan new blocks in this ticket, then drop the results?
(The result channel is out of scope because channels are part of #7906.)

That moves this part of the pseudocode to another ticket:

let (new_viewer_tx, new_viewer_rx) = chan();

connection task: {
    while let Some((addr, conn)) = listen().await {
        // TODO: await a message for the query range and viewing key
        //       with a timeout and in another task
        let queries: Vec<(SaplingIvk, RangeBounds<Height>)>;
        _ = new_viewer_tx.send((Client::new(addr, conn), queries));
    }
}

let (new_tail_viewers_tx, new_tail_viewers_rx) = chan();

scan task manager: {
    let mut initial_scans: FuturesUnordered<Result<()>>;

    // TODO: Avoid duplicate scanning of block ranges that are already scheduled
    //       to be scanned for the query of another viewer.
    while let Some((client, queries)) = new_viewer_rx.recv() {
        // min start bound and max end bound of all new viewer query ranges
        let query_range = query_range_max_end_bound(queries.map(|(_, range)| range), tip_height());

        // TODO: create a `blocks` fn that returns an iterator over heights in the query range
        initial_scans.push(scan_range(new_viewers, get_blocks(query_range)));

        // send any viewers that are querying past the tip height to the scan tail task
        let contains_tail_height = |(_, (_, query_range))| query_range.contains(tip_height().next());
        let new_tail_viewers = new_viewers.filter(contains_tail_height);
        let _ = new_tail_viewers_tx.send(new_tail_viewers);
    }
}

Block Retrieval

This pseudocode can be modified to split out the scanning strategy, and multiple receivers via a broadcast channel (#7906). Running this code in its own process is also out of scope for this project.

scan task: {
    while let Some(tip_action) = chain_tip_change.wait_for_tip_change().await {
        let blocks = tip_action.get_new_blocks();

        scan_blocks(new_blocks);
    }
}

This code is part of one alternative design for ticket #7906, so we don't need to implement it here:

        if let Some(new_viewers) = new_tail_viewers_rx.recv() {
            viewers.extend(new_viewers);
        }

@arya2 did you want to open a ticket to cover the changes we discussed to ChainTipChange, or did you want me to?

get_new_blocks: {
        if tip_action.is_reset() {
            // scan all blocks from the chain fork if there's a reset
            tip_action.get_forked_blocks()
        } else {
            tip_action.get_block()
        };
}

@teor2345 teor2345 changed the title feat(zebra-scan): Create the scanner task feat(zebra-scan): Create a scanner task for new blocks Nov 10, 2023
@mpguerra
Copy link
Contributor

mpguerra commented Nov 16, 2023

Do we need to create another issue for scanning the existing blockchain, i.e finalised blocks only? I feel like that should come before this

@teor2345
Copy link
Contributor

Do we need to create another issue for scanning the existing blockchain, i.e finalised blocks only? I feel like that should come before this

Yes, there should be another ticket, I just made #7953. It's complicated to do both things in the same task, so they should be separate tickets, PRs, and tasks.

Scanning new blocks is about the same complexity as scanning existing blocks:

  • new blocks: wait for ChainTipChange.tip_block_change(), ask the state for that block by hash, then scan it.
  • existing blocks: count from height 0 to LatestChainTip.best_tip_height(), ask the state for each block, then scan it. This is easier to test with a pre-populated state.

But one of them should block the other, so we don't duplicate the scanning function work.

@teor2345
Copy link
Contributor

@upbqdn for the MVP I suggest we do the simplest thing that works. Here's a simpler approach, feel free to cut it down:

  • wait for ChainTipChange.tip_block_change(), ask the state for that block by hash, then scan it
  • test this using a fake chain tip

@teor2345
Copy link
Contributor

Do we need to create another issue for scanning the existing blockchain, i.e finalised blocks only?

Just a clarification here: outside the state there is no distinction between finalised blocks (the single chain of blocks on disk) and non-finalised blocks (blocks in memory in chain forks). So for the MVP I suggest we don't care either.

The difference outside the state is between blocks that have already verified, and blocks that haven't been verified or downloaded yet. They need very different code, because one waits, and the other doesn't.

@mpguerra
Copy link
Contributor

This has been estimated as a "Large" or 8 points... We should consider splitting this issue up into smaller parts

@arya2
Copy link
Contributor

arya2 commented Nov 18, 2023

This has been estimated as a "Large" or 8 points... We should consider splitting this issue up into smaller parts

The scope's been significantly reduced since we estimated this issue, but the changes may be blocked by #7908, and closing this issue is blocked by #7947.

@arya2 arya2 changed the title feat(zebra-scan): Create a scanner task for new blocks feat(zebra-scan): Scan newly verified blocks Nov 20, 2023
@mpguerra
Copy link
Contributor

This does not need to be in the MVP, un-scheduling

@teor2345 teor2345 changed the title feat(zebra-scan): Scan newly verified blocks feat(zebra-scan): Immediately scan newly verified blocks Nov 28, 2023
@mpguerra
Copy link
Contributor

We won't be doing any more work on the scanner

@mpguerra mpguerra closed this as not planned Won't fix, can't repro, duplicate, stale Oct 18, 2024
@github-project-automation github-project-automation bot moved this from New to Done in Zebra Oct 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-blockchain-scanner Area: Blockchain scanner of shielded transactions C-feature Category: New features S-needs-triage Status: A bug report needs triage
Projects
Status: Done
Development

No branches or pull requests

5 participants