[indexer-alt] expose mapping from cp to tx or epoch interval in indexer-alt-framework for pruners if needed (#20605)

## Description

With the `cp_sequence_numbers` table, pruner tasks that need information
beyond the pruner checkpoint range can look up the corresponding
transaction or epoch interval in that table. A pruner task can now call
`tx_interval` or `epoch_interval`; each returns an `anyhow` error if the
mapping for either bound of the checkpoint range cannot be retrieved.
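
To make the interval semantics concrete, here is a minimal in-memory sketch of what `tx_interval` computes. It is not the code from this commit: the `HashMap` stands in for the `cp_sequence_numbers` table, `String` errors stand in for `anyhow`, and the names `tx_interval_in_memory` and `sample_tx_los` are hypothetical.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical in-memory analogue of `tx_interval`. `tx_los` maps a
/// checkpoint sequence number to its `tx_lo` (first transaction sequence
/// number); it stands in for the `cp_sequence_numbers` table.
fn tx_interval_in_memory(
    tx_los: &HashMap<u64, u64>,
    cps: Range<u64>,
) -> Result<Range<u64>, String> {
    // Same precondition as the real helper: the range must be non-empty.
    if cps.start >= cps.end {
        return Err(format!(
            "Invalid checkpoint range: `from` {} must be less than `to` {}",
            cps.start, cps.end
        ));
    }
    // Both bounds must have a row, otherwise the caller gets an error.
    let lookup = |cp: u64| {
        tx_los
            .get(&cp)
            .copied()
            .ok_or_else(|| format!("No checkpoint mapping found for checkpoint {cp}"))
    };
    let from = lookup(cps.start)?;
    let to = lookup(cps.end)?;
    // Half-open range: includes `from`, excludes `to`.
    Ok(from..to)
}

/// Sample data (made up for illustration): checkpoint -> tx_lo.
fn sample_tx_los() -> HashMap<u64, u64> {
    HashMap::from([(10, 100), (11, 105), (12, 112)])
}
```

With this sample data, the checkpoint range `10..12` maps to the transaction range `100..112`, while `10..13` fails because checkpoint 13 has no row.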

## Test plan 

How did you test the new or updated feature?

---

## Release notes

Check each box that your changes affect. If none of the boxes relate to
your changes, release notes aren't required.

For each box you select, include information after the relevant heading
that describes the impact of your changes that a user might notice and
any actions they must take to implement updates.

- [ ] Protocol: 
- [ ] Nodes (Validators and Full nodes): 
- [ ] Indexer: 
- [ ] JSON-RPC: 
- [ ] GraphQL: 
- [ ] CLI: 
- [ ] Rust SDK:
- [ ] REST API:
wlmyng authored Jan 8, 2025
1 parent 0910ada commit 09aea99
Showing 5 changed files with 98 additions and 10 deletions.
@@ -3,23 +3,14 @@

use std::sync::Arc;

use crate::models::cp_sequence_numbers::StoredCpSequenceNumbers;
use crate::pipeline::{concurrent::Handler, Processor};
use crate::schema::cp_sequence_numbers;
use anyhow::Result;
use diesel::prelude::*;
use diesel_async::RunQueryDsl;
use sui_field_count::FieldCount;
use sui_pg_db::{self as db};
use sui_types::full_checkpoint_content::CheckpointData;

#[derive(Insertable, Selectable, Queryable, Debug, Clone, FieldCount)]
#[diesel(table_name = cp_sequence_numbers)]
pub struct StoredCpSequenceNumbers {
    pub cp_sequence_number: i64,
    pub tx_lo: i64,
    pub epoch: i64,
}

pub struct CpSequenceNumbers;

impl Processor for CpSequenceNumbers {
1 change: 1 addition & 0 deletions crates/sui-indexer-alt-framework/src/lib.rs
@@ -27,6 +27,7 @@ use watermarks::{CommitterWatermark, PrunerWatermark};
pub mod handlers;
pub mod ingestion;
pub(crate) mod metrics;
pub mod models;
pub mod pipeline;
pub(crate) mod schema;
pub mod task;
86 changes: 86 additions & 0 deletions crates/sui-indexer-alt-framework/src/models/cp_sequence_numbers.rs
@@ -0,0 +1,86 @@
// Copyright (c) Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

use crate::schema::cp_sequence_numbers;
use anyhow::{bail, Result};
use diesel::prelude::*;
use diesel_async::RunQueryDsl;
use std::ops::Range;
use sui_field_count::FieldCount;
use sui_pg_db::Connection;

#[derive(Insertable, Selectable, Queryable, Debug, Clone, FieldCount)]
#[diesel(table_name = cp_sequence_numbers)]
pub struct StoredCpSequenceNumbers {
    pub cp_sequence_number: i64,
    pub tx_lo: i64,
    pub epoch: i64,
}

/// Inclusive start and exclusive end range of prunable txs.
pub async fn tx_interval(conn: &mut Connection<'_>, cps: Range<u64>) -> Result<Range<u64>> {
    let result = get_range(conn, cps).await?;

    Ok(Range {
        start: result.0.tx_lo as u64,
        end: result.1.tx_lo as u64,
    })
}

/// Returns the epochs of the given checkpoint range. `start` is the epoch of checkpoint
/// `cps.start` and `end` is the epoch of checkpoint `cps.end`.
pub async fn epoch_interval(conn: &mut Connection<'_>, cps: Range<u64>) -> Result<Range<u64>> {
    let result = get_range(conn, cps).await?;

    Ok(Range {
        start: result.0.epoch as u64,
        end: result.1.epoch as u64,
    })
}

/// Gets the tx and epoch mappings for the given checkpoint range.
///
/// The values are expected to exist since the `cp_sequence_numbers` table must have enough
/// information to encompass the retention of other tables.
pub(crate) async fn get_range(
    conn: &mut Connection<'_>,
    cps: Range<u64>,
) -> Result<(StoredCpSequenceNumbers, StoredCpSequenceNumbers)> {
    let Range {
        start: from_cp,
        end: to_cp,
    } = cps;

    if from_cp >= to_cp {
        bail!(format!(
            "Invalid checkpoint range: `from` {from_cp} must be less than `to` {to_cp}"
        ));
    }

    let results = cp_sequence_numbers::table
        .select(StoredCpSequenceNumbers::as_select())
        .filter(cp_sequence_numbers::cp_sequence_number.eq_any([from_cp as i64, to_cp as i64]))
        .order(cp_sequence_numbers::cp_sequence_number.asc())
        .load::<StoredCpSequenceNumbers>(conn)
        .await
        .map_err(anyhow::Error::from)?;

    let Some(from) = results
        .iter()
        .find(|cp| cp.cp_sequence_number == from_cp as i64)
    else {
        bail!(format!(
            "No checkpoint mapping found for checkpoint {from_cp}"
        ));
    };
    let Some(to) = results
        .iter()
        .find(|cp| cp.cp_sequence_number == to_cp as i64)
    else {
        bail!(format!(
            "No checkpoint mapping found for checkpoint {to_cp}"
        ));
    };

    Ok((from.clone(), to.clone()))
}
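
One consequence of building the epoch range directly from the two looked-up rows is that the result can be an empty `Range` when both checkpoints fall in the same epoch. The sketch below illustrates this with an in-memory map in place of the database table. The names `epoch_interval_in_memory` and `sample_epochs` are hypothetical, and `Option` replaces the `anyhow` error for brevity.

```rust
use std::collections::BTreeMap;
use std::ops::Range;

/// Hypothetical in-memory analogue of `epoch_interval`. `epochs` maps a
/// checkpoint sequence number to the epoch it belongs to.
fn epoch_interval_in_memory(
    epochs: &BTreeMap<u64, u64>,
    cps: Range<u64>,
) -> Option<Range<u64>> {
    // Mirror the precondition in `get_range`: reject empty or reversed ranges.
    if cps.start >= cps.end {
        return None;
    }
    // Both bounds must be present in the mapping.
    let start = *epochs.get(&cps.start)?;
    let end = *epochs.get(&cps.end)?;
    Some(start..end)
}

/// Sample data (made up for illustration): checkpoints 10 and 11 are in
/// epoch 0, checkpoint 12 is the first checkpoint of epoch 1.
fn sample_epochs() -> BTreeMap<u64, u64> {
    BTreeMap::from([(10, 0), (11, 0), (12, 1)])
}
```

Here `10..12` yields the epoch range `0..1`, whereas `10..11` yields `0..0`, which is empty. A caller that prunes by epoch may want to check `Range::is_empty` before issuing a delete.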
4 changes: 4 additions & 0 deletions crates/sui-indexer-alt-framework/src/models/mod.rs
@@ -0,0 +1,4 @@
// Copyright (c) Mysten Labs, Inc.
// SPDX-License-Identifier: Apache-2.0

pub mod cp_sequence_numbers;
6 changes: 6 additions & 0 deletions crates/sui-indexer-alt/src/README.md
@@ -6,3 +6,9 @@ The required flags are --remote-store-url (or --local-ingestion-path) and the --
```
cargo run --bin sui-indexer-alt -- --database-url {url} indexer --remote-store-url https://checkpoints.mainnet.sui.io --skip-watermark --first-checkpoint 68918060 --last-checkpoint 68919060 --config indexer_alt_config.toml
```

## Pruning
To enable pruning, the `cp_sequence_numbers` pipeline must be enabled. Otherwise, even if pruning
logic is configured for a table, the pruner task will skip pruning whenever it cannot find a mapping
for the checkpoint pruning watermark. Only one committer needs to update this table; it is not
necessary for every indexer instance to have this pipeline enabled.
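
The skip behavior described above can be pictured with a small sketch. This is an illustration only, not the framework's pruner: `PruneOutcome`, `prune_once`, and `sample_mapping` are hypothetical names, and a `HashMap` from checkpoint to `tx_lo` stands in for the `cp_sequence_numbers` table.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical result of one pruning attempt.
#[derive(Debug, PartialEq)]
enum PruneOutcome {
    /// The mapping was found; these transaction sequence numbers can be pruned.
    Pruned { txs: Range<u64> },
    /// The mapping was missing, so nothing was pruned this round.
    Skipped { reason: String },
}

/// Attempts one pruning step over the checkpoint range `cps`. If either bound
/// has no entry in `mapping` (for example because the `cp_sequence_numbers`
/// pipeline is not enabled anywhere), the step is skipped rather than failing.
fn prune_once(mapping: &HashMap<u64, u64>, cps: Range<u64>) -> PruneOutcome {
    let (Some(&tx_lo), Some(&tx_hi)) = (mapping.get(&cps.start), mapping.get(&cps.end)) else {
        return PruneOutcome::Skipped {
            reason: format!("no mapping for checkpoint range {}..{}", cps.start, cps.end),
        };
    };
    PruneOutcome::Pruned { txs: tx_lo..tx_hi }
}

/// Sample data (made up for illustration): checkpoint -> tx_lo.
fn sample_mapping() -> HashMap<u64, u64> {
    HashMap::from([(10, 100), (12, 112)])
}
```

With an empty mapping every attempt is `Skipped`, which is why a table can have pruning configured and still never shrink.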
