Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc benchmark init: directly read DB based on schema #20378

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

gegaowp
Copy link
Contributor

@gegaowp gegaowp commented Nov 21, 2024

Description

main things in this pr

  • generate template queries based on tables in the DB, excluding __diesel_schema_migrations
  • enrich the template with values read from DB and execute queries in parallel
  • configs of concurrency and duration time
  • metrics collection and reports

Test plan

  • local run with a local DB populated with latest sui-indexer-alt-schema
cargo run --bin sui-rpc-benchmark -- direct \
    --db-url "postgres://postgres:postgres@localhost/gegao" \
    --concurrency 10 \
    --duration-secs 10

report with local DB


Total queries: 211772
Total errors: 0
Average latency: 0.46ms

Per-table statistics:
  obj_info                       queries: 33526    errors: 0        avg latency: 0.49ms
  ev_struct_inst                 queries: 31598    errors: 0        avg latency: 0.47ms
  tx_calls                       queries: 28752    errors: 0        avg latency: 0.45ms
  ev_emit_mod                    queries: 18321    errors: 0        avg latency: 0.44ms
  tx_affected_objects            queries: 11495    errors: 0        avg latency: 0.43ms
  tx_affected_addresses          queries: 11472    errors: 0        avg latency: 0.43ms
  sum_packages                   queries: 10759    errors: 0        avg latency: 0.55ms
  tx_kinds                       queries: 8140     errors: 0        avg latency: 0.43ms
  coin_balance_buckets           queries: 7542     errors: 0        avg latency: 0.45ms
  obj_versions                   queries: 7485     errors: 0        avg latency: 0.43ms
  kv_epoch_ends                  queries: 3750     errors: 0        avg latency: 0.45ms
  watermarks                     queries: 3742     errors: 0        avg latency: 0.44ms
  tx_digests                     queries: 3690     errors: 0        avg latency: 0.42ms
  tx_balance_changes             queries: 3662     errors: 0        avg latency: 0.42ms
  kv_checkpoints                 queries: 3619     errors: 0        avg latency: 0.43ms
  cp_sequence_numbers            queries: 3590     errors: 0        avg latency: 0.42ms
  kv_transactions                queries: 3463     errors: 0        avg latency: 0.43ms
  kv_protocol_configs            queries: 3451     errors: 0        avg latency: 0.44ms
  kv_epoch_starts                queries: 3448     errors: 0        avg latency: 0.71ms
  kv_genesis                     queries: 3424     errors: 0        avg latency: 0.42ms
  kv_objects                     queries: 3422     errors: 0        avg latency: 0.42ms
  kv_feature_flags               queries: 3421     errors: 0        avg latency: 0.42ms

Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.

For each box you select, include information after the relevant heading that describes the impact of your changes that a user might notice and any actions they must take to implement updates.

  • Protocol:
  • Nodes (Validators and Full nodes):
  • Indexer:
  • JSON-RPC:
  • GraphQL:
  • CLI:
  • Rust SDK:
  • REST API:

Copy link

vercel bot commented Nov 21, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
sui-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Feb 3, 2025 10:27pm
3 Skipped Deployments
Name Status Preview Comments Updated (UTC)
multisig-toolkit ⬜️ Ignored (Inspect) Visit Preview Feb 3, 2025 10:27pm
sui-kiosk ⬜️ Ignored (Inspect) Visit Preview Feb 3, 2025 10:27pm
sui-typescript-docs ⬜️ Ignored (Inspect) Visit Preview Feb 3, 2025 10:27pm

@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env November 21, 2024 23:00 — with GitHub Actions Inactive
@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env November 21, 2024 23:08 — with GitHub Actions Inactive
@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env November 21, 2024 23:10 — with GitHub Actions Inactive
@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env November 21, 2024 23:12 — with GitHub Actions Inactive
@gegaowp gegaowp changed the title rpc benchmark init: direct reading DB based on schema rpc benchmark init: directly read DB based on schema Nov 21, 2024
@lxfind
Copy link
Contributor

lxfind commented Nov 25, 2024

generate template queries based on schema, specifically table primary key and indexes

What is the rational behind these generated queries? Would they be representative to what we are interested in?

@gegaowp
Copy link
Contributor Author

gegaowp commented Nov 25, 2024

@lxfind the generated queries now are basically "all supported queries based on primary key and indices" and does not consider the representativeness, I plan make it configurable for example taking in a file of weights to make it representative as a followup.

crates/sui-rpc-benchmark/src/direct/benchmark_config.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/metrics.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/metrics.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/lib.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/query_generator.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/query_executor.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/query_generator.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/lib.rs Outdated Show resolved Hide resolved
@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env January 15, 2025 09:19 — with GitHub Actions Inactive
@gegaowp gegaowp force-pushed the rpc-benchmark-init branch from 489ed78 to b2c4fac Compare January 15, 2025 09:20
@gegaowp gegaowp temporarily deployed to sui-typescript-aws-kms-test-env January 15, 2025 09:20 — with GitHub Actions Inactive
@gegaowp gegaowp requested a review from amnn January 16, 2025 04:23
Comment on lines 9 to 10
/// Duration to run the benchmark in seconds
pub duration: Duration,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this per query, or for the entire benchmark? Commenting before looking through rest of code so might become clearer, but still good to revisit and clarify

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is essentially the time-out of the while benchmark run, changed the field name to be more explicit.

crates/sui-rpc-benchmark/src/direct/metrics.rs Outdated Show resolved Hide resolved
Comment on lines +34 to +42
#[derive(Clone, Default)]
pub struct MetricsCollector {
metrics: Arc<DashMap<String, QueryMetrics>>,
}

impl MetricsCollector {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we envision MetricsCollector as an orchestrator of various benchmark reports, I wonder if we can do something similar to indexer alt framework, where record_query simply passes on received data to each BenchmarkResult struct, and generate_report just finishes each struct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MetricsCollector accumulates the benchmark records in the metrics map and generates one report out of the map, so not exactly orchestration -- what's the suggested alternative way and what's the advantage ?

Comment on lines 31 to 59
let tables_query = r#"
SELECT tablename
FROM pg_tables
WHERE schemaname = 'public'
AND tablename != '__diesel_schema_migrations'
ORDER BY tablename;
"#;
let tables: Vec<String> = client
.query(tables_query, &[])
.await?
.iter()
.map(|row| row.get::<_, String>(0))
.collect();
info!(
"Found {} active tables in database: {:?}",
tables.len(),
tables
);

let pk_query = r#"
SELECT tc.table_name, kcu.column_name
FROM information_schema.table_constraints tc
JOIN information_schema.key_column_usage kcu
ON tc.constraint_name = kcu.constraint_name
WHERE tc.constraint_type = 'PRIMARY KEY'
AND tc.table_schema = 'public'
AND tc.table_name != '__diesel_schema_migrations'
ORDER BY tc.table_name, kcu.ordinal_position;
"#;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe these queries can be standalone

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pk is essentially index + uniqueness, the queries run the same way today, any reason to split them?

crates/sui-rpc-benchmark/src/direct/query_generator.rs Outdated Show resolved Hide resolved
Comment on lines 27 to 28
impl QueryGenerator {
async fn get_tables_and_indexes(&self) -> Result<Vec<BenchmarkQuery>, anyhow::Error> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure if this needs to be a struct

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the advantage of avoiding that?

Comment on lines 19 to 20
/// against the database. It can “enrich” each BenchmarkQuery by sampling real
/// data from the relevant table. Each query’s execution is timed and recorded
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// against the database. It can “enrich” each BenchmarkQuery by sampling real
/// data from the relevant table. Each query’s execution is timed and recorded
/// against the database. It can “enrich” each BenchmarkQueryTemplate by sampling real
/// data from the relevant table. Each query’s execution is timed and recorded

The generated query templates are on pk and indexes, and need to be enriched with values sampled from the relevant table

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it'd be great if we could create general sampling guidelines, i.e select some value that is most represented, rarest, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re being presentative, makes sense and sounds further out, I would like to have a runnable tool and see how it can help us before polishing things.

crates/sui-rpc-benchmark/src/direct/query_executor.rs Outdated Show resolved Hide resolved
Comment on lines 65 to 87
let query_generator = QueryGenerator {
db_url: db_url.clone(),
};
let benchmark_queries = query_generator.generate_benchmark_queries().await?;
info!("Generated {} benchmark queries", benchmark_queries.len());

let config = BenchmarkConfig {
concurrency,
duration: Duration::from_secs(duration_secs),
};

let mut query_executor = QueryExecutor::new(&db_url, benchmark_queries, config).await?;
let result = query_executor.run().await?;
info!("Total queries: {}", result.total_queries);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, i was thinking instead we'd have a

  1. generate template phase
  2. enrich template with values sampled from db. or, inject values provided by user. user can set sampling technique
  3. actual benchmarking

Copy link
Contributor

@amnn amnn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for adding the doc comments, they really help make sense of the design. Some high level thoughts:

  • It would be nice to have a clearer phase separation between doing the work to sample the data from the database, enriching queries, and then running the benchmarks. (I think the first two steps in particular could be separated more).
  • It would be very worthwhile to re-use prometheus for implementing various kinds of metric, maybe even exposing them over a metrics service (we have pulled out the metrics service we use in the indexer and RPC into its own crate now as well). This would allow us to focus on worrying about what data we want to track, and not on how to implement, e.g. a sampling histogram.

crates/sui-rpc-benchmark/src/direct/benchmark_config.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/metrics.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/metrics.rs Outdated Show resolved Hide resolved
table_name,
queries: metrics.total_queries,
errors: metrics.errors,
avg_latency_ms: avg_latency,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to track percentiles as well as the mean (p50, p90, p99). Can we re-use prometheus for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

conceptually it's good to also track percentiles and prometheus can come in handy, practically my intent here is to get a sense when DB is overwhelmed, and avg latency suffices for that purpose -- happy to extend that to percentiles if we find outputs of the tool useful and decide to add more granular data.

crates/sui-rpc-benchmark/src/direct/query_executor.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/direct/query_executor.rs Outdated Show resolved Hide resolved
Comment on lines +154 to +85
let params: Vec<Box<dyn ToSql + Sync + Send>> = row
.iter()
.map(|val| match val {
SqlValue::Text(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int4(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Float8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Bool(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int2(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Bytea(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
})
.collect();
let param_refs: Vec<&(dyn ToSql + Sync)> = params
.iter()
.map(|p| p.as_ref() as &(dyn ToSql + Sync))
.collect();

let query_str = enriched.query.query_template.clone();

let start = Instant::now();
let result = client.query(&query_str, &param_refs[..]).await;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The boxing step here should be unnecessary:

Suggested change
let params: Vec<Box<dyn ToSql + Sync + Send>> = row
.iter()
.map(|val| match val {
SqlValue::Text(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int4(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Float8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Bool(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Int2(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
SqlValue::Bytea(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
})
.collect();
let param_refs: Vec<&(dyn ToSql + Sync)> = params
.iter()
.map(|p| p.as_ref() as &(dyn ToSql + Sync))
.collect();
let query_str = enriched.query.query_template.clone();
let start = Instant::now();
let result = client.query(&query_str, &param_refs[..]).await;
let params: Vec<&dyn (ToSql + Sync)>> = row
.iter()
.map(|val| match val {
SqlValue::Text(v) => v,
SqlValue::Int4(v) => v,
SqlValue::Int8(v) => v,
SqlValue::Float8(v) => v,
SqlValue::Bool(v) => v,
SqlValue::Int2(v) => v,
SqlValue::Bytea(v) => v,
})
.collect();
let query_str = enriched.query.query_template.clone();
let start = Instant::now();
let result = client.query(&query_str, &params[..]).await;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it errors out if using & instead of boxing it

cargo build
   Compiling sui-rpc-benchmark v1.42.0 (/Users/gegao/Documents/sui/crates/sui-rpc-benchmark)
error[E0308]: `match` arms have incompatible types
   --> crates/sui-rpc-benchmark/src/direct/query_executor.rs:160:42
    |
158 |                   .map(|val| match val {
    |  ____________________________-
159 | |                     SqlValue::Text(v) => v,
    | |                                          - this is found to be of type `&Option<std::string::String>`
160 | |                     SqlValue::Int4(v) => v,
    | |                                          ^ expected `&Option<String>`, found `&Option<i32>`
161 | |                     SqlValue::Int8(v) => v,
...   |
165 | |                     SqlValue::Bytea(v) => v,
166 | |                 })
    | |_________________- `match` arms have incompatible types
    |
    = note: expected reference `&Option<std::string::String>`
               found reference `&Option<i32>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `sui-rpc-benchmark` (lib) due to 1 previous error

after tweaking the suggested change to let params: Vec<&dyn ToSql + Sync> = row, just removing the un-matched > and un-necessary ()

Comment on lines 182 to 184
if self.enriched_queries.is_empty() {
self.initialize_samples().await?;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to do this in a mutable way like this, or could we have a function that returns a set of enriched queries to run?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure but it does not matter as initialize_samples is a single threaded prep step?

crates/sui-rpc-benchmark/src/direct/query_generator.rs Outdated Show resolved Hide resolved
crates/sui-rpc-benchmark/src/lib.rs Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants