rpc benchmark init: directly read DB based on schema #20378

gegaowp · 2024-11-21T23:00:10Z

Description

Main things in this PR:

generate template queries based on tables in the DB, excluding __diesel_schema_migrations
enrich the templates with values read from the DB and execute the queries in parallel
configuration for concurrency and run duration
metrics collection and reports

Test plan

local run with a local DB populated with latest sui-indexer-alt-schema

cargo run --bin sui-rpc-benchmark -- direct \
    --db-url "postgres://postgres:postgres@localhost/gegao" \
    --concurrency 10 \
    --duration-secs 10

report with local DB


Total queries: 211772
Total errors: 0
Average latency: 0.46ms

Per-table statistics:
  obj_info                       queries: 33526    errors: 0        avg latency: 0.49ms
  ev_struct_inst                 queries: 31598    errors: 0        avg latency: 0.47ms
  tx_calls                       queries: 28752    errors: 0        avg latency: 0.45ms
  ev_emit_mod                    queries: 18321    errors: 0        avg latency: 0.44ms
  tx_affected_objects            queries: 11495    errors: 0        avg latency: 0.43ms
  tx_affected_addresses          queries: 11472    errors: 0        avg latency: 0.43ms
  sum_packages                   queries: 10759    errors: 0        avg latency: 0.55ms
  tx_kinds                       queries: 8140     errors: 0        avg latency: 0.43ms
  coin_balance_buckets           queries: 7542     errors: 0        avg latency: 0.45ms
  obj_versions                   queries: 7485     errors: 0        avg latency: 0.43ms
  kv_epoch_ends                  queries: 3750     errors: 0        avg latency: 0.45ms
  watermarks                     queries: 3742     errors: 0        avg latency: 0.44ms
  tx_digests                     queries: 3690     errors: 0        avg latency: 0.42ms
  tx_balance_changes             queries: 3662     errors: 0        avg latency: 0.42ms
  kv_checkpoints                 queries: 3619     errors: 0        avg latency: 0.43ms
  cp_sequence_numbers            queries: 3590     errors: 0        avg latency: 0.42ms
  kv_transactions                queries: 3463     errors: 0        avg latency: 0.43ms
  kv_protocol_configs            queries: 3451     errors: 0        avg latency: 0.44ms
  kv_epoch_starts                queries: 3448     errors: 0        avg latency: 0.71ms
  kv_genesis                     queries: 3424     errors: 0        avg latency: 0.42ms
  kv_objects                     queries: 3422     errors: 0        avg latency: 0.42ms
  kv_feature_flags               queries: 3421     errors: 0        avg latency: 0.42ms



lxfind · 2024-11-25T20:24:33Z

generate template queries based on schema, specifically table primary key and indexes

What is the rationale behind these generated queries? Would they be representative of what we are interested in?

gegaowp · 2024-11-25T20:33:10Z

@lxfind the generated queries are currently "all supported queries based on primary key and indices" and do not take representativeness into account. As a follow-up, I plan to make this configurable, for example by taking in a file of weights.


wlmyng · 2025-01-29T20:31:37Z

crates/sui-rpc-benchmark/src/direct/benchmark_config.rs

+    /// Duration to run the benchmark in seconds
+    pub duration: Duration,


is this per query, or for the entire benchmark? Commenting before looking through rest of code so might become clearer, but still good to revisit and clarify

this is essentially the time-out of the whole benchmark run, changed the field name to be more explicit.
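To illustrate the "time-out of the whole run" reading, here is a minimal sketch using only the standard library. It is not the PR's code: the PR runs on an async runtime, whereas this uses OS threads, and the names `run_for`, `timeout` and `work` are hypothetical. The single deadline is shared by all workers, so the duration bounds the entire benchmark rather than any one query.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Runs `work` on `concurrency` threads until one shared deadline passes.
/// Returns the total number of completed calls across all workers.
/// (Hypothetical helper; threads stand in for the PR's async tasks.)
fn run_for(concurrency: usize, timeout: Duration, work: fn()) -> u64 {
    // One deadline for the whole run, not one per query.
    let deadline = Instant::now() + timeout;
    let completed = Arc::new(AtomicU64::new(0));

    let handles: Vec<_> = (0..concurrency)
        .map(|_| {
            let completed = Arc::clone(&completed);
            thread::spawn(move || {
                // A call that starts before the deadline is allowed to finish.
                while Instant::now() < deadline {
                    work();
                    completed.fetch_add(1, Ordering::Relaxed);
                }
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }
    completed.load(Ordering::Relaxed)
}
```

A call such as `run_for(10, Duration::from_secs(10), some_query_fn)` would mirror the `--concurrency 10 --duration-secs 10` invocation in the test plan, with `some_query_fn` being a placeholder.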


wlmyng · 2025-01-29T20:37:23Z

crates/sui-rpc-benchmark/src/direct/metrics.rs

+#[derive(Clone, Default)]
+pub struct MetricsCollector {
+    metrics: Arc<DashMap<String, QueryMetrics>>,
+}
+
+impl MetricsCollector {


if we envision MetricsCollector as an orchestrator of various benchmark reports, I wonder if we can do something similar to indexer alt framework, where record_query simply passes on received data to each BenchmarkResult struct, and generate_report just finishes each struct

MetricsCollector accumulates the benchmark records in the metrics map and generates one report out of the map, so not exactly orchestration -- what's the suggested alternative way and what's the advantage ?
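For readers following this thread, the accumulate-then-report shape described in the reply can be sketched as below. This is an approximation under stated assumptions, not the PR's implementation: a `Mutex<HashMap>` replaces the `DashMap` from the diff so that the example needs no external crates, the `total_latency` field and the tuple-shaped report are hypothetical, and only the names `MetricsCollector`, `QueryMetrics`, `record_query`, `generate_report`, `total_queries` and `errors` come from the discussion.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::Duration;

#[derive(Clone, Debug, Default)]
struct QueryMetrics {
    total_queries: u64,
    errors: u64,
    total_latency: Duration, // hypothetical field used to derive the average
}

#[derive(Clone, Default)]
struct MetricsCollector {
    // The PR uses Arc<DashMap<..>>; a Mutex-guarded HashMap is a std-only stand-in.
    metrics: Arc<Mutex<HashMap<String, QueryMetrics>>>,
}

impl MetricsCollector {
    /// Accumulates one observation into the per-table entry.
    fn record_query(&self, table: &str, latency: Duration, is_error: bool) {
        let mut map = self.metrics.lock().expect("metrics lock poisoned");
        let entry = map.entry(table.to_string()).or_default();
        entry.total_queries += 1;
        entry.total_latency += latency;
        if is_error {
            entry.errors += 1;
        }
    }

    /// Produces one report from the map: (table, queries, errors, avg latency in ms),
    /// sorted by query count in descending order.
    fn generate_report(&self) -> Vec<(String, u64, u64, f64)> {
        let map = self.metrics.lock().expect("metrics lock poisoned");
        let mut rows: Vec<_> = map
            .iter()
            .map(|(table, m)| {
                let avg_ms = m.total_latency.as_secs_f64() * 1000.0 / m.total_queries as f64;
                (table.clone(), m.total_queries, m.errors, avg_ms)
            })
            .collect();
        rows.sort_by(|a, b| b.1.cmp(&a.1));
        rows
    }
}
```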

wlmyng · 2025-01-29T20:47:34Z

crates/sui-rpc-benchmark/src/direct/query_generator.rs

+        let tables_query = r#"
+            SELECT tablename 
+            FROM pg_tables 
+            WHERE schemaname = 'public' 
+            AND tablename != '__diesel_schema_migrations'
+            ORDER BY tablename;
+        "#;
+        let tables: Vec<String> = client
+            .query(tables_query, &[])
+            .await?
+            .iter()
+            .map(|row| row.get::<_, String>(0))
+            .collect();
+        info!(
+            "Found {} active tables in database: {:?}",
+            tables.len(),
+            tables
+        );
+
+        let pk_query = r#"
+            SELECT tc.table_name, kcu.column_name
+            FROM information_schema.table_constraints tc
+            JOIN information_schema.key_column_usage kcu 
+                ON tc.constraint_name = kcu.constraint_name
+            WHERE tc.constraint_type = 'PRIMARY KEY'
+                AND tc.table_schema = 'public'
+                AND tc.table_name != '__diesel_schema_migrations'
+            ORDER BY tc.table_name, kcu.ordinal_position;
+        "#;


maybe these queries can be standalone

pk is essentially index + uniqueness, the queries run the same way today, any reason to split them?
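As a rough illustration of how the `(table_name, column_name)` rows returned by a query like `pk_query` could be turned into parameterized templates, consider the sketch below. The exact SQL shape the PR generates is not shown in this thread, so the `SELECT * ... LIMIT 1` form, the function names and the column names used in the usage note are all assumptions.

```rust
use std::collections::BTreeMap;

/// Groups (table, column) rows into an ordered column list per table.
/// Rows are assumed to arrive ordered by table and key position.
fn group_key_columns(rows: &[(&str, &str)]) -> BTreeMap<String, Vec<String>> {
    let mut by_table: BTreeMap<String, Vec<String>> = BTreeMap::new();
    for (table, column) in rows {
        by_table
            .entry(table.to_string())
            .or_default()
            .push(column.to_string());
    }
    by_table
}

/// Builds an equality lookup over the given key columns with $1, $2, ...
/// placeholders, to be filled later with values sampled from the table.
fn query_template(table: &str, columns: &[String]) -> String {
    let predicates: Vec<String> = columns
        .iter()
        .enumerate()
        .map(|(i, column)| format!("{} = ${}", column, i + 1))
        .collect();
    format!(
        "SELECT * FROM {} WHERE {} LIMIT 1",
        table,
        predicates.join(" AND ")
    )
}
```

With hypothetical key columns `object_id` and `object_version` on `obj_versions`, this would yield `SELECT * FROM obj_versions WHERE object_id = $1 AND object_version = $2 LIMIT 1`. The same function applies unchanged to index columns, which is consistent with the reply above that primary keys and indexes are treated the same way.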


wlmyng · 2025-01-29T20:55:46Z

crates/sui-rpc-benchmark/src/direct/query_generator.rs

+impl QueryGenerator {
+    async fn get_tables_and_indexes(&self) -> Result<Vec<BenchmarkQuery>, anyhow::Error> {


not sure if this needs to be a struct

what's the advantage of avoiding that?

wlmyng · 2025-01-29T20:56:24Z

crates/sui-rpc-benchmark/src/direct/query_executor.rs

+/// against the database. It can “enrich” each BenchmarkQuery by sampling real
+/// data from the relevant table. Each query’s execution is timed and recorded


Suggested change

-/// against the database. It can “enrich” each BenchmarkQuery by sampling real
-/// data from the relevant table. Each query’s execution is timed and recorded
+/// against the database. It can “enrich” each BenchmarkQueryTemplate by sampling real
+/// data from the relevant table. Each query’s execution is timed and recorded

The generated query templates are on pk and indexes, and need to be enriched with values sampled from the relevant table

it'd be great if we could create general sampling guidelines, e.g. select some value that is most represented, rarest, etc.

re being representative: makes sense, but that sounds further out. I would like to have a runnable tool and see how it can help us before polishing things.


wlmyng · 2025-01-29T21:01:58Z

crates/sui-rpc-benchmark/src/lib.rs

+            let query_generator = QueryGenerator {
+                db_url: db_url.clone(),
+            };
+            let benchmark_queries = query_generator.generate_benchmark_queries().await?;
+            info!("Generated {} benchmark queries", benchmark_queries.len());
+
+            let config = BenchmarkConfig {
+                concurrency,
+                duration: Duration::from_secs(duration_secs),
+            };
+
+            let mut query_executor = QueryExecutor::new(&db_url, benchmark_queries, config).await?;
+            let result = query_executor.run().await?;
+            info!("Total queries: {}", result.total_queries);


right, i was thinking instead we'd have a

generate template phase

enrich template with values sampled from db. or, inject values provided by user. user can set sampling technique

actual benchmarking

amnn

Thanks for adding the doc comments, they really help make sense of the design. Some high level thoughts:

It would be nice to have a clearer phase separation between doing the work to sample the data from the database, enriching queries, and then running the benchmarks. (I think the first two steps in particular could be separated more).
It would be very worthwhile to re-use prometheus for implementing various kinds of metric, maybe even exposing them over a metrics service (we have pulled out the metrics service we use in the indexer and RPC into its own crate now as well). This would allow us to focus on worrying about what data we want to track, and not on how to implement, e.g. a sampling histogram.


amnn · 2025-01-31T12:32:48Z

crates/sui-rpc-benchmark/src/direct/metrics.rs

+                table_name,
+                queries: metrics.total_queries,
+                errors: metrics.errors,
+                avg_latency_ms: avg_latency,


It would be good to track percentiles as well as the mean (p50, p90, p99). Can we re-use prometheus for this?

conceptually it's good to also track percentiles and prometheus can come in handy, practically my intent here is to get a sense when DB is overwhelmed, and avg latency suffices for that purpose -- happy to extend that to percentiles if we find outputs of the tool useful and decide to add more granular data.
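For reference, percentiles can be approximated without any metrics library by sorting the recorded latencies and taking nearest-rank values. Note that this is a deliberately simple substitute for the prometheus histogram suggested above, not what the PR implements; it keeps every sample in memory, and the function names are hypothetical.

```rust
/// Nearest-rank percentile over an ascending-sorted, non-empty slice.
/// `p` is a whole-number percentile in 1..=100.
fn percentile(sorted_us: &[u64], p: usize) -> u64 {
    // rank = ceil(p * n / 100), computed in integers to avoid float rounding.
    let rank = (p * sorted_us.len() + 99) / 100;
    sorted_us[rank.clamp(1, sorted_us.len()) - 1]
}

/// Returns (p50, p90, p99) in microseconds, or None when nothing was recorded.
fn latency_percentiles(mut samples_us: Vec<u64>) -> Option<(u64, u64, u64)> {
    if samples_us.is_empty() {
        return None;
    }
    samples_us.sort_unstable();
    Some((
        percentile(&samples_us, 50),
        percentile(&samples_us, 90),
        percentile(&samples_us, 99),
    ))
}
```

A histogram with fixed buckets, as prometheus provides, trades this exactness for bounded memory, which matters at the query volumes shown in the report above.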


amnn · 2025-01-31T12:55:47Z

crates/sui-rpc-benchmark/src/direct/query_executor.rs

+            let params: Vec<Box<dyn ToSql + Sync + Send>> = row
+                .iter()
+                .map(|val| match val {
+                    SqlValue::Text(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Int4(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Int8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Float8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Bool(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Int2(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                    SqlValue::Bytea(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
+                })
+                .collect();
+            let param_refs: Vec<&(dyn ToSql + Sync)> = params
+                .iter()
+                .map(|p| p.as_ref() as &(dyn ToSql + Sync))
+                .collect();
+
+            let query_str = enriched.query.query_template.clone();
+
+            let start = Instant::now();
+            let result = client.query(&query_str, &param_refs[..]).await;


The boxing step here should be unnecessary:

Suggested change

-            let params: Vec<Box<dyn ToSql + Sync + Send>> = row
-                .iter()
-                .map(|val| match val {
-                    SqlValue::Text(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Int4(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Int8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Float8(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Bool(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Int2(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                    SqlValue::Bytea(v) => Box::new(v) as Box<dyn ToSql + Sync + Send>,
-                })
-                .collect();
-            let param_refs: Vec<&(dyn ToSql + Sync)> = params
-                .iter()
-                .map(|p| p.as_ref() as &(dyn ToSql + Sync))
-                .collect();
-
-            let query_str = enriched.query.query_template.clone();
-
-            let start = Instant::now();
-            let result = client.query(&query_str, &param_refs[..]).await;
+            let params: Vec<&dyn (ToSql + Sync)>> = row
+                .iter()
+                .map(|val| match val {
+                    SqlValue::Text(v) => v,
+                    SqlValue::Int4(v) => v,
+                    SqlValue::Int8(v) => v,
+                    SqlValue::Float8(v) => v,
+                    SqlValue::Bool(v) => v,
+                    SqlValue::Int2(v) => v,
+                    SqlValue::Bytea(v) => v,
+                })
+                .collect();
+
+            let query_str = enriched.query.query_template.clone();
+
+            let start = Instant::now();
+            let result = client.query(&query_str, &params[..]).await;

it errors out if using & instead of boxing it:

cargo build
   Compiling sui-rpc-benchmark v1.42.0 (/Users/gegao/Documents/sui/crates/sui-rpc-benchmark)
error[E0308]: `match` arms have incompatible types
   --> crates/sui-rpc-benchmark/src/direct/query_executor.rs:160:42
    |
158 |                   .map(|val| match val {
    |  ____________________________-
159 | |                     SqlValue::Text(v) => v,
    | |                                          - this is found to be of type `&Option<std::string::String>`
160 | |                     SqlValue::Int4(v) => v,
    | |                                          ^ expected `&Option<String>`, found `&Option<i32>`
161 | |                     SqlValue::Int8(v) => v,
...   |
165 | |                     SqlValue::Bytea(v) => v,
166 | |                 })
    | |_________________- `match` arms have incompatible types
    |
    = note: expected reference `&Option<std::string::String>`
               found reference `&Option<i32>`

For more information about this error, try `rustc --explain E0308`.
error: could not compile `sui-rpc-benchmark` (lib) due to 1 previous error

This is after tweaking the suggested change to let params: Vec<&dyn ToSql + Sync> = row, i.e. just removing the unmatched > and the unnecessary ().
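The E0308 error arises because each bare `=> v` arm has a different concrete reference type, and the compiler does not unify them into a trait object on its own. One way to borrow without boxing is to cast every arm explicitly. The sketch below shows the idea with a cut-down, hypothetical `SqlValue` and with `Debug + Sync` standing in for `ToSql + Sync` so that it compiles without external crates; it has not been tested against the PR, where the cast would presumably be `v as &(dyn ToSql + Sync)`.

```rust
use std::fmt::Debug;

// Hypothetical stand-in for the PR's SqlValue enum (only three variants shown).
enum SqlValue {
    Text(Option<String>),
    Int4(Option<i32>),
    Bool(Option<bool>),
}

/// Borrows each payload as a trait object without allocating.
/// `Debug + Sync` plays the role of `ToSql + Sync` in this sketch.
fn as_params(row: &[SqlValue]) -> Vec<&(dyn Debug + Sync)> {
    row.iter()
        .map(|val| match val {
            // The explicit cast gives every arm the same type, which is
            // what the bare `=> v` arms in the suggestion were missing.
            SqlValue::Text(v) => v as &(dyn Debug + Sync),
            SqlValue::Int4(v) => v as &(dyn Debug + Sync),
            SqlValue::Bool(v) => v as &(dyn Debug + Sync),
        })
        .collect()
}
```

Note the parentheses in `&(dyn Debug + Sync)`: a reference to a trait object with more than one bound needs them, so `Vec<&dyn ToSql + Sync>` as quoted above would need the same adjustment.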

amnn · 2025-01-31T12:56:27Z

crates/sui-rpc-benchmark/src/direct/query_executor.rs

+        if self.enriched_queries.is_empty() {
+            self.initialize_samples().await?;
+        }


Do we need to do this in a mutable way like this, or could we have a function that returns a set of enriched queries to run?

sure but it does not matter as initialize_samples is a single threaded prep step?
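The non-mutating alternative raised in this thread could look roughly like the following: enrichment becomes a function from templates to enriched queries, and the executor only ever receives the finished set. All type and function names here are hypothetical, and the `sample_rows` closure stands in for the asynchronous database sampling that the PR performs in `initialize_samples`.

```rust
#[derive(Clone, Debug, PartialEq)]
struct QueryTemplate {
    table: String,
    sql: String,
}

#[derive(Debug, PartialEq)]
struct EnrichedQuery {
    template: QueryTemplate,
    // Each inner Vec holds one set of parameter values for the template.
    rows: Vec<Vec<String>>,
}

/// Enrichment phase as a pure function: takes templates, returns enriched
/// queries, and mutates nothing. Templates whose table yields no sample
/// rows are dropped, since there is nothing to bind to the placeholders.
fn enrich_queries<F>(templates: &[QueryTemplate], sample_rows: F) -> Vec<EnrichedQuery>
where
    F: Fn(&QueryTemplate) -> Vec<Vec<String>>,
{
    templates
        .iter()
        .map(|template| EnrichedQuery {
            template: template.clone(),
            rows: sample_rows(template),
        })
        .filter(|query| !query.rows.is_empty())
        .collect()
}
```

Besides avoiding the `is_empty` check inside `run`, this shape makes it possible to swap the sampler, for instance to inject user-provided values, without touching the executor.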

gegaowp added 4 commits January 17, 2025 17:00

rpc benchmark: init commit of direct reading DB based on schema

0ce8889

address comments: migration path as arg and return values

6ff9397

refactor, add metrics and parallel run

3083263

address comments

570730d


comments

bb6c34c


comments, including rafactor to split template generator, enricher and executor

592ae3d
