improvement(perf): add validation rules for latency decorator #9295

soyacz · 2024-11-20T10:12:04Z

Added validation rules for results sent by `latency_calculator_decorator` to Argus.
Each workload and result name (nemesis, predefined step) may set its own rules.

The current rules are based on existing results and are set so that typical good results pass.

closes: #9237
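As a rough illustration of the idea described above, the sketch below checks measured values against per-workload, per-result rules. It is not the code from this PR: the nesting (workload, then result name, then metric), the `example_step` result name, the `find_violations` helper, and the reading of `fixed_limit` as an upper bound are all assumptions made for illustration. Only the `write` workload, the `P90 write` / `P99 write` metric names and the `fixed_limit` key appear in the configuration files under review.

```python
# Hypothetical layout: workload -> result name -> metric -> rule.
# "example_step" is an invented result name; the real files may differ.
THRESHOLDS = {
    "write": {
        "example_step": {
            "P90 write": {"fixed_limit": 1000},
            "P99 write": {"fixed_limit": 1000},
        },
    },
}


def find_violations(thresholds, workload, result_name, measured):
    """Return one message per measured metric that exceeds its fixed_limit.

    Assumes fixed_limit is an upper bound; metrics without a rule, or with
    a rule that has no limit, are not validated.
    """
    rules = thresholds.get(workload, {}).get(result_name, {})
    violations = []
    for metric, value in measured.items():
        limit = (rules.get(metric) or {}).get("fixed_limit")
        if limit is not None and value > limit:
            violations.append(f"{metric}: {value} exceeds fixed_limit {limit}")
    return violations
```

With this layout, a workload or result name that has no entry simply produces no violations, which matches the idea that each one may (but need not) set its own rules.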

Testing

PR pre-checks (self review)

I added the relevant backport labels
I didn't leave commented-out/debugging code

Reminders

Add New configuration option and document them (in sdcm/sct_config.py)
Add unit tests to cover my changes (under unit-test/ folder)
Update the Readme/doc folder relevant to this change (if needed)

sdcm/argus_results.py

soyacz · 2024-11-21T16:16:11Z

@fruch I adjusted the code so that it is configured through a config file. Converted to draft since I haven't added error thresholds for OSS yet.

sdcm/sct_config.py

defaults/test_default.yaml

configurations/performance/latency-decorator-error-thresholds-nemesis-ent.yaml

fruch · 2024-11-25T09:17:31Z

we are missing a configuration for the upgrade cases

configurations/performance/latency-decorator-error-thresholds-steps-ent.yaml

soyacz · 2024-11-26T09:32:44Z

I verified the predefined steps test (with null'ed validation rules for latencies in unthrottled); everything seems to work, except one small issue with Argus: https://argus.scylladb.com/tests/scylla-cluster-tests/9e2af03d-b1a5-4df0-b516-4ce5e624586d
Besides, due to #9294 I'm thinking of taking out the default throughput verifications until that one is closed (so as not to generate false errors).
I'll prepare this PR for final review (and construct rules for tablets scenarios).
Defaults should be good for upgrade, until we want to verify durations for these (@fruch @juliayakovlev let me know if I should add it).

fruch · 2024-11-26T09:45:37Z

test-cases/performance/perf-regression-latency-650gb-with-nemesis.yaml

@@ -1,12 +1,12 @@
 test_duration: 3000
-prepare_write_cmd: ["cassandra-stress write no-warmup cl=ALL n=162500000 -schema 'replication(strategy=NetworkTopologyStrategy,replication_factor=3)' -mode cql3 native -rate threads=200 -col 'size=FIXED(128) n=FIXED(8)' -pop seq=1..162500000",


reminder, taking this out

soyacz · 2024-11-26T10:33:13Z

@fruch @juliayakovlev I think it's ready for review. I based all duration/latency error thresholds on graphs, mostly so that current results pass. I think fine-tuning can be done later at the perf weekly meetings, when the graphs show them.

fruch

LGTM

but let's wait for @roydahan and @juliayakovlev to cross check the figures

soyacz · 2024-12-01T13:45:15Z

@roydahan @juliayakovlev ^^

juliayakovlev · 2024-12-01T17:34:25Z

configurations/performance/latency-decorator-error-thresholds-steps-ent-tablets.yaml

@@ -0,0 +1,98 @@
+latency_decorator_error_thresholds:
+  write:


First step in write test is 200000

juliayakovlev · 2024-12-01T18:31:45Z

configurations/performance/latency-decorator-error-thresholds-steps-ent-vnodes.yaml

@@ -0,0 +1,98 @@
+latency_decorator_error_thresholds:
+  write:


First step in write test is 200000

juliayakovlev

small comments

soyacz · 2024-12-02T12:43:00Z

small comments

Generally, if something is not provided, the defaults are used. But I added it for clarity.
What I'm most concerned about is whether my values are correct: whether I was too lenient with Scylla and the bars should be set lower in some places (especially for durations).
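The "defaults are used when something is not provided" behaviour can be pictured as a recursive merge of an override mapping onto the defaults. This is only a sketch under assumptions: the thread does not show how SCT merges its configuration files, and `merge_thresholds` and the `example_step` result name are hypothetical. A `None` value here stands for a rule that was null'ed in YAML.

```python
import copy


def merge_thresholds(defaults, overrides):
    """Return defaults with overrides applied key by key (hypothetical helper).

    Nested mappings are merged recursively; any other value, including None,
    replaces the default. Neither input is modified.
    """
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_thresholds(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative data only; "example_step" is an invented result name.
defaults = {
    "write": {
        "example_step": {
            "P90 write": {"fixed_limit": 1000},
            "P99 write": {"fixed_limit": 1000},
        },
    },
}
overrides = {"write": {"example_step": {"P99 write": {"fixed_limit": None}}}}
merged = merge_thresholds(defaults, overrides)
```

In this sketch the omitted `P90 write` rule keeps its default, while the explicitly null'ed `P99 write` limit is cleared. Repeating a default in an override file changes nothing functionally, which is why it can be added purely for clarity.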

fruch · 2024-12-03T15:50:22Z

@soyacz

you have a small conflict here

soyacz · 2024-12-03T15:54:34Z

@soyacz

you have a small conflict here

fixed

roydahan · 2024-12-03T20:53:56Z

configurations/performance/latency-decorator-error-thresholds-steps-ent-vnodes.yaml

+      P90 write:
+        fixed_limit: 1000
+      P99 write:
+        fixed_limit: 1000


What units are these numbers?
What is 1000? ms?

Isn't it way too high?

Let's sit together and define the numbers we want to define here.

soyacz requested review from fruch, roydahan and juliayakovlev November 20, 2024 10:12

github-actions bot assigned soyacz Nov 20, 2024

fruch reviewed Nov 20, 2024

sdcm/argus_results.py Outdated

fruch reviewed Nov 20, 2024

sdcm/argus_results.py Outdated

soyacz force-pushed the add-limits-to-latency-decorator branch from 09a3d76 to bfbe9d1 Compare November 21, 2024 16:10

soyacz marked this pull request as draft November 21, 2024 16:14

soyacz added backport/perf-v15 backport/perf-v16 labels Nov 21, 2024

fruch reviewed Nov 21, 2024

sdcm/sct_config.py Outdated

fruch reviewed Nov 21, 2024

defaults/test_default.yaml Outdated

fruch reviewed Nov 21, 2024

configurations/performance/latency-decorator-error-thresholds-nemesis-ent.yaml Outdated

soyacz force-pushed the add-limits-to-latency-decorator branch from bfbe9d1 to 2530138 Compare November 22, 2024 09:18

fruch reviewed Nov 25, 2024

configurations/performance/latency-decorator-error-thresholds-steps-ent.yaml Outdated

soyacz force-pushed the add-limits-to-latency-decorator branch 2 times, most recently from 733c652 to c0a7676 Compare November 25, 2024 17:53

fruch reviewed Nov 26, 2024

soyacz force-pushed the add-limits-to-latency-decorator branch 3 times, most recently from 588f762 to 5e07743 Compare November 26, 2024 10:25

soyacz marked this pull request as ready for review November 26, 2024 10:28

soyacz force-pushed the add-limits-to-latency-decorator branch from 5e07743 to 993c44c Compare November 26, 2024 15:49

fruch previously approved these changes Nov 27, 2024

juliayakovlev reviewed Dec 1, 2024

juliayakovlev requested changes Dec 1, 2024

soyacz dismissed fruch’s stale review via 8a72dfa December 2, 2024 12:40

soyacz force-pushed the add-limits-to-latency-decorator branch 2 times, most recently from 8a72dfa to 4adf310 Compare December 2, 2024 12:41

soyacz requested a review from juliayakovlev December 3, 2024 14:09

soyacz force-pushed the add-limits-to-latency-decorator branch from 4adf310 to ca452ea Compare December 3, 2024 15:54

roydahan reviewed Dec 3, 2024

soyacz Dec 4, 2024

roydahan Dec 4, 2024

roydahan Dec 9, 2024
