[KS-354] Load test for Streams Trigger #13643
Force-pushed from 7697725 to 1b6fcc1
Force-pushed from 1b6fcc1 to 5704c19
// Load test that combines Trigger Subscriber, Streams Trigger Aggregator and Streams Codec.
// It measures time needed to receive and process trigger events from multiple nodes and produce a local aggregated event.
// For more meaningful measurements, increase the values of parameters P and T.
func TestStreamsTrigger_Load(t *testing.T) {
How do we run and monitor load tests in CI? This won't be run for PRs, right?
The parameters default to very low values (P=2, T=2), so the test takes less than 100ms to run. The "load" part of it won't be exercised in CI, but it still tests the integration of certain components and validates outputs, so I think there is value in having it executed on every PR.
In terms of actual load testing, so far I've only done manual runs with increased params to measure time and profile it. I'm open to suggestions for how to automate it. Maybe we could have a catch-all test that is disabled in CI but runs all load tests with high params? (Still to be run manually.)
Overall I'm skeptical about running load tests in CI. They waste resources, and a CI environment likely doesn't have consistent performance (in fact, a good CI should vary CPU load on purpose to catch flakes).
Yes, I'm also skeptical, which is the reason I asked.
It would be ideal to have nightly perf tests in a stable environment that we could monitor for trends.
The test name is misleading, given that this test can't be run in an automated way to find load-related problems; perhaps "integration" or "sanity" would be better than "load".
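Go's conventional way to keep the cheap path on every PR while gating heavier parameters is the `-short` flag. A sketch of that convention (the parameter values here are illustrative, not the PR's actual numbers):

```go
package main

import (
	"flag"
	"fmt"
	"testing"
)

// chooseParams returns sanity-level parameters when -short is set
// (as a CI run typically would) and heavier ones for manual perf runs.
func chooseParams(short bool) (p, t int) {
	if short {
		return 2, 2 // fast sanity/integration check
	}
	return 100, 10000 // manual load-profiling values
}

func main() {
	testing.Init() // registers -test.short so testing.Short() can be queried
	flag.Parse()
	p, t := chooseParams(testing.Short())
	fmt.Printf("running with P=%d T=%d\n", p, t)
}
```

With this, `go test -short ./...` in CI exercises the integration path, and a plain manual `go test` run uses the load-level parameters.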
Renamed to simply "TestStreamsTrigger"
// send and process all trigger events
startTs := time.Now().UnixMilli()
processingTime := int64(0)
for c := 0; c < T; c++ {
Does this log any status or progress? I imagine this test needs to be run with some sort of timeout. If it runs for 5 minutes and the test ctx is cancelled, will you know how far this got relative to the total work to be done?
So far I've only relied on the log from line 137 below. I doubt there will ever be a need to run it for periods longer than the test framework's timeouts. If the test fails for any reason, then we won't get the log summary (or we'll hit an assert), and the run won't produce meaningful results anyway.
Force-pushed from 5704c19 to 3e5dde7
decoding + validation + aggregation
Force-pushed from 3e5dde7 to 6bcd0bc
for j := 0; j < P; j++ { // ... sends reports for every feed ...
	reportIdx := (i + j) % R
	signatures := make([][]byte, F+1)
	for k := 0; k < F+1; k++ { // ... each signed by F+1 nodes
Do we want to profile failure scenarios? Or are we good with just happy path load testing?
I think we're good. Failures shouldn't be heavier than successful ones.
Although that makes me think that we might need some extra protections against malicious nodes trying to attach too many signatures - I might add something in a separate PR, thanks!
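A sketch of the kind of extra protection mentioned: rejecting reports that carry a signature count other than the F+1 the aggregator expects, so a malicious node can't inflate verification cost. The function name and exact bound are assumptions, not the PR's code:

```go
package main

import "fmt"

// validateSignatureCount bounds per-report signature verification work.
// F is the fault-tolerance parameter; exactly F+1 signatures are expected,
// so anything above that is rejected before any signature is verified.
func validateSignatureCount(sigs [][]byte, f int) error {
	if len(sigs) > f+1 {
		return fmt.Errorf("too many signatures: got %d, max %d", len(sigs), f+1)
	}
	if len(sigs) < f+1 {
		return fmt.Errorf("not enough signatures: got %d, need %d", len(sigs), f+1)
	}
	return nil
}

func main() {
	// With F=2, a report must carry exactly 3 signatures.
	fmt.Println(validateSignatureCount(make([][]byte, 3), 2))
}
```

Doing this check before cryptographic verification keeps the cost of a malformed report proportional to its header, not its padded signature list.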
https://smartcontract-it.atlassian.net/browse/KS-354