release-24.2: roachtest: fix rebalance/by-load/*/mixed-version shared process tests #135947
Open: blathers-crl wants to merge 2 commits into release-24.2 from blathers/backport-release-24.2-131787
+18
−0
`rebalance/by-load/*` roachtests query the internal timeseries database via HTTP, specifically for the `cr.node.sys.cpu.combined.percent-normalized` metric, which represents the core-count normalized CPU utilization of each node within the cluster over the query window.

The value should inherently be bounded in [0,100], as it would be impossible to use more than all the cores on a host, or use less than none of them. Assert as much.

Informs: #129962
Informs: #131274
Informs: #129464
Release note: None
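The bounds assertion described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `checkNormalizedCPU` and its parameters are hypothetical names, and the input is assumed to already be expressed in percent.

```go
package main

import "fmt"

// checkNormalizedCPU is a hypothetical helper (not the roachtest's actual
// code). It returns an error when a core-count normalized CPU utilization
// sample, expressed in percent, falls outside [0, 100].
func checkNormalizedCPU(nodeID int, pct float64) error {
	// Written as a negated range check so that NaN is rejected too.
	if !(pct >= 0 && pct <= 100) {
		return fmt.Errorf(
			"n%d: normalized CPU utilization %.1f%% outside [0, 100]", nodeID, pct)
	}
	return nil
}
```

Failing fast on an out-of-range sample points at the query itself rather than at the rebalancing logic under test.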
In #129117, the `rebalance/by-load/*/mixed-version` roachtest had shared-process multi-tenancy introduced, which would occasionally cause these tests to fail erroneously.

The cause of all the failures was identical: some nodes reported CPU utilization above 100%, which is impossible, e.g.,

```
CPU not evenly balanced after timeout: outside bounds mean=102.5 tolerance=20.0% (±20.5) bounds=[82.0, 123.0]
    below  = [s3: 81 (-20.7%), s5: 65 (-36.5%)]
    within = [s2: 116 (+14.0%), s4: 92 (-9.7%), s6: 88 (-13.2%)]
    above  = [s1: 170 (+66.1%)]
```

This happened because the query aggregated every tenant's timeseries data on a given node, instead of only the system tenant's.

Update the timeseries utility used to query the CPU to also take in a `TenantID` parameter, which is then used to query only the system tenant.

Fixes: #129962
Fixes: #131274
Fixes: #129464
Release note: None
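To illustrate why aggregating across tenants can exceed 100%, here is a toy model of the per-node aggregation. It is a sketch under assumptions: the `sample` type, `sumByNode`, the use of a plain sum as the aggregation, and the convention that a tenant ID of 0 means "all tenants" are all made up for illustration and are not the timeseries utility's API. The system tenant ID is assumed to be 1.

```go
package main

// systemTenantID is assumed to be 1 for this illustration.
const systemTenantID uint64 = 1

// sample is a hypothetical per-tenant CPU reading on one node, in percent.
type sample struct {
	NodeID   int
	TenantID uint64
	CPUPct   float64
}

// sumByNode adds up CPU readings per node. A tenantID of 0 (a convention
// invented for this sketch) includes every tenant; any other value keeps
// only that tenant's readings.
func sumByNode(samples []sample, tenantID uint64) map[int]float64 {
	out := make(map[int]float64)
	for _, s := range samples {
		if tenantID != 0 && s.TenantID != tenantID {
			continue
		}
		out[s.NodeID] += s.CPUPct
	}
	return out
}
```

With one node reporting 90% for the system tenant and 80% for a secondary tenant, `sumByNode(samples, 0)` yields 170 for that node, while `sumByNode(samples, systemTenantID)` yields 90.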
blathers-crl (bot) force-pushed the blathers/backport-release-24.2-131787 branch from 66a15b1 to e506c64 on November 21, 2024 21:48
blathers-crl (bot) requested review from herkolategan and removed request for a team on November 21, 2024 21:48
blathers-crl (bot) added the blathers-backport (This is a backport that Blathers created automatically.) and O-robot (Originated from a bot.) labels on Nov 21, 2024
blathers-crl (bot) requested review from arulajmani, iskettaneh, kvoli, stevendanna and tbg on November 21, 2024 21:48
Thanks for opening a backport. Please check the backport criteria before merging:
If your backport adds new functionality, please ensure that the following additional criteria are satisfied:
Also, please add a brief release justification to the body of your PR to justify this |
blathers-crl (bot) added the backport (Label PR's that are backports to older release branches) label on Nov 21, 2024
arulajmani approved these changes on Nov 21, 2024
Backport 2/2 commits from #131787 on behalf of @kvoli.
/cc @cockroachdb/release
In #129117, the `rebalance/by-load/*/mixed-version` roachtest had shared-process multi-tenancy introduced, which would occasionally cause these tests to fail erroneously.

The cause of all the failures was identical: some nodes reported CPU utilization above 100%, which is impossible. The query aggregated every tenant's timeseries data on a given node, instead of only the system tenant's.

Update the timeseries utility used to query the CPU to also take in a `TenantID` parameter, which is then used to query only the system tenant.
Fixes: #129962
Fixes: #131274
Fixes: #129464
Release note: None
Release justification: Test only.
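
For readers decoding the failure message quoted in the commit message, the bounds arithmetic can be reproduced roughly as follows. This is a sketch, not the roachtest's code: `classify` and `verdict` are hypothetical names, and the tolerance is assumed to be a fraction applied symmetrically around the mean, which matches the reported `mean=102.5 tolerance=20.0% (±20.5) bounds=[82.0, 123.0]`.

```go
package main

import "sort"

// verdict groups store IDs by where their CPU falls relative to the bounds.
type verdict struct {
	Below, Within, Above []int
}

// classify computes [mean*(1-tolerance), mean*(1+tolerance)] and sorts each
// store into below, within, or above those bounds. Hypothetical helper.
func classify(cpu map[int]float64, mean, tolerance float64) (lo, hi float64, v verdict) {
	lo, hi = mean*(1-tolerance), mean*(1+tolerance)

	// Iterate in store ID order so the output is deterministic.
	ids := make([]int, 0, len(cpu))
	for id := range cpu {
		ids = append(ids, id)
	}
	sort.Ints(ids)

	for _, id := range ids {
		switch c := cpu[id]; {
		case c < lo:
			v.Below = append(v.Below, id)
		case c > hi:
			v.Above = append(v.Above, id)
		default:
			v.Within = append(v.Within, id)
		}
	}
	return lo, hi, v
}
```

Feeding in the per-store values from the message (s1: 170, s2: 116, s3: 81, s4: 92, s5: 65, s6: 88) with a mean of 102.5 and a tolerance of 0.2 puts s3 and s5 below, s2, s4 and s6 within, and s1 above, as in the reported output.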