-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Throttler multi-metrics: proto changes #16040
Throttler multi-metrics: proto changes #16040
Conversation
Signed-off-by: Shlomi Noach <[email protected]>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #16040 +/- ##
==========================================
- Coverage 68.56% 68.55% -0.02%
==========================================
Files 1544 1544
Lines 197873 197885 +12
==========================================
- Hits 135670 135653 -17
- Misses 62203 62232 +29 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Shlomi Noach <[email protected]>
Signed-off-by: Shlomi Noach <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Only had a couple of minor comments/questions.
bool ok_if_not_exists = 4; | ||
// MultiMetricsEnabled is always set to "true" and is how a multi-metrics enabled replica | ||
// throttler knows its being probed by a multi-metrics enabled primary vttablet. | ||
bool multi_metrics_enabled = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's true, why do we need it? The zero value will be false so we'll have to be sure we're setting it to true everywhere. When should it be false? Is it related to backwards compatibility, or mixed tablet versions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it related to backwards compatibility,
Yes! As per the comment to this variable which I could probably rephrase better.
... and is how a multi-metrics enabled replica
// throttler knows its being probed by a multi-metrics enabled primary vttablet.
Meaning a "new" replica would respond like an "old" replica would, if this boolean is false
, because that indicates an "old" primary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flag will be completely removed in v22.
// Scope used in this check | ||
string scope = 7; | ||
} | ||
map<string, Metric> metrics = 7; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the key here? Can you please add a comment on that? Maybe it's the tablet UID or something (the larger PR probably answers this). If it's name, however, then we could remove name from the Metric message itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments in code.
// Metrics is a map (metric name -> metric value/error) so that the client has as much
// information as possible about all the checked metrics.
message GetThrottlerStatusRequest { | ||
} | ||
|
||
message GetThrottlerStatusResponse { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! So I think this is a migration of the HTTP status endpoint to an RPC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct!
vttime.Time last_healthy_at = 1; | ||
int64 seconds_since_last_healthy = 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a little redundant to me, but fine as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's redundant -- but great for metrics! Which is why I added it. This represents really well as a gauge.
Signed-off-by: Shlomi Noach <[email protected]>
Change of plans. By popular demand, rebasing this against |
Signed-off-by: Shlomi Noach <[email protected]>
Rebased on |
Signed-off-by: Shlomi Noach <[email protected]>
Description
This PR is the first in the series of incremental changes breaking down #16039 into smaller parts.
We begin by applying the
proto
changes.Notable:
CheckThrottler
gRPC call already exists inv19
and before, and is/was used by the Primary throttler to probe its replicas, and by the replica throttler to notify the Primary about recent checks.CheckThrottlerRequest
andCheckThrottlerResponse
with more fields. We will later makeCheckThrottler
avtctidlclient
command. A check requests identifies by app name, and now also clearly indicates whether multi-metrics are supported/desired (important for backwards compatibility and cross-version interaction), as well as other flags that will be come clearer later on.GetThrottlerStatus
,GetThrottlerStatusRequest
,GetThrottlerStatusResponse
are new calls, and useful for debugging/incident analysis. These return the equivalent of/throttler/status
, but now via gRPC instead of HTTP. Likewise, this will be implemented as avtctldclient
in a followup PR.ThrottlerConfig
now supports:AppCheckedMetrics
- mapping of app->metrics (meaning which metrics the throttler checks on behalf of given app)MetricThresholds
- per metric threshold (different threshold forlag
, forthreads_running
, etc.)See the docs in #16039 main comment for more.
This PR merges into
vitessio:throttler-multi-metrics-incremental
, which is a Draft PR, and not intomain
.https://github.com/planetscale/vitess/compare/vitessio:vitess:throttler-multi-metrics-incremental...planetscale:vitess:throttler-multi-metrics-incremental-proto?expand=1
Related Issue(s)
#15988
Checklist
Deployment Notes