-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vtorc
: fetch all tablets from cells once + filter during refresh
#17388
vtorc
: fetch all tablets from cells once + filter during refresh
#17388
Conversation
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
vtorc
: fetch all tablets from cells once and filtervtorc
: fetch all tablets from cells once + filter during refresh
Signed-off-by: Tim Vaillancourt <[email protected]>
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #17388 +/- ##
==========================================
- Coverage 67.71% 67.69% -0.02%
==========================================
Files 1584 1584
Lines 254519 254541 +22
==========================================
- Hits 172338 172316 -22
- Misses 82181 82225 +44 ☔ View full report in Codecov by Sentry. |
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
This introduces a problem if shards are added/removed. Moving back to draft |
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
This PR is being marked as stale because it has been open for 30 days with no activity. To rectify, you may do any of the following:
If no action is taken within 7 days, this PR will be closed. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Signed-off-by: Tim Vaillancourt <[email protected]>
Signed-off-by: Tim Vaillancourt <[email protected]>
…itessio#17388) Signed-off-by: Tim Vaillancourt <[email protected]> Signed-off-by: garfthoffman <[email protected]>
…itessio#17388) Signed-off-by: Tim Vaillancourt <[email protected]>
* Ensure all topo read calls consider `--topo_read_concurrency` (vitessio#17276) Signed-off-by: Tim Vaillancourt <[email protected]> * Revert "add keyrange support for vtorc clusters_to_watch (#457)" This reverts commit 45c2199. * [release-19.0] `vtorc`: require topo for `Healthy: true` in `/debug/health` (vitessio#17129) (vitessio#17351) Signed-off-by: Tim Vaillancourt <[email protected]> Signed-off-by: Manan Gupta <[email protected]> Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com> Co-authored-by: Tim Vaillancourt <[email protected]> Co-authored-by: Manan Gupta <[email protected]> * `vtorc`: fetch all tablets from cells once + filter during refresh (vitessio#17388) Signed-off-by: Tim Vaillancourt <[email protected]> * Support KeyRange in `--clusters_to_watch` flag (vitessio#17604) Signed-off-by: Manan Gupta <[email protected]> * missing func Signed-off-by: Tim Vaillancourt <[email protected]> * Add api end point to print the current database state in VTOrc (vitessio#15485) Signed-off-by: Manan Gupta <[email protected]> --------- Signed-off-by: Tim Vaillancourt <[email protected]> Signed-off-by: Manan Gupta <[email protected]> Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com> Co-authored-by: Manan Gupta <[email protected]> Co-authored-by: Manan Gupta <[email protected]>
…itessio#17388) Signed-off-by: Tim Vaillancourt <[email protected]>
* Move to native sqlite3 queries (vitessio#17124) Signed-off-by: Dirkjan Bussink <[email protected]> Co-authored-by: Tim Vaillancourt <[email protected]> * Improve efficiency of `vtorc` topo calls (vitessio#17071) Signed-off-by: Tim Vaillancourt <[email protected]> Co-authored-by: Matt Lord <[email protected]> * Ensure all topo read calls consider `--topo_read_concurrency` (vitessio#17276) Signed-off-by: Tim Vaillancourt <[email protected]> * Avoid flaky topo concurrency test (vitessio#17407) Signed-off-by: Tim Vaillancourt <[email protected]> * `vtorc`: fetch all tablets from cells once + filter during refresh (vitessio#17388) Signed-off-by: Tim Vaillancourt <[email protected]> * Support KeyRange in `--clusters_to_watch` flag (vitessio#17604) Signed-off-by: Manan Gupta <[email protected]> * `vtorc`: improve handling of partial cell topo results (vitessio#17718) Signed-off-by: Tim Vaillancourt <[email protected]> * Add stats for shards watched by VTOrc Signed-off-by: Tim Vaillancourt <[email protected]> * add more tests Signed-off-by: Tim Vaillancourt <[email protected]> * cleanup Signed-off-by: Tim Vaillancourt <[email protected]> * fix ineffassign Signed-off-by: Tim Vaillancourt <[email protected]> * fix test for v21 Signed-off-by: Tim Vaillancourt <[email protected]> * Use prefix in all vtorc check and recover logs (vitessio#17526) Signed-off-by: Eduardo J. Ortega U. <[email protected]> --------- Signed-off-by: Dirkjan Bussink <[email protected]> Signed-off-by: Tim Vaillancourt <[email protected]> Signed-off-by: Manan Gupta <[email protected]> Signed-off-by: Eduardo J. Ortega U. <[email protected]> Co-authored-by: Dirkjan Bussink <[email protected]> Co-authored-by: Matt Lord <[email protected]> Co-authored-by: Manan Gupta <[email protected]> Co-authored-by: Eduardo J. Ortega U. <[email protected]>
Description
This PR changes VTOrc tablet discovery to fetch all tablets once and filter them client side
Today, when
--clusters_to_watch
is set, VTOrc will callrefreshTabletsInKeyspaceShard(...)
in a goroutine for every shard when refreshing tablets at the interval--topo-information-refresh-duration
Unfortunately calling
refreshTabletsInKeyspaceShard(...)
is very inefficient because behind the scenes it is just loading all tablets from all cells because fetching just 1-shard or keyspace from the topo store isn't possible without fetching "all" 😱This means that today a single refresh of all tablets causes
shards x # tablets
to be read instead of just# tablets
, and much of the data read is just duplicated. Instead this PR causes only# of tablets
to be fetched each tickRelated Issue(s)
Tracking issue: #17330
Checklist
Deployment Notes