-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VDiff/OnlineDDL: add support for running VDiffs for OnlineDDL migrations #15546
VDiff/OnlineDDL: add support for running VDiffs for OnlineDDL migrations #15546
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #15546 +/- ##
==========================================
- Coverage 68.38% 68.37% -0.02%
==========================================
Files 1556 1556
Lines 195028 195051 +23
==========================================
- Hits 133374 133370 -4
- Misses 61654 61681 +27 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The main test TestOnlineDDLVDiff
is very clear and looks correct. There is a manual step where you decide that "it's now a good time to test":
waitForAdditionalRows(t, keyspace, "temp", 200)
want := &expectedVDiff2Result{
state: "completed",
minimumRowsCompared: 200,
hasMismatch: false,
shards: []string{"0"},
}
I wonder what a user would actually use in production. I want to suggest a more automated approach for production, which is to wait until ready_to_complete
column is 1
.
In fact, I suggest adding that to TestOnlineDDLVDiff
right after
waitForAdditionalRows(t, keyspace, "temp", 200)
Or am I misunderstanding and VDiff should start running as soon as the workflow does? In which case we can stop VDiff no earlier than ready_to_complete=1
(it should keep running while ready_to_complete=0
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! I only had minor nits that you can ignore or address as you prefer.
Side note, I think that adding this info somewhere on the VDiff reference page would be nice:
|
Signed-off-by: Rohit Nayak <[email protected]>
Signed-off-by: Rohit Nayak <[email protected]>
… results from Starting a Stopped workflow, for example Signed-off-by: Rohit Nayak <[email protected]>
8a0ad34
to
a08bb2f
Compare
…efore doing a vdiff on it Signed-off-by: Rohit Nayak <[email protected]>
Thanks for the suggestion: I added this check. |
The broader question is whether this should also be the behavior in production. |
You mean as an option as part of an Online DDL migration, or, when the user invokes VDiff separately for an ongoing Online DDL migration? Re: VDiff: users usually call VDiff at any time, mostly close to switching I guess, but also one or more times after the copy phase is done, while they let the workflow run for a period of time, to get confidence before they are ready to switch. |
That's the one I meant. Thanks for clarifying! |
Description
Currently VDiff errors out if they are run for a vreplication workflow initiated by OnlineDDL. This PR adds support for that.
Caveat
Note that only OnlineDDLs created as
vitess
with the--postpone-completions
option qualify to be validated by a VDiff. This is because the VReplication workflow needs to be in a Running state. VDiff needs to synchronize source and target streams so that it can expect to find identical snapshots on the source and the target. For this it will Stop/Start workflows.If a VDiff is already complete then the vreplication workflow is not expected to restart since the cutover has already happened.
Related Issue(s)
Checklist