Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Abort on DDL in binlog stream #320

Closed
wants to merge 6 commits into from
Closed

Abort on DDL in binlog stream #320

wants to merge 6 commits into from

Conversation

insom
Copy link
Contributor

@insom insom commented Nov 25, 2021

related to #318

Co-authored-by: Pawan Dubey [email protected]

@insom insom requested a review from shuhaowu November 25, 2021 19:50
binlog_streamer.go Outdated Show resolved Hide resolved
insom and others added 2 commits December 2, 2021 14:37
Co-authored-by: Pawan Dubey <[email protected]>
Co-authored-by: Shiv Nagarajan <[email protected]>
Co-authored-by: Shiv Nagarajan <[email protected]>
insom and others added 4 commits December 3, 2021 13:50
@insom
Copy link
Contributor Author

insom commented Dec 3, 2021

Please report a bug if this causes problems.
Started with run options -n test_copy_data_with_alter_fails_part_way_through --seed 36007

TrivialIntegrationTests
  test_copy_data_with_alter_fails_part_way_through                PASS (2.02s)

Finished in 2.08223s
1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

The ruby tests are failing CI due to timeouts, but I don't think it's because of changes we've introduced.

We've moved the integration test from the deprecated Go section to being written in Ruby. This causes the early termination of Ghostferry, which lets the test pass.

We've also rewritten things to use a smaller SQL parser (although it is still not a trivial dependency). Shiv and I have looked at the SQL parser landscape and a correct parser is simply going to be a large dependency (vs. doing something with regex which I strongly dislike).

In the future, now that we have a parser, I could see this being used to check if a DDL is happening to a table which is in the allow list / deny list, for example.

Alternatively, if we don't want to have this dependency at all, then I think action that's taken on QueryEvent could take a callback, and then other implementers who are using Ghostferry as a library could choose to do the parsing in their own code base?

@insom insom marked this pull request as ready for review December 3, 2021 19:26
@shuhaowu
Copy link
Contributor

shuhaowu commented Dec 3, 2021

I'm okay with this dependency for now. I would make the DDL event a callback instead of erroring right here. As you mentioned, we can pass the event based on the filters passed to Ghostferry as well, but we can do that in a follow up PR if you would like.

Copy link
Contributor

@shuhaowu shuhaowu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look pretty good. Some minor feedback on the tests and the code. We can merge after they're addressed. if you want, in a follow up, we can turn this into callbacks.

@@ -230,6 +231,22 @@ func (s *BinlogStreamer) Run() {
// the full query that was executed on the master (with annotations)
// that is otherwise not possible to reconstruct
query = ev.Event.(*replication.RowsQueryEvent).Query
case *replication.QueryEvent:
query = ev.Event.(*replication.QueryEvent).Query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a switch case structure with a type, so I think you can just do e.Query here?

@@ -230,6 +231,22 @@ func (s *BinlogStreamer) Run() {
// the full query that was executed on the master (with annotations)
// that is otherwise not possible to reconstruct
query = ev.Event.(*replication.RowsQueryEvent).Query
case *replication.QueryEvent:
query = ev.Event.(*replication.QueryEvent).Query
if string(query) == "BEGIN" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you leave a comment on why this is ignored here? It's not super obvious to me.

if string(query) == "BEGIN" {
break
}
statement, err := sqlparser.Parse(string(query))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: I see string(query) twice here. Can we modify the above line to simply go: query := string(e.Query)?

ghostferry = new_ghostferry(MINIMAL_GHOSTFERRY)

ghostferry.on_status(GhostferryHelper::Ghostferry::Status::BINLOG_STREAMING_STARTED) do
datawriter.start
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The datawriter concept is mostly for when you want a background thread hitting MySQL in the test.

In this case, I think you can just go directly to:

Suggested change
datawriter.start
source_db.query(....)
source_db.query(....)

See types_test for more examples of this. This should save you from using another class.

source_count = source[DEFAULT_FULL_TABLE_NAME][:row_count]
target_count = target[DEFAULT_FULL_TABLE_NAME][:row_count]

refute_equal(source_count, target_count, "target should have fewer rows than source")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ghostferry.stderr should also give you the log messages, if you want to match the log message here. I just realized that nothing actually uses it so maybe what you have is fine 🤷

@insom insom closed this Aug 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants