-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize with IN Clause for UPDATE/DELETE Statements on Vindexes #15455
Conversation
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #15455 +/- ##
==========================================
+ Coverage 65.64% 65.76% +0.11%
==========================================
Files 1563 1561 -2
Lines 194389 194884 +495
==========================================
+ Hits 127602 128157 +555
+ Misses 66787 66727 -60 ☔ View full report in Codecov by Sentry. |
Signed-off-by: wangweicugw <[email protected]>
Signed-off-by: wangweicugw <[email protected]>
Hello! 👋 This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked. You can find the performance comparison on the arewefastyet website. |
@wangweicugw Nice stuff! Did you check that we have relevant end-to-end tests for this? |
I haven't found relevant end-to-end tests for this; I will supplement them in go/test/endtoend/vtgate/queries/dml/dml_test.go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could not find any test that shows the bind variables are split across different shards for __vals
you should also validate this in a unit test that shows what bind variables were sent down. |
Signed-off-by: wangweicugw <[email protected]>
Signed-off-by: wangweicugw <[email protected]>
Signed-off-by: wangweicugw <[email protected]>
I have added some relevant test cases here. |
@systay @harshit-gangal |
This is an enhancement PR therefore it does not qualify for backport |
As this is an optimization PR, I think we should add benchmark showing if this reduces any memory allocations |
Alright, we are still using version 18.0. |
Signed-off-by: wangweicugw <[email protected]>
I have attempted to add some benchmark tests, could you help me review them? |
Description
When I run a SELECT statement using an IN clause with the Vindexes, the query is correctly split according to the corresponding values in the IN list. This ensures that each shard is queried separately, which is the expected behavior for optimized shard querying.
However, when I perform an UPDATE/DELETE statement with an IN clause on the Vindexes, the SQL is not rewritten in the same manner. Instead, the query is sent directly to the shard without being split based on the IN clause values.
I hope to have consistent behavior with SELECT when processing UPDATE/DELETE statements with an IN clause on the Vindexes.
I would like the following UPDATE statement:
to be rewritten into separate statements, one for each value in the IN clause, like so:
This would ensure that the UPDATE/DELETE operation is as efficient as possible, especially for large batch updates, by targeting only the relevant shards.
Related Issue(s)
Checklist