change AndExpr to contain arbitrary many predicates #16671

systay · 2024-08-28T05:59:27Z

Description

During planning, we often want predicates as a slice rather than in a tree structure. This PR changes AndExpr to contain a slice of inner predicates instead of Left/Right fields.
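To make the shape change concrete, here is a minimal sketch of the two representations and of flattening one into the other. Everything in it is hypothetical: `Expr`, `Col`, `BinaryAnd`, `NaryAnd`, the `Predicates` field name and `Flatten` are stand-ins invented for illustration, not the actual sqlparser types or the code in this PR.

```go
package main

import (
	"fmt"
	"strings"
)

// Expr is a stand-in for a SQL expression node (hypothetical, not the Vitess interface).
type Expr interface{ String() string }

// Col is a leaf predicate, kept as plain text for brevity.
type Col string

func (c Col) String() string { return string(c) }

// BinaryAnd mirrors the old shape: exactly two children,
// so N predicates nest N-1 levels deep.
type BinaryAnd struct{ Left, Right Expr }

func (a *BinaryAnd) String() string {
	return a.Left.String() + " AND " + a.Right.String()
}

// NaryAnd mirrors the new shape: one node holding every conjunct.
// The field name Predicates is an assumption made for this sketch.
type NaryAnd struct{ Predicates []Expr }

func (a *NaryAnd) String() string {
	parts := make([]string, len(a.Predicates))
	for i, p := range a.Predicates {
		parts[i] = p.String()
	}
	return strings.Join(parts, " AND ")
}

// Flatten walks a binary AND tree left to right and collects its conjuncts.
func Flatten(e Expr) []Expr {
	if and, ok := e.(*BinaryAnd); ok {
		return append(Flatten(and.Left), Flatten(and.Right)...)
	}
	return []Expr{e}
}

func main() {
	// (a = 1 AND b = 2) AND c = 3, as the old tree would store it.
	tree := &BinaryAnd{
		Left:  &BinaryAnd{Left: Col("a = 1"), Right: Col("b = 2")},
		Right: Col("c = 3"),
	}
	flat := &NaryAnd{Predicates: Flatten(tree)}
	fmt.Println(len(flat.Predicates), flat.String())
}
```

With the slice form, planner code that wants every conjunct can range over one field instead of recursing through Left/Right each time.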

This PR also moves the check for constant values in predicates to the planning stage. There is no need to do these rewrites if the whole query will be sent to MySQL.
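The description does not list which constant rewrites were moved. As an illustration only, the sketch below assumes the simplest case: folding literal TRUE/FALSE conjuncts out of an AND list. `Pred` and `SimplifyAnd` are hypothetical names, not functions from this PR.

```go
package main

import "fmt"

// Pred is a hypothetical predicate: either a boolean literal or an opaque SQL fragment.
type Pred struct {
	Const bool   // true when the predicate is a literal TRUE/FALSE
	Value bool   // the literal value; only meaningful when Const is set
	SQL   string // text of a non-constant predicate
}

// SimplifyAnd drops constant-true conjuncts and collapses the whole list
// to a single constant false as soon as any conjunct is constant false.
func SimplifyAnd(preds []Pred) []Pred {
	out := make([]Pred, 0, len(preds))
	for _, p := range preds {
		switch {
		case p.Const && !p.Value:
			// x AND FALSE is always FALSE, so nothing else matters.
			return []Pred{{Const: true, Value: false}}
		case p.Const && p.Value:
			// x AND TRUE is x, so skip the literal.
			continue
		default:
			out = append(out, p)
		}
	}
	return out
}

func main() {
	preds := []Pred{
		{SQL: "a = 1"},
		{Const: true, Value: true},
		{SQL: "b = 2"},
	}
	for _, p := range SimplifyAnd(preds) {
		fmt.Println(p.SQL)
	}
}
```

Running a pass like this only once the planner knows it has to split the query avoids the work for queries that are passed through to MySQL unchanged, which is the motivation stated above.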

Summary of Performance Changes:

The changes in this PR produce mixed results across the benchmarks:

Plan Time (sec/op):

Overall plan time increased slightly, with a geometric mean change of +0.48%.
The OLTP/Gen4 series regressed notably, with increases ranging from +18.33% to +23.20%.
Conversely, the TPCC/Gen4 series improved, with plan times decreasing by up to 6.52%.
Most Planner benchmarks got slower, with the largest increase of +8.41% in Planner/from_cases.json-gen4-20. The exceptions are the two Planner/filter_cases.json benchmarks, which improved by about 14.6% to 14.8%, and SemAnalysis, which improved by 52.89%.

Memory Usage (B/op):

Overall memory usage decreased slightly, with a geometric mean reduction of 2.18%.
Significant reductions were observed in Planner/filter_cases.json-gen4 and Planner/filter_cases.json-gen4left2right, both decreasing by over 20%.
However, memory usage increased slightly in the OLTP/Gen4 series, by approximately 3.6%.

Memory Allocations (allocs/op):

The number of memory allocations decreased slightly overall, with a geometric mean reduction of 0.65%.
The largest increase in allocations occurred in the SelectVsDML/DML_(random_sample,_N=32) benchmark (+11.34%).
However, the SelectVsDML/Select_(random_sample,_N=32) benchmark saw a significant reduction in allocations of 17.33%.

These results show both improvements and regressions across query plans and scenarios: the change helps filter-heavy plans and semantic analysis, and slightly degrades most other planning benchmarks.

Details:

                                            │  ../before  │                after                 │
                                            │   sec/op    │    sec/op     vs base                │
OLTP/Gen4-20                                  221.3µ ± 1%    261.9µ ± 1%  +18.33% (p=0.000 n=10)
OLTP/Gen4Greedy-20                            220.9µ ± 0%    270.1µ ± 3%  +22.26% (p=0.000 n=10)
OLTP/Gen4Left2Right-20                        220.1µ ± 2%    271.2µ ± 1%  +23.20% (p=0.000 n=10)
TPCC/Gen4-20                                  2.259m ± 1%    2.189m ± 1%   -3.12% (p=0.000 n=10)
TPCC/Gen4Greedy-20                            2.255m ± 0%    2.170m ± 2%   -3.77% (p=0.000 n=10)
TPCC/Gen4Left2Right-20                        2.225m ± 0%    2.080m ± 1%   -6.52% (p=0.000 n=10)
TPCH/Gen4-20                                  11.16m ± 1%    11.18m ± 1%        ~ (p=0.143 n=10)
TPCH/Gen4Greedy-20                            11.17m ± 0%    11.27m ± 1%        ~ (p=0.353 n=10)
TPCH/Gen4Left2Right-20                        9.988m ± 0%   10.022m ± 1%        ~ (p=0.739 n=10)
Planner/from_cases.json-gen4-20               7.336m ± 1%    7.953m ± 2%   +8.41% (p=0.000 n=10)
Planner/from_cases.json-gen4left2right-20     6.876m ± 1%    7.376m ± 2%   +7.26% (p=0.000 n=10)
Planner/filter_cases.json-gen4-20             181.8m ± 0%    155.2m ± 1%  -14.61% (p=0.000 n=10)
Planner/filter_cases.json-gen4left2right-20   181.9m ± 0%    155.0m ± 1%  -14.82% (p=0.000 n=10)
Planner/large_cases.json-gen4-20              484.6µ ± 1%    500.1µ ± 1%   +3.20% (p=0.000 n=10)
Planner/large_cases.json-gen4left2right-20    301.2µ ± 1%    306.7µ ± 1%   +1.80% (p=0.002 n=10)
Planner/aggr_cases.json-gen4-20               13.04m ± 0%    14.09m ± 1%   +8.13% (p=0.000 n=10)
Planner/aggr_cases.json-gen4left2right-20     12.46m ± 1%    13.42m ± 1%   +7.68% (p=0.000 n=10)
Planner/select_cases.json-gen4-20             10.39m ± 1%    11.17m ± 1%   +7.52% (p=0.000 n=10)
Planner/select_cases.json-gen4left2right-20   10.13m ± 2%    10.91m ± 1%   +7.66% (p=0.000 n=10)
Planner/union_cases.json-gen4-20              3.732m ± 6%    4.009m ± 1%   +7.42% (p=0.000 n=10)
Planner/union_cases.json-gen4left2right-20    3.710m ± 2%    3.990m ± 1%   +7.55% (p=0.000 n=10)
SemAnalysis-20                                62.98m ± 3%    29.67m ± 1%  -52.89% (p=0.000 n=10)
SelectVsDML/DML_(random_sample,_N=32)-20      946.7µ ± 2%   1093.3µ ± 1%  +15.49% (p=0.000 n=10)
SelectVsDML/Select_(random_sample,_N=32)-20   1.948m ± 5%    1.843m ± 1%   -5.40% (p=0.000 n=10)
geomean                                       4.220m         4.240m        +0.48%

                                            │   ../before   │                after                 │
                                            │     B/op      │     B/op      vs base                │
OLTP/Gen4-20                                   144.9Ki ± 0%   150.1Ki ± 0%   +3.59% (p=0.000 n=10)
OLTP/Gen4Greedy-20                             144.9Ki ± 0%   150.1Ki ± 0%   +3.58% (p=0.000 n=10)
OLTP/Gen4Left2Right-20                         144.9Ki ± 0%   150.1Ki ± 0%   +3.58% (p=0.000 n=10)
TPCC/Gen4-20                                   1.056Mi ± 0%   1.066Mi ± 0%   +0.96% (p=0.000 n=10)
TPCC/Gen4Greedy-20                             1.056Mi ± 0%   1.066Mi ± 0%   +0.96% (p=0.000 n=10)
TPCC/Gen4Left2Right-20                         1.045Mi ± 0%   1.055Mi ± 0%   +0.95% (p=0.000 n=10)
TPCH/Gen4-20                                   5.328Mi ± 0%   5.290Mi ± 0%   -0.70% (p=0.000 n=10)
TPCH/Gen4Greedy-20                             5.328Mi ± 0%   5.289Mi ± 0%   -0.72% (p=0.000 n=10)
TPCH/Gen4Left2Right-20                         4.806Mi ± 0%   4.775Mi ± 0%   -0.64% (p=0.000 n=10)
Planner/from_cases.json-gen4-20                4.503Mi ± 0%   4.521Mi ± 0%   +0.40% (p=0.000 n=10)
Planner/from_cases.json-gen4left2right-20      4.314Mi ± 0%   4.332Mi ± 0%   +0.43% (p=0.000 n=10)
Planner/filter_cases.json-gen4-20              22.39Mi ± 0%   17.75Mi ± 0%  -20.72% (p=0.000 n=10)
Planner/filter_cases.json-gen4left2right-20    22.31Mi ± 0%   17.68Mi ± 0%  -20.76% (p=0.000 n=10)
Planner/large_cases.json-gen4-20               268.9Ki ± 0%   268.6Ki ± 0%   -0.12% (p=0.000 n=10)
Planner/large_cases.json-gen4left2right-20     174.7Ki ± 0%   174.4Ki ± 0%   -0.21% (p=0.000 n=10)
Planner/aggr_cases.json-gen4-20                7.183Mi ± 0%   7.215Mi ± 0%   +0.44% (p=0.000 n=10)
Planner/aggr_cases.json-gen4left2right-20      7.001Mi ± 0%   7.033Mi ± 0%   +0.45% (p=0.000 n=10)
Planner/select_cases.json-gen4-20              6.063Mi ± 0%   6.114Mi ± 0%   +0.84% (p=0.000 n=10)
Planner/select_cases.json-gen4left2right-20    5.958Mi ± 0%   6.010Mi ± 0%   +0.86% (p=0.000 n=10)
Planner/union_cases.json-gen4-20               2.215Mi ± 0%   2.230Mi ± 0%   +0.68% (p=0.000 n=10)
Planner/union_cases.json-gen4left2right-20     2.211Mi ± 0%   2.225Mi ± 0%   +0.66% (p=0.000 n=10)
SelectVsDML/DML_(random_sample,_N=32)-20       672.3Ki ± 0%   653.8Ki ± 0%   -2.75% (p=0.000 n=10)
SelectVsDML/Select_(random_sample,_N=32)-20   1155.5Ki ± 0%   973.9Ki ± 0%  -15.72% (p=0.000 n=10)
geomean                                        1.844Mi        1.804Mi        -2.18%

                                            │  ../before  │                after                │
                                            │  allocs/op  │  allocs/op   vs base                │
OLTP/Gen4-20                                  3.612k ± 0%   3.712k ± 0%   +2.77% (p=0.000 n=10)
OLTP/Gen4Greedy-20                            3.612k ± 0%   3.712k ± 0%   +2.77% (p=0.000 n=10)
OLTP/Gen4Left2Right-20                        3.612k ± 0%   3.712k ± 0%   +2.77% (p=0.000 n=10)
TPCC/Gen4-20                                  25.70k ± 0%   26.56k ± 0%   +3.36% (p=0.000 n=10)
TPCC/Gen4Greedy-20                            25.70k ± 0%   26.56k ± 0%   +3.36% (p=0.000 n=10)
TPCC/Gen4Left2Right-20                        25.45k ± 0%   26.31k ± 0%   +3.39% (p=0.000 n=10)
TPCH/Gen4-20                                  134.0k ± 0%   134.7k ± 0%   +0.55% (p=0.000 n=10)
TPCH/Gen4Greedy-20                            134.0k ± 0%   134.7k ± 0%   +0.55% (p=0.000 n=10)
TPCH/Gen4Left2Right-20                        124.1k ± 0%   124.7k ± 0%   +0.52% (p=0.000 n=10)
Planner/from_cases.json-gen4-20               106.7k ± 0%   107.5k ± 0%   +0.71% (p=0.000 n=10)
Planner/from_cases.json-gen4left2right-20     101.5k ± 0%   102.2k ± 0%   +0.75% (p=0.000 n=10)
Planner/filter_cases.json-gen4-20             562.7k ± 0%   476.1k ± 0%  -15.40% (p=0.000 n=10)
Planner/filter_cases.json-gen4left2right-20   560.7k ± 0%   474.1k ± 0%  -15.45% (p=0.000 n=10)
Planner/large_cases.json-gen4-20              10.52k ± 0%   10.54k ± 0%   +0.17% (p=0.000 n=10)
Planner/large_cases.json-gen4left2right-20    6.206k ± 0%   6.224k ± 0%   +0.29% (p=0.000 n=10)
Planner/aggr_cases.json-gen4-20               164.8k ± 0%   165.6k ± 0%   +0.52% (p=0.000 n=10)
Planner/aggr_cases.json-gen4left2right-20     160.2k ± 0%   161.1k ± 0%   +0.54% (p=0.000 n=10)
Planner/select_cases.json-gen4-20             140.4k ± 0%   142.1k ± 0%   +1.20% (p=0.000 n=10)
Planner/select_cases.json-gen4left2right-20   137.4k ± 0%   139.1k ± 0%   +1.22% (p=0.000 n=10)
Planner/union_cases.json-gen4-20              52.06k ± 0%   52.50k ± 0%   +0.85% (p=0.000 n=10)
Planner/union_cases.json-gen4left2right-20    51.89k ± 0%   52.33k ± 0%   +0.85% (p=0.000 n=10)
SelectVsDML/DML_(random_sample,_N=32)-20      12.55k ± 0%   13.97k ± 0%  +11.34% (p=0.000 n=10)
SelectVsDML/Select_(random_sample,_N=32)-20   26.53k ± 0%   21.93k ± 0%  -17.33% (p=0.000 n=10)
geomean                                       46.06k        45.76k        -0.65%
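For readers checking the `vs base` column, the delta is the relative change from the before value to the after value. The sketch below recomputes it for two rows of the sec/op table; `PctChange` is a helper written for this note, not part of any benchmarking tool.

```go
package main

import "fmt"

// PctChange returns the relative change from before to after, in percent.
// Negative values mean the after measurement is smaller (faster, or fewer bytes/allocs).
func PctChange(before, after float64) float64 {
	return (after - before) / before * 100
}

func main() {
	// TPCC/Gen4Left2Right-20: 2.225m -> 2.080m sec/op
	fmt.Printf("%+.2f%%\n", PctChange(2.225, 2.080))
	// SemAnalysis-20: 62.98m -> 29.67m sec/op
	fmt.Printf("%+.2f%%\n", PctChange(62.98, 29.67))
}
```

Recomputing from the printed values can differ from the table in the last digit, presumably because the tool works from unrounded measurements: OLTP/Gen4-20 comes out at about +18.35% this way, while the table reports +18.33%.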

Related Issue(s)

Checklist

"Backport to:" labels have been added if this change should be back-ported to release branches
If this change is to be back-ported to previous releases, a justification is included in the PR description
Tests were added or are not required
Did the new or modified tests pass consistently locally and on CI?
Documentation was added or is not required

Deployment Notes

Signed-off-by: Andres Taylor <[email protected]>

vitess-bot · 2024-08-28T05:59:30Z

Signed-off-by: Andres Taylor <[email protected]>

GuptaManan100

I like it!


codecov · 2024-08-29T12:07:37Z

Codecov Report

Attention: Patch coverage is 90.46243% with 33 lines in your changes missing coverage. Please review.

Project coverage is 68.94%. Comparing base (773a216) to head (31b8f71).
Report is 14 commits behind head on main.

Files with missing lines	Patch %	Lines
go/vt/vtctl/workflow/vexec/query_planner.go	46.15%	7 Missing ⚠️
go/vt/sqlparser/ast_funcs.go	92.00%	6 Missing ⚠️
go/vt/sqlparser/predicate_rewriting.go	96.26%	5 Missing ⚠️
go/vt/vtgate/planbuilder/operators/update.go	70.58%	5 Missing ⚠️
go/vt/vtgate/simplifier/expression_simplifier.go	0.00%	4 Missing ⚠️
go/vt/vtgate/planbuilder/operators/ast_to_op.go	95.12%	2 Missing ⚠️
go/vt/sqlparser/random_expr.go	83.33%	1 Missing ⚠️
go/vt/vtctl/workflow/materializer.go	50.00%	1 Missing ⚠️
go/vt/vtctl/workflow/traffic_switcher.go	0.00%	1 Missing ⚠️
go/vt/vtctl/workflow/vexec/query_plan.go	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main   #16671      +/-   ##
==========================================
+ Coverage   68.92%   68.94%   +0.01%     
==========================================
  Files        1562     1564       +2     
  Lines      200941   201360     +419     
==========================================
+ Hits       138497   138825     +328     
- Misses      62444    62535      +91

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

vitess-bot · 2024-08-29T12:09:33Z

Hello! 👋

This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked.

You can find the performance comparison on the arewefastyet website.

frouioui

This looks good to me. We also have a bunch of queries with AND (with or without ORs) in our vitess-tester tests, nice.

frouioui · 2024-08-29T17:05:19Z

This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked.

I would not necessarily expect anything significant out of the benchmarks; query planning is only a fraction of the overall performance.

systay · 2024-09-02T12:46:03Z

The performance numbers don't look good enough to warrant merging this.

systay added 4 commits August 28, 2024 07:58

wip - change AndExpr to contain arbitrary many predicates

de1262e

Signed-off-by: Andres Taylor <[email protected]>

change semantics and planner to handle the new AndExpr

cf7430e

Signed-off-by: Andres Taylor <[email protected]>

last few uses of Left/Right

b0c5192

Signed-off-by: Andres Taylor <[email protected]>

refactoring

aa788bd

Signed-off-by: Andres Taylor <[email protected]>

github-actions bot added this to the v21.0.0 milestone Aug 28, 2024

codegen

f1fe39a

Signed-off-by: Andres Taylor <[email protected]>

GuptaManan100 reviewed Aug 28, 2024

View reviewed changes

systay added 4 commits August 28, 2024 08:49

move predicate simplification to later planner stages

bceda6f

Signed-off-by: Andres Taylor <[email protected]>

test: update expectations

5d1776c

Signed-off-by: Andres Taylor <[email protected]>

refactor: move OR simplifiying to later planner stage

07cbbe5

Signed-off-by: Andres Taylor <[email protected]>

refactor: minor code tweaks

fc653a1

Signed-off-by: Andres Taylor <[email protected]>

frouioui force-pushed the and-expr-predicates branch from 913f027 to 20f76ae Compare August 28, 2024 19:54

systay added 2 commits August 29, 2024 10:37

fix some of the faulty rewrites

d8c214b

Signed-off-by: Andres Taylor <[email protected]>

fixed remaining rewriter bugs

1de8b08

Signed-off-by: Andres Taylor <[email protected]>

systay force-pushed the and-expr-predicates branch from 20f76ae to 1de8b08 Compare August 29, 2024 09:52

minor cleanups

80815e7

Signed-off-by: Andres Taylor <[email protected]>

systay added Type: Internal Cleanup and removed Type: RFC Request For Comment labels Aug 29, 2024

dont stop too early when rewriting OR

8d2a401

Signed-off-by: Andres Taylor <[email protected]>

systay marked this pull request as ready for review August 29, 2024 11:38

systay requested review from deepthi, mattlord, rohit-nayak-ps, harshit-gangal, shlomi-noach, timvaillancourt, frouioui, arthurschreiber and ajm188 as code owners August 29, 2024 11:38

remove test not used any more

31b8f71

Signed-off-by: Andres Taylor <[email protected]>

harshit-gangal added the Benchmark me Add label to PR to run benchmarks label Aug 29, 2024

frouioui approved these changes Aug 29, 2024

View reviewed changes

systay closed this Sep 2, 2024
