Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimizer should recognize risk in cardinality estimation #59333

Open
terry1purcell opened this issue Feb 7, 2025 · 0 comments
Open

Optimizer should recognize risk in cardinality estimation #59333

terry1purcell opened this issue Feb 7, 2025 · 0 comments
Assignees
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@terry1purcell
Copy link
Contributor

Enhancement

Combining multiple predicates may exploit single column statistics (NDVs, topN, buckets) and index NDVs - with the combination of selectivities using either an exponential backoff or independence assumption. Neither of the backoff or independence are supported by statistics. It would be beneficial for the optimizer to acknowledge when an estimate was made with such assumptions.

Using the following DDL and insert for test purposes:

CREATE TABLE t2 (
a INT PRIMARY KEY,
b int,
c int,
d int,
e int,
Key (b, c, a),
key (e, d, a)
);

set @@cte_max_recursion_depth=10000000;
INSERT INTO t2 (a, b, c, d, e)
SELECT a, mod(a, 1000) AS b, mod(a, 1000) AS c, mod(a, 10000) as d, mod(a, 2) as e
FROM (
WITH RECURSIVE x AS (
SELECT 1 AS a
UNION ALL
SELECT a + 1 AS a
FROM x
WHERE a < 1000000
)
SELECT a
FROM x
) AS subquery;
ANALYZE TABLE t2;

Query examples

tidb> explain select * from t2 where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_11 | 0.00 | root | | |
| ├─IndexRangeScan_8(Build) | 31.45 | cop[tikv] | table:t2, index:b(b, c, a) | range:[0 0,0 0], keep order:false |
| └─Selection_10(Probe) | 0.00 | cop[tikv] | | eq(test.t2.d, 0), eq(test.t2.e, 0) |
| └─TableRowIDScan_9 | 31.45 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
4 rows in set (0.01 sec)

tidb> select count() from t2 where b = 0 and c = 0;
+----------+
| count() |
+----------+
| 1000 |
+----------+
1 row in set (0.01 sec)

tidb> select count() from t2 where d = 0 and e = 0;
+----------+
| count() |
+----------+
| 100 |
+----------+
tidb> explain select * from t2 use index (e) where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_8 | 0.00 | root | | |
| ├─IndexRangeScan_5(Build) | 70.48 | cop[tikv] | table:t2, index:e(e, d, a) | range:[0 0,0 0], keep order:false |
| └─Selection_7(Probe) | 0.00 | cop[tikv] | | eq(test.t2.b, 0), eq(test.t2.c, 0) |
| └─TableRowIDScan_6 | 70.48 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+

@terry1purcell terry1purcell added the type/enhancement The issue or PR belongs to an enhancement. label Feb 7, 2025
@terry1purcell terry1purcell self-assigned this Feb 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
Development

No branches or pull requests

1 participant