You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Combining multiple predicates may exploit single column statistics (NDVs, topN, buckets) and index NDVs - with the combination of selectivities using either an exponential backoff or independence assumption. Neither of the backoff or independence are supported by statistics. It would be beneficial for the optimizer to acknowledge when an estimate was made with such assumptions.
Using the following DDL and insert for test purposes:
CREATE TABLE t2 (
a INT PRIMARY KEY,
b int,
c int,
d int,
e int,
Key (b, c, a),
key (e, d, a)
);
set @@cte_max_recursion_depth=10000000;
INSERT INTO t2 (a, b, c, d, e)
SELECT a, mod(a, 1000) AS b, mod(a, 1000) AS c, mod(a, 10000) as d, mod(a, 2) as e
FROM (
WITH RECURSIVE x AS (
SELECT 1 AS a
UNION ALL
SELECT a + 1 AS a
FROM x
WHERE a < 1000000
)
SELECT a
FROM x
) AS subquery;
ANALYZE TABLE t2;
Query examples
tidb> explain select * from t2 where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_11 | 0.00 | root | | |
| ├─IndexRangeScan_8(Build) | 31.45 | cop[tikv] | table:t2, index:b(b, c, a) | range:[0 0,0 0], keep order:false |
| └─Selection_10(Probe) | 0.00 | cop[tikv] | | eq(test.t2.d, 0), eq(test.t2.e, 0) |
| └─TableRowIDScan_9 | 31.45 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
4 rows in set (0.01 sec)
tidb> select count() from t2 where b = 0 and c = 0;
+----------+
| count() |
+----------+
| 1000 |
+----------+
1 row in set (0.01 sec)
tidb> select count() from t2 where d = 0 and e = 0;
+----------+
| count() |
+----------+
| 100 |
+----------+
tidb> explain select * from t2 use index (e) where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_8 | 0.00 | root | | |
| ├─IndexRangeScan_5(Build) | 70.48 | cop[tikv] | table:t2, index:e(e, d, a) | range:[0 0,0 0], keep order:false |
| └─Selection_7(Probe) | 0.00 | cop[tikv] | | eq(test.t2.b, 0), eq(test.t2.c, 0) |
| └─TableRowIDScan_6 | 70.48 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
The text was updated successfully, but these errors were encountered:
Enhancement
Combining multiple predicates may exploit single column statistics (NDVs, topN, buckets) and index NDVs - with the combination of selectivities using either an exponential backoff or independence assumption. Neither of the backoff or independence are supported by statistics. It would be beneficial for the optimizer to acknowledge when an estimate was made with such assumptions.
Using the following DDL and insert for test purposes:
CREATE TABLE t2 (
a INT PRIMARY KEY,
b int,
c int,
d int,
e int,
Key (b, c, a),
key (e, d, a)
);
set @@cte_max_recursion_depth=10000000;
INSERT INTO t2 (a, b, c, d, e)
SELECT a, mod(a, 1000) AS b, mod(a, 1000) AS c, mod(a, 10000) as d, mod(a, 2) as e
FROM (
WITH RECURSIVE x AS (
SELECT 1 AS a
UNION ALL
SELECT a + 1 AS a
FROM x
WHERE a < 1000000
)
SELECT a
FROM x
) AS subquery;
ANALYZE TABLE t2;
Query examples
tidb> explain select * from t2 where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_11 | 0.00 | root | | |
| ├─IndexRangeScan_8(Build) | 31.45 | cop[tikv] | table:t2, index:b(b, c, a) | range:[0 0,0 0], keep order:false |
| └─Selection_10(Probe) | 0.00 | cop[tikv] | | eq(test.t2.d, 0), eq(test.t2.e, 0) |
| └─TableRowIDScan_9 | 31.45 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
4 rows in set (0.01 sec)
tidb> select count() from t2 where b = 0 and c = 0;
+----------+
| count() |
+----------+
| 1000 |
+----------+
1 row in set (0.01 sec)
tidb> select count() from t2 where d = 0 and e = 0;
+----------+
| count() |
+----------+
| 100 |
+----------+
tidb> explain select * from t2 use index (e) where b = 0 and c = 0 and d = 0 and e = 0;
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| id | estRows | task | access object | operator info |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
| IndexLookUp_8 | 0.00 | root | | |
| ├─IndexRangeScan_5(Build) | 70.48 | cop[tikv] | table:t2, index:e(e, d, a) | range:[0 0,0 0], keep order:false |
| └─Selection_7(Probe) | 0.00 | cop[tikv] | | eq(test.t2.b, 0), eq(test.t2.c, 0) |
| └─TableRowIDScan_6 | 70.48 | cop[tikv] | table:t2 | keep order:false |
+-------------------------------+---------+-----------+----------------------------+------------------------------------+
The text was updated successfully, but these errors were encountered: