[Bug]: Incorrect index choice based on filtering criteria #4901
Would be good to have:
We can see that the Postgres planner thinks only one row is going to match the second query; let's check how many actually match by looking at EXPLAIN ANALYZE. If this estimate is off, maybe running ANALYZE on the table will help? The |
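A minimal sketch of that check, assuming the table name and filter values that appear later in this thread:
-- Refresh planner statistics for the table
ANALYZE variable.signal_data;

-- Compare the planner's row estimate with the actual row count
EXPLAIN ANALYZE
SELECT *
FROM variable.signal_data
WHERE tag_id IN (6846, 6899)
  AND "timestamp" > '2022-10-27 00:00:00'
  AND "timestamp" <= '2022-10-27 23:59:00';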
Hi @akuzm, table and index definitions below:
CREATE TABLE variable.signal_data (
tag_id int4 NOT NULL,
"timestamp" timestamp NOT NULL,
value float8 NULL,
updated_at timestamp NULL,
PRIMARY KEY (tag_id, "timestamp"),
FOREIGN KEY (tag_id) REFERENCES variable.signal_tag(id) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE UNIQUE INDEX ON variable.signal_data USING btree (tag_id, "timestamp" DESC);
CREATE INDEX ON variable.signal_data USING btree ("timestamp" DESC);
Unfortunately, I can't run the full However, I tried running |
Can you run the two following queries? This will help us check how many rows there actually are, because the planner thinks there's only one matching row for some reason.
We're looking here at the general postgres planner behavior, not specific to TimescaleDB. If ANALYZE doesn't help, and the estimate is badly off, we'll have to look at the actual data to debug this. By the way, if dropping indexes works for you, there's also the |
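The exact queries asked for here are not preserved in the thread; a sketch of the kind of check meant (actual row count versus what the planner currently believes about the chunk):
-- Actual number of matching rows, to compare against the estimate
SELECT count(*)
FROM variable.signal_data
WHERE tag_id IN (6846, 6899)
  AND "timestamp" > '2022-10-27 00:00:00'
  AND "timestamp" <= '2022-10-27 23:59:00';

-- Row count and page count the planner has recorded for the chunk
SELECT relname, reltuples, relpages
FROM pg_class
WHERE relname = '_hyper_2_200_chunk';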
As requested:
Querying the hypertable:
Sort (cost=1322.05..1324.89 rows=1138 width=28) (actual time=0.114..0.130 rows=287 loops=1)
Sort Key: sd."timestamp"
Sort Method: quicksort Memory: 47kB
-> Index Scan using "200_400_pk_signal_data" on _hyper_2_200_chunk sd (cost=0.57..1264.28 rows=1138 width=28) (actual time=0.022..0.087 rows=287 loops=1)
Index Cond: ((tag_id = ANY ('{6846,6899}'::integer[])) AND ("timestamp" > '2022-10-27 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2022-10-27 23:59:00'::timestamp without time zone))
Planning Time: 0.306 ms
Execution Time: 0.153 ms
Querying the chunk directly:
Sort (cost=1322.05..1324.89 rows=1138 width=28) (actual time=0.115..0.131 rows=287 loops=1)
Sort Key: "timestamp"
Sort Method: quicksort Memory: 47kB
-> Index Scan using "200_400_pk_signal_data" on _hyper_2_200_chunk sd (cost=0.57..1264.28 rows=1138 width=28) (actual time=0.021..0.087 rows=287 loops=1)
Index Cond: ((tag_id = ANY ('{6846,6899}'::integer[])) AND ("timestamp" > '2022-10-27 00:00:00'::timestamp without time zone) AND ("timestamp" <= '2022-10-27 23:59:00'::timestamp without time zone))
Planning Time: 0.273 ms
Execution Time: 0.156 ms
Thanks for the tip on the |
Also, some details about the data, which may or may not be helpful... |
There is one test that I performed with @davidkohn88 which I think is worth sharing, too. I copied the data in the chunk to a test table (a normal PostgreSQL table, not converted to a hypertable).
CREATE TABLE variable.signal_data_test (
tag_id int4 NOT NULL,
"timestamp" timestamp NOT NULL,
value float8 NULL,
updated_at timestamp NULL,
PRIMARY KEY (tag_id, "timestamp"),
FOREIGN KEY (tag_id) REFERENCES variable.signal_tag(id) ON DELETE CASCADE ON UPDATE CASCADE
);
CREATE UNIQUE INDEX ON variable.signal_data_test USING btree (tag_id, "timestamp" DESC);
CREATE INDEX ON variable.signal_data_test USING btree ("timestamp" DESC);
insert into variable.signal_data_test
select
*
from
_timescaledb_internal._hyper_2_236_chunk;
Then I queried with both where clauses. In both cases, the planner picked up the composite index. |
So, now the estimate is magically fixed 😄 It doesn't say
This means the issue does not reproduce, right? The composite index is the correct one we want it to choose. |
Yeah, it is fixed, but I can't tell how it got fixed. I am pretty sure that I have run Unfortunately, I can't reproduce the issue anymore, as I deleted the index to get things working asap since it happened in our production environment. In the current state, I think I can't provide more information to maybe find a bug or just make sure I had a bad estimate in the table, so feel free to close this issue as you see fit. If I find any similar problem, I will try to collect more evidence and open a new bug report.
Hmm, actually, no. I meant that the issue could be reproduced in both chunks, i.e., the problem persisted (until I deleted the index, after which everything was solved). |
Yes, to be clear, I asked him to do this because I've seen issues with this in the past; it seems to pop up intermittently, and I'd never thought to do the test where we copy the data from a chunk into a normal Postgres table with the same schema and run the same query. To summarize what he did as I understand it: he was able to reproduce (a) the bad behavior on the chunk he copied over when querying the hypertable, and (b) the good behavior on the Postgres table with all of the same indexes etc. He can't reproduce the behavior now, because he dropped the time-only index in order to skirt the problem and solve it quickly on the production table. @gustavo-delfosim I think you might be able to reproduce the behavior if you create a hypertable with both indexes and copy the data from the normal Postgres table (the one chunk you already have) over to it, as in the sketch below. Then we could run some more tests for @akuzm? But it might be harder to reproduce if there's only one chunk, not sure. |
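A sketch of that reproduction setup, reusing the plain-Postgres copy created earlier in the thread; the hypertable name signal_data_repro is hypothetical:
-- New hypertable with the same layout and both indexes
CREATE TABLE variable.signal_data_repro (
    tag_id int4 NOT NULL,
    "timestamp" timestamp NOT NULL,
    value float8 NULL,
    updated_at timestamp NULL,
    PRIMARY KEY (tag_id, "timestamp")
);
SELECT create_hypertable('variable.signal_data_repro', 'timestamp');
CREATE UNIQUE INDEX ON variable.signal_data_repro USING btree (tag_id, "timestamp" DESC);
CREATE INDEX ON variable.signal_data_repro USING btree ("timestamp" DESC);

-- Copy the chunk's data back in from the plain Postgres test table
INSERT INTO variable.signal_data_repro
SELECT * FROM variable.signal_data_test;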
@davidkohn88 Yeah, thanks for summarizing, that's exactly what happened. I can try to reproduce it again with a new table and by copying more data into it, but I will need to find some time to look into it. I will report back if I am able to reproduce it. |
What we can check next time it reproduces, to help investigate the problem:
|
Dear Author, This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days. Thank you! |
Dear Author, We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it. Thank you! |
I am currently struggling with a very similar situation. In my case, too, the time-only index is being used instead of the composite index on entity and time. The query plans show that using the "right" index is about three times faster. The first query plan uses the wrong, time-only index. After disabling this index, the second query uses the right composite index and is significantly faster.
Basically none of my queries will use the composite index and all of them have the |
Hello @sithmein, Thanks for providing all this information. As you said, the query runs faster when the composite index is used (actual execution time: 286.897 ms (composite index) vs. 838.597 ms (normal index)). However, during query planning, PostgreSQL expects a slightly higher execution time when the composite index is used (23424.05 on What happens when you run the query directly against the chunk |
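For reference, querying a chunk directly just means pointing the same statement at the internal chunk table; a sketch using the schema from earlier in this thread (chunk name and filter values are illustrative, substitute your own):
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM _timescaledb_internal._hyper_2_200_chunk   -- substitute the chunk under test
WHERE tag_id IN (6846, 6899)
  AND "timestamp" > '2022-10-27 00:00:00'
  AND "timestamp" <= '2022-10-27 23:59:00';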
I also tried on the latest chunk with the same results. |
Hi @sithmein, Thanks for getting back to us. It would be great if you could repeat this and send us the output of the |
@sithmein can you shed some light on how you ended up fixing the issue? It seems like this isn't close to being resolved. We are experiencing the exact same issue in production and would love to resolve this ASAP. Which index did you end up dropping/re-creating? The only difference is that we're using a distributed hypertable. |
Hi, we are facing the exact same issue: on some chunks of the hypertable the wrong (single-column) index is used even though a multicolumn index exists. I am testing this with a simple select query; note that this table is huge:
explain analyze select * from competitor_results where competitor_id = 106302 and created_at between '2023-04-01' and '2023-06-01';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Append (cost=0.00..1044934.73 rows=1170117 width=168) (actual time=93.403..4657.537 rows=1138932 loops=1)
-> Seq Scan on competitor_results (cost=0.00..0.00 rows=1 width=136) (actual time=0.003..0.003 rows=0 loops=1)
Filter: ((created_at >= '2023-04-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-06-01 00:00:00'::timestamp without time zone) AND (competitor_id = 106302))
-> Bitmap Heap Scan on _hyper_2_507_chunk (cost=10770.51..586978.35 rows=634019 width=168) (actual time=93.399..2504.064 rows=626350 loops=1)
Recheck Cond: ((competitor_id = 106302) AND (created_at >= '2023-04-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-06-01 00:00:00'::timestamp without time zone))
Rows Removed by Index Recheck: 14806542
Heap Blocks: exact=42539 lossy=208208
-> Bitmap Index Scan on _hyper_2_507_chunk_competitor_results_hyper_competitor_id_creat (cost=0.00..10612.00 rows=634019 width=0) (actual time=83.864..83.867 rows=626350 loops=1)
Index Cond: ((competitor_id = 106302) AND (created_at >= '2023-04-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-06-01 00:00:00'::timestamp without time zone))
-> Bitmap Heap Scan on _hyper_2_510_chunk (cost=5772.32..457956.38 rows=536097 width=169) (actual time=76.031..2083.065 rows=512582 loops=1)
Recheck Cond: (competitor_id = 106302)
Rows Removed by Index Recheck: 12325718
Filter: ((created_at >= '2023-04-01 00:00:00'::timestamp without time zone) AND (created_at <= '2023-06-01 00:00:00'::timestamp without time zone))
Heap Blocks: exact=47950 lossy=173593
-> Bitmap Index Scan on _hyper_2_510_chunk_competitor_results_hyper_competitor_id_idx (cost=0.00..5638.30 rows=536097 width=0) (actual time=64.878..64.879 rows=512582 loops=1)
Index Cond: (competitor_id = 106302)
Planning time: 13.877 ms
Execution time: 4699.353 ms
If you check above, the second chunk (_hyper_2_510_chunk) falls back to the single-column index. I checked the chunk definition:
\d+ _timescaledb_internal._hyper_2_510_chunk;
Table "_timescaledb_internal._hyper_2_510_chunk"
Column | Type | Collation | Nullable | Default | Storage | Stats target | Description
---------------------------+-----------------------------+-----------+----------+------------------------------------------------+----------+--------------+-------------
id | integer | | not null | nextval('competitor_results_id_seq'::regclass) | plain | |
position | integer | | | | plain | |
change | integer | | | | plain | |
search_keyword_url_id | integer | | not null | | plain | |
search_keyword_id | integer | | not null | | plain | |
competitor_id | integer | | not null | | plain | |
search_keyword_result_id | integer | | not null | | plain | |
created_at | timestamp without time zone | | | | plain | |
updated_at | timestamp without time zone | | | | plain | |
copied_from_id | integer | | | | plain | |
account_id | integer | | | | plain | |
result_url | text | | | | extended | |
position_type | character varying | | | | extended | |
position_organic | integer | | | | plain | |
position_local_pack | integer | | | | plain | |
position_places_image | integer | | | | plain | |
position_featured_snippet | integer | | | | plain | |
position_knowledge_panel | integer | | | | plain | |
Indexes:
"_hyper_2_510_chunk_competitor_results_hyper_competitor_id_creat" btree (competitor_id, created_at)
"_hyper_2_510_chunk_competitor_results_hyper_competitor_id_idx" btree (competitor_id)
"_hyper_2_510_chunk_competitor_results_hyper_created_at_idx" btree (created_at DESC)
"_hyper_2_510_chunk_competitor_results_hyper_id_idx" btree (id)
"_hyper_2_510_chunk_competitor_results_hyper_search_keyword_id_i" btree (search_keyword_id)
"_hyper_2_510_chunk_competitor_results_hyper_search_keyword_resu" btree (search_keyword_result_id)
"_hyper_2_510_chunk_competitor_results_hyper_search_keyword_ur_1" btree (search_keyword_url_id)
"_hyper_2_510_chunk_competitor_results_hyper_search_keyword_url_" btree (search_keyword_url_id, created_at DESC)
Check constraints:
"constraint_510" CHECK (created_at >= '2023-04-23 00:00:00'::timestamp without time zone AND created_at < '2023-05-23 00:00:00'::timestamp without time zone)
Inherits: competitor_results
Any help or suggestions on how to make the query planner pick the right index would be highly appreciated. |
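One way to confirm that the composite index would be chosen, without changing anything permanently, is to hide the single-column index inside a transaction and roll back. This is only a sketch: DROP INDEX takes an exclusive lock on the table, so on a busy production table it is better tried on a copy, and the hypertable index name below is inferred from the chunk index name above (verify it with \d competitor_results first).
BEGIN;
-- Hide the single-column index (name assumed, see lead-in)
DROP INDEX competitor_results_hyper_competitor_id_idx;
EXPLAIN ANALYZE
SELECT * FROM competitor_results
WHERE competitor_id = 106302
  AND created_at BETWEEN '2023-04-01' AND '2023-06-01';
-- Roll back so the index is kept and nothing changes
ROLLBACK;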
Hello @jkuri, Thanks for reporting this problem. According to the query plan, two chunks are part of the query and PostgreSQL uses the composite index Which version of TimescaleDB do you use? Does the query plan change if you perform an |
Hi @jnidzwetzki, thank you for the quick response. I am testing this on 1.7.5. I performed
I'd also like to ask, while I have the chance, about the performance issues on that hypertable: why is a simple select taking so long to return the result? If I pick a date range of 6 months, for example, the simple query takes about a minute to execute, and that is a common case.
table_size | index_size | toast_size | total_size
------------+------------+------------+------------
194 GB     | 335 GB     | 776 kB     | 530 GB
And each chunk is about |
Hello @jkuri, Version 1.7.5 of TimescaleDB is a very old version. I recommend upgrading to TimescaleDB 2.11 and checking the query plan and the execution time again. The new TimescaleDB version contains a lot of bug fixes and additional optimizations. The chunk size has to match your use case and the hardware you are using. On the following two pages of our documentation, you will find several tips on determining an appropriate chunk size (page 1, page 2). |
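For reference, a sketch of how chunk sizes can be inspected and the interval for new chunks adjusted; these function names are from the TimescaleDB 2.x API, so they apply after the recommended upgrade, and the interval is only an example value:
-- Per-chunk table, index, and toast sizes for the hypertable (TimescaleDB 2.x)
SELECT * FROM chunks_detailed_size('competitor_results') ORDER BY total_bytes DESC;

-- Change the time interval used for newly created chunks (existing chunks are unaffected)
SELECT set_chunk_time_interval('competitor_results', INTERVAL '7 days');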
Hi @jnidzwetzki, okay, we will update the database version and come back with the results. Thanks for your help. |
Dear Author, This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days. Thank you! |
Dear Author, We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it. Thank you! |
What type of bug is this?
Performance issue
What subsystems and features are affected?
Query planner
What happened?
This is a bug report following a discussion started in Slack with @davidkohn88. I got it "solved" for my use case by deleting the "bad" index used by Query 2 (see below). However, @davidkohn88 believes it could be a bug related to the query planner, so I am reporting it here to hopefully help the TimescaleDB team track and fix it.
I am getting a very strange query plan when using timestamp filters with non-complete 'hours', i.e., with a timestamp whose HH:MM is not 00:00.
Query 1:
Planner 1:
Query 2: Only changing the timestamp <= ... filter by 1 minute!
For some reason, Query 2 is not able to use the (tag_id, timestamp desc) index used by Query 1. There are many tag_ids in this table, so the plan chosen for Query 2 leads to very bad performance. I already tried to run reorder_chunk on that given chunk (_hyper_2_200_chunk), but without success. Any insight on why it is happening?
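The bodies of Query 1 and Query 2 are not preserved above; based on the index conditions in the plans shown earlier in the thread, they presumably had roughly this shape, with only the upper bound differing by one minute (values are illustrative):
-- Query 1: upper bound on a whole hour, uses the (tag_id, "timestamp" DESC) index
SELECT * FROM variable.signal_data sd
WHERE sd.tag_id IN (6846, 6899)
  AND sd."timestamp" > '2022-10-27 00:00:00'
  AND sd."timestamp" <= '2022-10-28 00:00:00'
ORDER BY sd."timestamp";

-- Query 2: upper bound shifted by one minute, was planned with the timestamp-only index instead
SELECT * FROM variable.signal_data sd
WHERE sd.tag_id IN (6846, 6899)
  AND sd."timestamp" > '2022-10-27 00:00:00'
  AND sd."timestamp" <= '2022-10-27 23:59:00'
ORDER BY sd."timestamp";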
TimescaleDB version affected
2.6.0
PostgreSQL version used
14.2
What operating system did you use?
Probably some recent version of Debian or Ubuntu, not sure.
What installation method did you use?
Docker
What platform did you run on?
Not applicable
Relevant log output and stack trace
No response
How can we reproduce the bug?
I tried to reproduce it by recreating a table with the same configuration and copying the data into it, but the new table ended up using the correct index.
Unfortunately I cannot share the data of the table, but please let me know if I can help with anything else.