
[Bug]: After table compression database size increased a lot #5850

Closed

yinan8128 opened this issue Jul 3, 2023 · 18 comments

Comments

@yinan8128

yinan8128 commented Jul 3, 2023

What type of bug is this?

Other

What subsystems and features are affected?

Compression, Other

What happened?

In our production environment we encountered an issue where, after table compression, the table size was reduced but the database size increased a lot.

Table and compression definition:

SELECT create_hypertable('tag', 'date', if_not_exists => TRUE);

DO language plpgsql
$$
DECLARE var_sql text;
BEGIN
    IF NOT EXISTS (SELECT * FROM chunk_compression_stats('tag') WHERE compression_status = 'Compressed') THEN
        var_sql := 'ALTER TABLE tag SET (timescaledb.compress, timescaledb.compress_orderby = ''date DESC'', timescaledb.compress_segmentby = ''id'');';
        EXECUTE var_sql;
    END IF;
END;
$$;
SELECT add_retention_policy('tag', INTERVAL '30 days', if_not_exists => TRUE);
SELECT add_compression_policy('tag', INTERVAL '8 days', if_not_exists => TRUE);

\d+ tag;
                                  Table "tag"
     Column      |            Type             | Collation | Nullable | Default | Storage  | Compression | Stats target | Description
-----------------+-----------------------------+-----------+----------+---------+----------+-------------+--------------+-------------
 tenantid        | uuid                        |           | not null |         | plain    |             |              |
 resourceid      | uuid                        |           | not null |         | plain    |             |              |
 value           | numeric                     |           | not null |         | main     |             |              |
 context         | uuid                        |           | not null |         | plain    |             |              |
 topic           | text                        |           | not null |         | extended |             |              |
 tag             | text                        |           | not null |         | extended |             |              |
 id              | uuid                        |           | not null |         | plain    |             |              |
 date            | timestamp without time zone |           | not null |         | plain    |             |              |
 created_at      | timestamp without time zone |           | not null |         | plain    |             |              |
 visibility_list | uuid[]                      |           |          |         | extended |             |              |
Indexes:
    "tag_pkey" PRIMARY KEY, btree (id, date)
    "tag_date_idx" btree (date DESC)
    "index_tag_date" btree (resourceid, date DESC)
Triggers:
    ts_insert_blocker BEFORE INSERT ON timeseries.tag FOR EACH ROW EXECUTE FUNCTION _timescaledb_internal.insert_blocker()
Access method: heap

SELECT h.table_name, c.interval_length
  FROM _timescaledb_catalog.dimension c
  JOIN _timescaledb_catalog.hypertable h
    ON h.id = c.hypertable_id;
               table_name                | interval_length
-----------------------------------------+-----------------
 tag                                |    604800000000

Table size before compression:
3.23 GB

Table size after compression:
847 MB

Total database size before compression:
93.4 GB

Total database size after compression:
101 GB
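
For reference, a minimal sketch of how such numbers can be measured (the exact queries behind the figures above are not shown, so this is an assumption):

-- hypertable size (all chunks plus indexes) and total database size
SELECT pg_size_pretty(hypertable_size('tag'));
SELECT pg_size_pretty(pg_database_size(current_database()));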

There are no other table compressions happening simultaneously.

There is another ticket, #5582, where space was not reclaimed, but in that case the database size did not increase.

Please advise, thanks.

TimescaleDB version affected

2.10.1

PostgreSQL version used

14.6

What operating system did you use?

CentOS Linux release 7.9.2009

What installation method did you use?

RPM

What platform did you run on?

On prem/Self-hosted

Relevant log output and stack trace

No response

How can we reproduce the bug?

N.A
@yinan8128 yinan8128 added the bug label Jul 3, 2023
@lkshminarayanan
Contributor

Hi @yinan8128,

Thank you for filing the issue!

As explained in #5582, this could happen if the compression errored out for some reason before reclaiming the space.

Running vacuum on the hypertable will not work, as it is not propagated to the individual chunks in version 2.10.1. This is now fixed in version 2.11.0. Upgrading to that version and running vacuum on the hypertable should fix your issue.

Alternatively, you can continue using version 2.10.1, but you would have to run the vacuum on all individual chunks manually.

Can you try either of these and let us know if that helps improve your situation?
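
For example, on 2.11.0 or later a single vacuum on the hypertable should be enough (a minimal sketch using the table from this thread; VERBOSE is optional):

VACUUM VERBOSE tag;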

@yinan8128
Author

yinan8128 commented Jul 4, 2023

Hi

Thank you for your reply

vacuum script

select 'vacuum verbose'||' '||chunk_schema||'.'||chunk_name||';' from timescaledb_information.chunks where hypertable_name='tag' order by hypertable_name,chunk_name;
                         ?column?                          
-----------------------------------------------------------
 vacuum verbose _timescaledb_internal._hyper_19_10_chunk;
 vacuum verbose _timescaledb_internal._hyper_19_246_chunk;
 vacuum verbose _timescaledb_internal._hyper_19_258_chunk;
 vacuum verbose _timescaledb_internal._hyper_19_43_chunk;
 vacuum verbose _timescaledb_internal._hyper_19_47_chunk;

vacuum log in ansible

INFO:  vacuuming "_timescaledb_internal._hyper_19_10_chunk"
INFO:  table "_hyper_19_10_chunk": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242365
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "pg_toast.pg_toast_18292"
INFO:  table "pg_toast_18292": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242365
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "_timescaledb_internal._hyper_19_246_chunk"
INFO:  launched 2 parallel vacuum workers for index cleanup (planned: 2)
INFO:  table "_hyper_19_246_chunk": found 0 removable, 227775 nonremovable row versions in 9176 out of 53563 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242365
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.17 s, system: 0.14 s, elapsed: 6.33 s.
INFO:  vacuuming "pg_toast.pg_toast_69237"
INFO:  table "pg_toast_69237": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "_timescaledb_internal._hyper_19_258_chunk"
INFO:  launched 2 parallel vacuum workers for index cleanup (planned: 2)
INFO:  table "_hyper_19_258_chunk": found 0 removable, 251045 nonremovable row versions in 10068 out of 73831 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.08 s, system: 0.08 s, elapsed: 0.18 s.
INFO:  vacuuming "pg_toast.pg_toast_75680"
INFO:  table "pg_toast_75680": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "_timescaledb_internal._hyper_19_43_chunk"
INFO:  table "_hyper_19_43_chunk": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "pg_toast.pg_toast_21713"
INFO:  table "pg_toast_21713": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  vacuuming "_timescaledb_internal._hyper_19_47_chunk"
INFO:  scanned index "47_38_events_monitoring_netitag_pkey" to remove 62 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.03 s
INFO:  scanned index "_hyper_19_47_chunk_events_monitoring_netitag_date_idx" to remove 62 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  scanned index "_hyper_19_47_chunk_index_events_monitoring_netitag_date" to remove 62 row versions
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  table "_hyper_19_47_chunk": removed 62 dead item identifiers in 4 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
INFO:  index "47_38_events_monitoring_netitag_pkey" now contains 0 row versions in 2 pages
DETAIL:  62 index row versions were removed.
0 index pages were newly deleted.
0 index pages are currently deleted, of which 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "_hyper_19_47_chunk_events_monitoring_netitag_date_idx" now contains 0 row versions in 2 pages
DETAIL:  62 index row versions were removed.
0 index pages were newly deleted.
0 index pages are currently deleted, of which 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  index "_hyper_19_47_chunk_index_events_monitoring_netitag_date" now contains 0 row versions in 2 pages
DETAIL:  62 index row versions were removed.
0 index pages were newly deleted.
0 index pages are currently deleted, of which 0 are currently reusable.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.
INFO:  table "_hyper_19_47_chunk": found 62 removable, 0 nonremovable row versions in 4 out of 4 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242373
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.03 s.
INFO:  table "_hyper_19_47_chunk": truncated 4 to 0 pages
DETAIL:  CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.05 s
INFO:  vacuuming "pg_toast.pg_toast_27725"
INFO:  table "pg_toast_27725": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet, oldest xmin: 2242375
Skipped 0 pages due to buffer pins, 0 frozen pages.
CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s.

We compared the database size and no space was released. Should we run VACUUM FULL, or is a plain VACUUM enough, please?

@lkshminarayanan
Contributor

Hi @yinan8128,

You can try VACUUM FULL but please note that it requires an ACCESS EXCLUSIVE lock on the table, and therefore cannot be done in parallel with any other READ/WRITE operations on the table.
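
For example, on one of the chunks listed above (a sketch; note that VACUUM FULL rewrites the table, so it also needs free disk space for the temporary copy):

VACUUM FULL VERBOSE _timescaledb_internal._hyper_19_246_chunk;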

@yinan8128
Author

Hi

Thank you for your reply

  1. We checked timescaledb_information.job_errors; there are no error logs for 2023-06-30, but a compression problem still happened on 2023-06-30. Is it possible that timescaledb_information.job_errors does not record all compression job errors?
  2. What could cause a compression failure, and how can we check for it? Thanks

@lkshminarayanan
Contributor

Hi @yinan8128,

Any error in the jobs will be recorded in the job_errors table. So if you don't see anything there, it is highly unlikely that the compression job failed.

But as your database size increased after compression, the most likely explanation is that the old uncompressed rows have not been vacuumed yet. It could be that vacuum has not freed up those rows due to some long-running transaction. You can follow this guide - https://emmer.dev/blog/finding-long-running-queries-in-postgresql/ - to check whether you have any. If you do have such a transaction running, cancel it and then retry the vacuum.
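
A minimal sketch for spotting such transactions (the one-hour threshold is only an illustrative assumption):

SELECT pid, state, xact_start, now() - xact_start AS xact_age, query
  FROM pg_stat_activity
 WHERE xact_start IS NOT NULL
   AND now() - xact_start > interval '1 hour'
 ORDER BY xact_start;
-- if needed, cancel a blocking transaction with SELECT pg_cancel_backend(<pid>);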

As for why this has happened, it could be that you have hit a bug. But without an error log or an entry in job_errors, this might be harder to investigate.

@sb230132
Contributor

Hi @yinan8128
It would be of much help to know what errors occur when the compression policy is triggered in the background. When a compression policy fails, we do not reclaim the space occupied as part of compression. It might be the case that the compression policy failed multiple times and we ended up occupying more space. However, this is just a guess; we need the exact errors about what is happening with the compression policy job.
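
A sketch for checking the compression policy's run history (jobs and job_stats are standard TimescaleDB information views; 'policy_compression' is the expected proc name):

SELECT j.job_id, j.proc_name, s.last_run_status, s.last_run_started_at,
       s.total_runs, s.total_successes, s.total_failures
  FROM timescaledb_information.jobs j
  JOIN timescaledb_information.job_stats s USING (job_id)
 WHERE j.proc_name = 'policy_compression';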

Any chance you could re-run your setup on TimescaleDB 2.11 and see if the issue is resolved?

@yinan8128
Author

Hi

Thank you for the update

We have already disabled compression in that production environment. Let us check another production environment that has the problem and get back to you.

@yinan8128
Author

yinan8128 commented Aug 1, 2023

Hi

In one of our environments there is a database-size-increase problem after compression. Retention reclaims some storage, but the overall storage usage trend is increasing and will sooner or later hit the limit.
[attached image: graph of storage usage over time]

-- but there are no records in timescaledb_information.job_errors:
select * from timescaledb_information.job_errors;
job_id | proc_schema | proc_name | pid | start_time | finish_time | sqlerrcode | err_message
--------+-------------+-----------+-----+------------+-------------+------------+-------------
(0 rows)

Any suggestions for the next step, please? Thanks

@konskov
Contributor

konskov commented Aug 1, 2023

Hello @yinan8128,

Some more questions:

  • Is it only the compression policy that fails to run, or do you have other background jobs/policies which are not running? I’m asking this because it is possible for the background job scheduler to exit, in which case no jobs would run until it is restarted (although it seems from the graph that the retention policy is running normally).
  • Have you tried manually compressing your chunks to see if you get any errors then? Does compression complete normally in that case, and is the space consumption reduced? (See the sketch below for one way to do this.)
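
For instance, a sketch for manually compressing the chunks the policy would target (compress_chunk and show_chunks are standard TimescaleDB APIs; the 8-day cutoff matches the policy from the issue description):

SELECT compress_chunk(c, if_not_compressed => true)
  FROM show_chunks('tag', older_than => INTERVAL '8 days') AS c;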

@yinan8128
Author

Hello

There are both retention jobs and compression jobs, and there are no records in timescaledb_information.job_errors, which should mean there were no job failures?

From the attached image you can see that retention worked (the database size was reduced 4 times), while compression caused the database size to increase 7 times.

@erimatnor
Contributor

@yinan8128 Note that you need to upgrade to 2.11 to allow a VACUUM on the hypertable to release disk space. Your earlier vacuum script won't work because the informational view you used doesn't show the "internal" compressed chunks. You need to query the _timescaledb_catalog.chunk table to get all chunks.
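
A sketch of generating the VACUUM statements from the catalog instead (column names assumed from the 2.10.x catalog layout):

SELECT format('VACUUM VERBOSE %I.%I;', schema_name, table_name)
  FROM _timescaledb_catalog.chunk
 WHERE NOT dropped;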

@yinan8128
Author

Hi

We found that manual compression can also make the chunk and table size increase. We added reproduction steps to ticket #5961; could you please help check?

@erimatnor erimatnor removed their assignment Aug 28, 2023
@pdipesh02 pdipesh02 self-assigned this Aug 28, 2023
@pdipesh02
Contributor

Hi @yinan8128 ,

Replied to questions asked as part of issue #5961. Please let us know if you have any further questions.

@pdipesh02 pdipesh02 removed their assignment Aug 29, 2023

Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days.

Thank you!

@yinan8128
Author

yinan8128 commented Nov 21, 2023

What is the relationship between compression ratio, query efficiency, and the data, please?
For example, with 10000 records:

  1. 1000 unique values, each value with 10 records
  2. 100 unique values, each value with 100 records
  3. 10 unique values, each value with 1000 records

Which case gives a good compression ratio and query efficiency? Will case 3 get the best compression ratio but the worst query efficiency, since queries need to decompress the compressed chunk first? Thanks
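
One way to compare compression ratios empirically for a given data shape is to check the per-chunk statistics (a sketch; chunk_compression_stats is a standard TimescaleDB function):

SELECT chunk_name,
       pg_size_pretty(before_compression_total_bytes) AS before,
       pg_size_pretty(after_compression_total_bytes)  AS after
  FROM chunk_compression_stats('tag')
 WHERE compression_status = 'Compressed';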

@antekresic
Contributor

Hi,

I believe this was answered in your other issue:
#5961 (comment)

Thanks.


Dear Author,

This issue has been automatically marked as stale due to lack of activity. With only the issue description that is currently provided, we do not have enough information to take action. If you have or find the answers we would need, please reach out. Otherwise, this issue will be closed in 30 days.

Thank you!


Dear Author,

We are closing this issue due to lack of activity. Feel free to add a comment to this issue if you can provide more information and we will re-open it.

Thank you!
