Schedule compression policy more often #6102

Merged

@jnidzwetzki jnidzwetzki commented Sep 21, 2023

In the current implementation, the compression policy is scheduled by default to run every chunk_time_interval / 2, which equals three days and twelve hours with our default settings. This schedule interval was sufficient for previous versions of TimescaleDB. However, with the introduction of features like mutable compression and ON CONFLICT .. DO UPDATE queries, regular DML operations decompress data. To ensure that modified data is compressed earlier, this patch reduces the default schedule interval so that the compression policy runs at least every 12 hours.
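As a rough illustration of the arithmetic described above (not the actual C code in tsl/src/bgw_policy/compression_api.c), the following Python sketch assumes the new default is the smaller of chunk_time_interval / 2 and 12 hours. It also assumes a default chunk_time_interval of 7 days, which is consistent with the three days and twelve hours quoted above. The function and constant names are hypothetical.

```python
from datetime import timedelta

# Assumption: the patch caps the default schedule interval at 12 hours.
MAX_SCHEDULE_INTERVAL = timedelta(hours=12)


def default_schedule_interval(chunk_time_interval: timedelta,
                              cap: timedelta = MAX_SCHEDULE_INTERVAL) -> timedelta:
    """Return half the chunk interval, but never more than the cap."""
    half = chunk_time_interval / 2
    return min(half, cap)


# Assumption: a default chunk_time_interval of 7 days.
chunk_time_interval = timedelta(days=7)

old_default = chunk_time_interval / 2                         # behavior before the patch
new_default = default_schedule_interval(chunk_time_interval)  # behavior sketched for the patch

print("old default:", old_default)
print("new default:", new_default)
```

Under this sketch, hypertables with a chunk_time_interval shorter than one day keep chunk_time_interval / 2 as their schedule interval, because the cap only applies when half the chunk interval exceeds 12 hours.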


Disable-check: force-changelog-file

@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch from 8de65fe to 0d7f95b Compare September 21, 2023 09:55
@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch 2 times, most recently from 95b135e to 66c7069 Compare September 21, 2023 10:36
codecov bot commented Sep 21, 2023

Codecov Report

Merging #6102 (1e78a19) into main (8c41757) will increase coverage by 7.54%.
Report is 5 commits behind head on main.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main    #6102      +/-   ##
==========================================
+ Coverage   73.96%   81.50%   +7.54%     
==========================================
  Files         246      246              
  Lines       49862    56731    +6869     
  Branches    12525    12569      +44     
==========================================
+ Hits        36880    46240    +9360     
- Misses       7128     8083     +955     
+ Partials     5854     2408    -3446     
Files Changed Coverage Δ
tsl/src/bgw_policy/compression_api.c 82.07% <100.00%> (+10.42%) ⬆️

... and 226 files with indirect coverage changes

@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch 5 times, most recently from 275bab2 to 634fde1 Compare September 21, 2023 12:05

PERFORM add_reorder_policy('policy_test_timestamptz','policy_test_timestamptz_time_idx');

-- some policy API functions got renamed for 2.0 so we need to make
-- sure to use the right name for the version
IF ts_version < '2.0.0' THEN
Contributor Author

Changed to ts_major and ts_minor because comparing semantic versions stored in a text variable does not really work.
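To illustrate why a text comparison is unreliable for version checks, here is a minimal Python sketch. The upgrade test itself is PL/pgSQL, so this only mirrors the idea: parse_version is a hypothetical helper, and the ts_major / ts_minor names are borrowed from the comment above. Strings are compared character by character, so "10.0.0" sorts before "2.0.0", whereas comparing numeric (major, minor) pairs gives the expected order.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int]:
    """Split a version string such as '2.12.0' into numeric (major, minor)."""
    major, minor, *_ = version.split(".")
    return int(major), int(minor)


# Text comparison: '1' sorts before '2', so the result is misleading.
text_says_older = "10.0.0" < "2.0.0"

# Numeric comparison on (major, minor) behaves as expected.
ts_major, ts_minor = parse_version("10.0.0")
numeric_says_older = (ts_major, ts_minor) < (2, 0)

print("text comparison says 10.0.0 < 2.0.0:", text_says_older)
print("numeric comparison says 10.0.0 < 2.0.0:", numeric_says_older)
```

The same pitfall shows up within a major version: as text, "2.10.0" sorts before "2.9.0".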

PERFORM add_retention_policy('policy_test_timestamptz','60d'::interval);
PERFORM add_compression_policy('policy_test_timestamptz','10d'::interval);
ELSE
PERFORM add_retention_policy('policy_test_timestamptz','60d'::interval);
PERFORM add_compression_policy('policy_test_timestamptz','10d'::interval, schedule_interval => '3 days 12:00:00'::interval);
Contributor Author

Added so that the same schedule_interval is used across all TimescaleDB versions and the upgrade tests pass.

@jnidzwetzki jnidzwetzki marked this pull request as ready for review September 21, 2023 12:26
@github-actions github-actions bot requested review from akuzm and mahipv September 21, 2023 12:26
@github-actions

@mahipv, @akuzm: please review this pull request.

Powered by pull-review

scripts/docker-build.sh (outdated, resolved)
Member

@akuzm akuzm left a comment

Probably requires a doc update.

@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch from 634fde1 to 676ae97 Compare September 21, 2023 13:35
@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch from 676ae97 to c21f99c Compare September 22, 2023 10:03
@jnidzwetzki jnidzwetzki force-pushed the decrease_compression_schedule_interval branch from c21f99c to 1e78a19 Compare September 22, 2023 10:05
jnidzwetzki added a commit to jnidzwetzki/docs-timescale that referenced this pull request Sep 22, 2023
This PR updates the documentation regarding the changed default value of the schedule_interval parameter of the compression_policy (see timescale/timescaledb#6102).
@jnidzwetzki
Contributor Author

Docs PR: timescale/docs#2698

@jnidzwetzki jnidzwetzki merged commit 683e2bc into timescale:main Sep 22, 2023
35 of 36 checks passed
@jnidzwetzki jnidzwetzki deleted the decrease_compression_schedule_interval branch September 22, 2023 10:31
svenklemm added a commit that referenced this pull request Sep 25, 2023
This release contains performance improvements for compressed hypertables
and continuous aggregates and bug fixes since the 2.11.2 release.
We recommend that you upgrade at the next available opportunity.

This release moves all internal functions from the _timescaledb_internal
schema into the _timescaledb_functions schema. This separates code from
internal data objects and improves security by allowing more restrictive
permissions for the code schema. If you are calling any of those internal
functions, you should adjust your code as soon as possible. This version
also includes a compatibility layer that allows calling them in the old
location, but that layer will be removed in 2.14.0.

**PostgreSQL 12 support removal announcement**
Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10,
PostgreSQL 12 is not supported starting with TimescaleDB 2.12.
Currently supported PostgreSQL major versions are 13, 14 and 15.
PostgreSQL 16 support will be added with a following TimescaleDB release.

**Features**
* #5137 Insert into index during chunk compression
* #5150 MERGE support on hypertables
* #5515 Make hypertables support replica identity
* #5586 Index scan support during UPDATE/DELETE on compressed hypertables
* #5596 Support for partial aggregations at chunk level
* #5599 Enable ChunkAppend for partially compressed chunks
* #5655 Improve the number of parallel workers for decompression
* #5758 Enable altering job schedule type through `alter_job`
* #5805 Make logrepl markers for (partial) decompressions
* #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate
* #5839 Support CAgg names in chunk_detailed_size
* #5852 Make set_chunk_time_interval CAggs aware
* #5868 Allow ALTER TABLE ... REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates)
* #5875 Add job exit status and runtime to log
* #5909 CREATE INDEX ONLY ON hypertable creates index on chunks

**Bugfixes**
* #5860 Fix interval calculation for hierarchical CAggs
* #5894 Check unique indexes when enabling compression
* #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows
* #5988 Move functions to _timescaledb_functions schema
* #5788 Chunk_create must add an existing table or fail
* #5872 Fix duplicates on partially compressed chunk reads
* #5918 Fix crash in COPY from program returning error
* #5990 Place data in first/last function in correct mctx
* #5991 Call eq_func correctly in time_bucket_gapfill
* #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output
* #6035 Fix server crash on UPDATE of compressed chunk
* #6044 Fix server crash when using duplicate segmentby column
* #6045 Fix segfault in set_integer_now_func
* #6053 Fix approximate_row_count for CAggs
* #6081 Improve compressed DML datatype handling
* #6084 Propagate parameter changes to decompress child nodes
* #6102 Schedule compression policy more often

**Thanks**
* @ajcanterbury for reporting a problem with lateral joins on compressed chunks
* @alexanderlaw for reporting multiple server crashes
* @lukaskirner for reporting a bug with monthly continuous aggregates
* @mrksngl for reporting a bug with unusual user names
* @willsbit for reporting a crash in time_bucket_gapfill
jnidzwetzki added a commit to jnidzwetzki/docs-timescale that referenced this pull request Sep 27, 2023
This PR updates the documentation regarding the changed default value of the schedule_interval parameter of the compression_policy (see timescale/timescaledb#6102).
jnidzwetzki added a commit to timescale/docs that referenced this pull request Sep 27, 2023
This PR updates the documentation regarding the changed default value of the schedule_interval parameter of the compression_policy (see timescale/timescaledb#6102).
4 participants