Support for partial aggregations at chunk level #5596

jnidzwetzki · 2023-04-20T13:52:10Z

This patch adds support for partial aggregations at the chunk level. The aggregation is replanned in the create_upper_paths_hook of PostgreSQL. The AggPath is split up into multiple AGGSPLIT_INITIAL_SERIAL operations (one on top of each chunk), which create partials, and one AGGSPLIT_FINAL_DESERIAL operation, which finalizes the aggregation.
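To illustrate the split, here is a minimal, self-contained sketch. It is not code from this patch: `AvgPartial`, `avg_partial_for_chunk`, and `avg_finalize` are hypothetical names, and the serialization and deserialization of the transition state is left out. It only shows the idea of computing one partial per chunk and finalizing once on top.

```c
#include <assert.h>
#include <stddef.h>

/* Partial (transition) state for avg(): what the per-chunk step emits. */
typedef struct AvgPartial
{
    double sum;
    long   count;
} AvgPartial;

/* Per-chunk step: aggregate one chunk's rows into a partial, without finalizing. */
static AvgPartial
avg_partial_for_chunk(const double *values, size_t nvalues)
{
    AvgPartial state = { 0.0, 0 };

    for (size_t i = 0; i < nvalues; i++)
    {
        state.sum += values[i];
        state.count++;
    }
    return state;
}

/* Final step: combine all per-chunk partials, then finalize exactly once. */
static double
avg_finalize(const AvgPartial *partials, size_t npartials)
{
    AvgPartial total = { 0.0, 0 };

    for (size_t i = 0; i < npartials; i++)
    {
        total.sum += partials[i].sum;
        total.count += partials[i].count;
    }
    /* avg() over zero rows is NULL in SQL; this toy requires at least one row. */
    assert(total.count > 0);
    return total.sum / (double) total.count;
}
```

Calling `avg_partial_for_chunk` once per chunk and passing the resulting array to `avg_finalize` gives the same result as averaging all rows directly, which is the property the planner relies on when it splits the aggregation.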

Benchmark

https://grafana.ops.savannah-dev.timescale.com/d/NdmLnOk4z/compare-benchmark-runs?orgId=1&var-branch=main&var-run1=2699&var-run2=2700&var-threshold=0.02

Aggregations speed up by roughly 10-15%. The results show one regression, but that benchmark is flaky, and previous runs show the same execution time.

codecov · 2023-08-28T12:31:53Z

Codecov Report

Merging #5596 (844412c) into main (23b51c9) will increase coverage by 0.07%.
The diff coverage is 94.70%.

@@            Coverage Diff             @@
##             main    #5596      +/-   ##
==========================================
+ Coverage   81.44%   81.51%   +0.07%     
==========================================
  Files         246      246              
  Lines       56197    56531     +334     
  Branches    12460    12516      +56     
==========================================
+ Hits        45767    46080     +313     
+ Misses       8097     8086      -11     
- Partials     2333     2365      +32

| Files Changed | Coverage Δ | |
|---|---|---|
| src/nodes/chunk_append/planner.c | `90.17% <ø> (+3.46%)` | ⬆️ |
| tsl/src/nodes/decompress_chunk/decompress_chunk.c | `88.64% <0.00%> (-0.33%)` | ⬇️ |
| src/planner/partialize.c | `94.73% <95.48%> (+2.87%)` | ⬆️ |
| src/guc.c | `96.70% <100.00%> (+0.03%)` | ⬆️ |
| src/planner/planner.c | `90.48% <100.00%> (+0.02%)` | ⬆️ |

... and 23 files with indirect coverage changes

jnidzwetzki · 2023-09-08T10:07:22Z

@erimatnor I addressed the comments/questions you added. Could you do a re-review of the PR?

So far, we have created fake partitioning info for hypertables if the PostgreSQL setting 'enable_partitionwise_aggregate' is set. This causes PostgreSQL to push down partial aggregations to the chunk level. However, the PostgreSQL code has some drawbacks because the query is replanned and optimizations like ChunkAppend are lost. Since timescale#5596 we have implemented our own code to push down partial aggregations. Therefore, we can ignore the PostgreSQL setting from now on.

erimatnor

Approving. The patch looks fine and seems to do the right thing based on test output.

However, there are still some things I am not sure I understand, which I think boils down to some opaqueness in the code. Some of that opaqueness is probably inherited from PostgreSQL, but I don't think that is a good reason for our code to be hard to understand as well.

I put some discussion comments inline. I am mostly looking for clarification and validation that my intuition is correct and that the code does what I think it is doing. But ideally, I should not have to "guess" to understand the code.

erimatnor · 2023-09-13T07:13:54Z

src/planner/partialize.c

+ * Get all subpaths from a Append, MergeAppend, or ChunkAppend path
+ */
+static List *
+get_subpaths_from_append_path(Path *path, bool handle_gather_path)


Just a suggestion: this might be more intuitive if it were generalized to something like a max_recurse integer that you decrement on every recursive call, stopping the recursion when it hits 0.

You might also want to expand the comment to explain the special handling of GatherPath.

With regard to reason (1) above, I am wondering why there are gather nodes in non-partial paths. Or is this just a check for an error condition?
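To make the max_recurse suggestion concrete, here is a rough sketch over a toy path tree. Everything in it is hypothetical (`ToyPath`, `ToyPathType`, `find_append`); it does not use the real PostgreSQL `Path` structs and is not a drop-in replacement for get_subpaths_from_append_path. It only shows the shape of a depth-limited descent.

```c
#include <assert.h>
#include <stddef.h>

typedef enum ToyPathType
{
    TOY_SCAN,
    TOY_APPEND,  /* stands in for Append, MergeAppend, or ChunkAppend */
    TOY_GATHER,
    TOY_OTHER    /* any other single-child node */
} ToyPathType;

typedef struct ToyPath
{
    ToyPathType      type;
    struct ToyPath **children;
    size_t           nchildren;
} ToyPath;

/*
 * Return the first append-like node found at most max_recurse levels below
 * `path`, descending only through single-child nodes. Returns NULL if no
 * such node is found within the recursion budget.
 */
static const ToyPath *
find_append(const ToyPath *path, int max_recurse)
{
    assert(max_recurse >= 0);

    if (path == NULL)
        return NULL;
    if (path->type == TOY_APPEND)
        return path;
    if (max_recurse == 0 || path->nchildren != 1)
        return NULL;
    return find_append(path->children[0], max_recurse - 1);
}
```

With this shape, a budget of 0 means "only accept an append node at the top", and a budget of 1 would allow looking through one wrapper such as a Gather node, which is roughly what the boolean flag expresses today.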

erimatnor · 2023-09-13T07:19:19Z

src/planner/partialize.c

+{
+	List *subpaths = get_subpaths_from_append_path(path, true);
+
+	Ensure(subpaths != NIL, "Unable to determine aggregation type");


Suggested change

-	Ensure(subpaths != NIL, "Unable to determine aggregation type");
+	Ensure(subpaths != NIL, "unable to determine aggregation type");

erimatnor · 2023-09-13T07:26:57Z

src/planner/partialize.c

+	if (!is_sorted)
+	{
+		path = (Path *) create_sort_path(root, path->parent, path, root->group_pathkeys, -1.0);
+	}


I hear you and I've seen that too.

Sorry to belabor the point, but to clarify: I was wrong to use the term "shadowing", since no new variable with the same name is defined, so my comment wasn't clear.

What I meant was that it might be easier to follow what the code does with aptly named variables instead of reusing the generically named path variable. The intention of the code here is to ensure that the Path returns ordered data, which can be achieved by the path itself or by adding a sort path on top. Reusing "path" obscures which path node we are dealing with and what the code did in the previous step.

Further below, and elsewhere in the patch, you don't reuse path but instead use a new variable name to indicate a similar intention, e.g., via a new sorted_agg_path. I am simply arguing for the same intention-by-name convention here to make it easier to understand what is going on in the code. In particular, in big functions it becomes increasingly hard to tell what type of path a path variable holds at a specific line.

As a reviewer, I don't see (or even know) that this code originates from PostgreSQL, so I make suggestions I think will improve the understandability and maintainability of the code irrespective of code origin.

Thus, IMHO, let's not make PostgreSQL and (maybe) future copy-pasting an argument for not making changes to our code based on reviewer suggestions.

Code might change upstream as well, making future code adoption hard in other ways. Our starting point should be that we own and understand what this code does, allowing us to make the changes we need.
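As a small illustration of the naming convention I am arguing for, here is a toy sketch. The `ToyPath` struct and both functions are hypothetical stand-ins, not the PostgreSQL API; `toy_sort_on_top` merely plays the role of create_sort_path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ToyPath
{
    bool                  is_sort_node;
    bool                  is_sorted;
    const struct ToyPath *input;
} ToyPath;

/* Hypothetical stand-in for create_sort_path(): wraps `input` in a sort node. */
static ToyPath
toy_sort_on_top(const ToyPath *input)
{
    ToyPath sort = { .is_sort_node = true, .is_sorted = true, .input = input };

    return sort;
}

/*
 * Return a path that is guaranteed to deliver sorted output: `input` itself
 * if it is already sorted, otherwise a sort node written to *sort_storage.
 * The result lives in its own variable, so `input` always means the original.
 */
static const ToyPath *
ensure_sorted(const ToyPath *input, ToyPath *sort_storage)
{
    assert(input != NULL && sort_storage != NULL);

    const ToyPath *sorted_path = input;

    if (!input->is_sorted)
    {
        *sort_storage = toy_sort_on_top(input);
        sorted_path = sort_storage;
    }
    return sorted_path;
}
```

At any later line, `sorted_path` tells the reader that ordering is already guaranteed, while `input` still refers to the untouched original path.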

erimatnor · 2023-09-13T08:00:10Z

src/planner/partialize.c

+
+	if (sorted_subpaths != NIL)
+	{
+		add_path(partially_grouped_rel,


Not sure I follow. Don't all partial paths require finalization with a Gather node, regardless of whether they are parallel or not?

I am just trying to understand whether we are adding paths to the right path "list", as there are two "pathlists" in a RelOptInfo: rel->pathlist and rel->partial_pathlist. I don't recall the difference between these two, so I am asking to ensure we are doing the right thing here.

erimatnor · 2023-09-13T08:02:47Z

src/planner/partialize.c

+}
+
+/*
+ * Generate a total aggregation path for partial aggregations


What is a "total aggregation" path? Is it one that includes the Gather node to make it "total" or is it a non-partial path?

erimatnor · 2023-09-13T08:19:27Z

src/planner/partialize.c

+							   d_num_groups,
+							   extra_data);
+
+	/* The same as above but for partial paths */


I find this confusing: We just generated push-down paths above, which in my understanding are also partial paths. (Please correct me if I am wrong.) But the comment makes it sound like the partial paths are only generated at this point, implying it was not done in the previous step.

I think what is happening is that we are conflating partial and parallel. (And maybe PostgreSQL does too.)

So, in my understanding, we can have partial paths that are both parallel and non-parallel. Thus:

We use non-parallel partials to push down aggregates to individual chunk relations ("partitionwise" agg).

We can also generate parallel execution paths, which also require partials since they execute in different workers ("parallel" agg).

Thus, I think in the previous step we generated partial push-down paths that are non-parallel. Here we generate partial paths to be executed in parallel (given the consider_parallel check).

Unless I am wrong, I think PostgreSQL uses rel->pathlist to store non-parallel paths and rel->partial_pathlist to store parallel paths, although both can contain partials if we do partitionwise aggregation (which makes the naming super-confusing). I guess PostgreSQL's naming convention here is legacy, probably a result of initially using partial paths only in the parallel case.

Currently, I think it is hard to understand the difference between what generate_agg_pushdown_path does above compared to generate_partial_agg_pushdown_path below. And I am not even sure my understanding is correct here. Is there a way to clarify this in function names and comments?

I am wondering if it would be clearer to say that above we generate "partitionwise" paths, while here we generate the corresponding "parallel" paths?
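To spell out the distinction as I understand it, here is a toy model that treats "emits partial aggregate state" and "runs in parallel workers" as two independent properties. All names (`ToyPathProps`, `ToyRel`, `nodes_needed_on_top`, `toy_add`) are made up for illustration; the mapping to add_path() and add_partial_path() reflects my reading of PostgreSQL and may need correcting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum
{
    NEEDS_FINALIZE_AGG = 1 << 0, /* partial states must be combined and finalized */
    NEEDS_GATHER       = 1 << 1  /* rows from parallel workers must be collected */
};

typedef struct ToyPathProps
{
    bool emits_partial_agg_state;  /* the aggregation was split */
    bool runs_in_parallel_workers; /* what PostgreSQL calls a "partial path" */
} ToyPathProps;

/* Toy relation that only counts how many paths went into each list. */
typedef struct ToyRel
{
    int npaths;         /* models rel->pathlist */
    int npartial_paths; /* models rel->partial_pathlist */
} ToyRel;

/* Which nodes must still be placed on top before the result is complete? */
static int
nodes_needed_on_top(ToyPathProps props)
{
    int needed = 0;

    if (props.emits_partial_agg_state)
        needed |= NEEDS_FINALIZE_AGG;
    if (props.runs_in_parallel_workers)
        needed |= NEEDS_GATHER;
    return needed;
}

/* The list is chosen by the parallel property alone, not by the agg split. */
static void
toy_add(ToyRel *rel, ToyPathProps props)
{
    assert(rel != NULL);

    if (props.runs_in_parallel_workers)
        rel->npartial_paths++; /* analogous to add_partial_path() */
    else
        rel->npaths++;         /* analogous to add_path() */
}
```

In this model, a "partitionwise" push-down path needs only a finalizing aggregate and lands in the regular list, while its "parallel" counterpart needs both a Gather and a finalizing aggregate and lands in the partial list.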

jnidzwetzki · 2023-09-14T06:32:49Z

Improved code comments and discussed the open questions with @erimatnor. We agreed that the PR could be merged now.

alexanderlaw · 2023-09-14T12:31:21Z

Please look at the following query, which has triggered an assertion failure for me since ba9b818:

CREATE TABLE t(time timestamptz NOT NULL, device_id int);
CREATE INDEX ON t(device_id,time);
SELECT create_hypertable('t','time');
INSERT INTO t(time,device_id) VALUES('2023-01-01', 1);
SET enable_seqscan = off;
EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;

===

Core was generated by `postgres: law regression [local] EXPLAIN                                      '.
Program terminated with signal SIGABRT, Aborted.

warning: Section `.reg-xstate/2457168' in core file too small.
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140165221895552)
    at ./nptl/pthread_kill.c:44
44      ./nptl/pthread_kill.c: No such file or directory.
(gdb) bt
#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=140165221895552)
    at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=140165221895552) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=140165221895552, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f7ac1a4d476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#4  0x00007f7ac1a337f3 in __GI_abort () at ./stdlib/abort.c:79
#5  0x000055989c7218a8 in ExceptionalCondition (
    conditionName=conditionName@entry=0x55989c8891ae "apath->path.pathkeys == NIL", 
    errorType=errorType@entry=0x55989c77e00b "FailedAssertion", 
    fileName=fileName@entry=0x55989c889168 "costsize.c", lineNumber=lineNumber@entry=2276)
    at assert.c:69
#6  0x000055989c4c0b93 in cost_append (apath=apath@entry=0x55989db38bb0) at costsize.c:2276
#7  0x00007f7ab8977672 in copy_append_path (path=path@entry=0x55989db362a0, 
    subpaths=subpaths@entry=0x55989db38a68) at .../timescaledb/src/planner/partialize.c:271
#8  0x00007f7ab8977d6e in copy_append_like_path (root=root@entry=0x55989db04e90, 
    path=path@entry=0x55989db362a0, new_subpaths=0x55989db38a68, 
    pathtarget=pathtarget@entry=0x55989db381e8)
    at .../timescaledb/src/planner/partialize.c:308
#9  0x00007f7ab8978208 in generate_partial_agg_pushdown_path (root=root@entry=0x55989db04e90, 
    cheapest_partial_path=0x55989db362a0, output_rel=output_rel@entry=0x55989db21800, 
    partially_grouped_rel=partially_grouped_rel@entry=0x55989db37730, 
    grouping_target=grouping_target@entry=0x55989db36ec0, 
    partial_grouping_target=partial_grouping_target@entry=0x55989db381e8, can_sort=true, can_hash=true, 
    d_num_groups=d_num_groups@entry=200, extra_data=0x7ffeea7500a0)
    at .../timescaledb/src/planner/partialize.c:596
#10 0x00007f7ab8978768 in ts_pushdown_partial_agg (root=root@entry=0x55989db04e90, ht=<optimized out>, 
    input_rel=input_rel@entry=0x55989db04ba0, output_rel=output_rel@entry=0x55989db21800, 
    extra=extra@entry=0x7ffeea7500a0) at .../timescaledb/src/planner/partialize.c:803
#11 0x00007f7ab89703ea in timescaledb_create_upper_paths_hook (root=0x55989db04e90, 
    stage=UPPERREL_GROUP_AGG, input_rel=0x55989db04ba0, output_rel=0x55989db21800, extra=0x7ffeea7500a0)
    at .../timescaledb/src/planner/planner.c:1588
#12 0x000055989c4eea18 in create_ordinary_grouping_paths (root=root@entry=0x55989db04e90, 
    input_rel=input_rel@entry=0x55989db04ba0, grouped_rel=grouped_rel@entry=0x55989db21800, 
    agg_costs=agg_costs@entry=0x7ffeea750070, gd=gd@entry=0x0, extra=extra@entry=0x7ffeea7500a0, 
    partially_grouped_rel_p=0x7ffeea750068) at planner.c:3722
#13 0x000055989c4eeca8 in create_grouping_paths (root=root@entry=0x55989db04e90, 
    input_rel=input_rel@entry=0x55989db04ba0, target=target@entry=0x55989db36ec0, 
    target_parallel_safe=target_parallel_safe@entry=true, gd=gd@entry=0x0) at planner.c:3442
#14 0x000055989c4efabb in grouping_planner (root=root@entry=0x55989db04e90, 
    tuple_fraction=<optimized out>, tuple_fraction@entry=0) at planner.c:1614
#15 0x000055989c4f0d71 in subquery_planner (glob=glob@entry=0x55989db04a58, 
    parse=parse@entry=0x55989d9f5560, parent_root=parent_root@entry=0x0, 
    hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1047
#16 0x000055989c4f138b in standard_planner (parse=parse@entry=0x55989d9f5560, 
    query_string=query_string@entry=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:408
#17 0x00007f7ab896f1bf in timescaledb_planner (parse=0x55989d9f5560, 
    query_string=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", 
    cursor_opts=2048, bound_params=0x0) at .../timescaledb/src/planner/planner.c:571
#18 0x000055989c4f194f in planner (parse=parse@entry=0x55989d9f5560, 
    query_string=query_string@entry=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at planner.c:275
#19 0x000055989c5da58f in pg_plan_query (querytree=querytree@entry=0x55989d9f5560, 
    query_string=query_string@entry=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:883
#20 0x000055989c39b739 in ExplainOneQuery (query=0x55989d9f5560, 
    cursorOptions=cursorOptions@entry=2048, into=into@entry=0x0, es=es@entry=0x55989db308c8, 
    queryString=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", 
    params=params@entry=0x0, queryEnv=0x0) at explain.c:397
#21 0x000055989c39c2aa in ExplainQuery (pstate=pstate@entry=0x55989db30a20, 
    stmt=stmt@entry=0x55989d9f5380, params=params@entry=0x0, dest=dest@entry=0x55989db305b8)
    at ../../../src/include/nodes/nodes.h:610
#22 0x000055989c5e0ad3 in standard_ProcessUtility (pstmt=0x55989d9f5e30, 
    queryString=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", 
    readOnlyTree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, 
    dest=0x55989db305b8, qc=0x7ffeea750ca0) at utility.c:870
#23 0x00007f7ac24814c8 in loader_process_utility_hook (pstmt=0x55989d9f5e30, 
    query_string=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;", 
    readonly_tree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, params=0x0, queryEnv=0x0, 
    dest=0x55989db305b8, completion_tag=0x7ffeea750ca0)
    at .../timescaledb/src/loader/loader.c:639
#24 0x00007f7ab893616e in prev_ProcessUtility (args=args@entry=0x7ffeea750b80)
    at .../timescaledb/src/process_utility.c:100
#25 0x00007f7ab893b1da in timescaledb_ddl_command_start (pstmt=0x55989d9f5e30, 
    query_string=<optimized out>, readonly_tree=<optimized out>, context=PROCESS_UTILITY_TOPLEVEL, 
    params=0x0, queryEnv=<optimized out>, dest=0x55989db305b8, completion_tag=0x7ffeea750ca0)
    at .../timescaledb/src/process_utility.c:4538
#26 0x000055989c5e0f11 in ProcessUtility (pstmt=pstmt@entry=0x55989d9f5e30, 
    queryString=<optimized out>, readOnlyTree=<optimized out>, 
    context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=<optimized out>, queryEnv=<optimized out>, 
    dest=0x55989db305b8, qc=0x7ffeea750ca0) at utility.c:526
#27 0x000055989c5de328 in PortalRunUtility (portal=portal@entry=0x55989da67260, pstmt=0x55989d9f5e30, 
    isTopLevel=<optimized out>, setHoldSnapshot=setHoldSnapshot@entry=true, 
    dest=dest@entry=0x55989db305b8, qc=qc@entry=0x7ffeea750ca0) at pquery.c:1158
#28 0x000055989c5de823 in FillPortalStore (portal=portal@entry=0x55989da67260, 
    isTopLevel=isTopLevel@entry=true) at ../../../src/include/nodes/nodes.h:610
#29 0x000055989c5dec11 in PortalRun (portal=portal@entry=0x55989da67260, 
    count=count@entry=9223372036854775807, isTopLevel=isTopLevel@entry=true, 
    run_once=run_once@entry=true, dest=dest@entry=0x55989dae8790, altdest=altdest@entry=0x55989dae8790, 
    qc=0x7ffeea750e90) at pquery.c:763
#30 0x000055989c5dac2c in exec_simple_query (
    query_string=query_string@entry=0x55989d9f4340 "EXPLAIN SELECT device_id, count(*) FROM t GROUP BY device_id;") at postgres.c:1250
#31 0x000055989c5dcad4 in PostgresMain (dbname=<optimized out>, username=<optimized out>)
    at postgres.c:4598
#32 0x000055989c53719e in BackendRun (port=port@entry=0x55989da1aa00) at postmaster.c:4514
#33 0x000055989c53a2a1 in BackendStartup (port=port@entry=0x55989da1aa00) at postmaster.c:4242
#34 0x000055989c53a4da in ServerLoop () at postmaster.c:1809
#35 0x000055989c53bac2 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x55989d95cad0)
    at postmaster.c:1481
#36 0x000055989c47fca8 in main (argc=3, argv=0x55989d95cad0) at main.c:202

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity. This release moves all internal functions from the _timescaleb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0. **PostgreSQL 12 support removal announcement** Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release. **Features** * #5137 Insert into index during chunk compression * #5150 MERGE support on hypertables * #5515 Make hypertables support replica identity * #5586 Index scan support during UPDATE/DELETE on compressed hypertables * #5596 Support for partial aggregations at chunk level * #5599 Enable ChunkAppend for partially compressed chunks * #5655 Improve the number of parallel workers for decompression * #5758 Enable altering job schedule type through `alter_job` * #5805 Make logrepl markers for (partial) decompressions * #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate * #5839 Support CAgg names in chunk_detailed_size * #5852 Make set_chunk_time_interval CAggs aware * #5868 Allow ALTER TABLE ... 
REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates) * #5875 Add job exit status and runtime to log * #5909 CREATE INDEX ONLY ON hypertable creates index on chunks **Bugfixes** * #5860 Fix interval calculation for hierarchical CAggs * #5894 Check unique indexes when enabling compression * #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows * #5988 Move functions to _timescaledb_functions schema * #5788 Chunk_create must add an existing table or fail * #5872 Fix duplicates on partially compressed chunk reads * #5918 Fix crash in COPY from program returning error * #5990 Place data in first/last function in correct mctx * #5991 Call eq_func correctly in time_bucket_gapfill * #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output * #6035 Fix server crash on UPDATE of compressed chunk * #6044 Fix server crash when using duplicate segmentby column * #6045 Fix segfault in set_integer_now_func * #6053 Fix approximate_row_count for CAggs * #6081 Improve compressed DML datatype handling * #6084 Propagate parameter changes to decompress child nodes * #6102 Schedule compression policy more often **Thanks** * @ajcanterbury for reporting a problem with lateral joins on compressed chunks * @alexanderlaw for reporting multiple server crashes * @lukaskirner for reporting a bug with monthly continuous aggregates * @mrksngl for reporting a bug with unusual user names * @willsbit for reporting a crash in time_bucket_gapfill

@ajcanterbury

This release contains performance improvements for compressed hypertables and continuous aggregates and bug fixes since the 2.11.2 release. We recommend that you upgrade at the next available opportunity. This release moves all internal functions from the _timescaleb_internal schema into the _timescaledb_functions schema. This separates code from internal data objects and improves security by allowing more restrictive permissions for the code schema. If you are calling any of those internal functions you should adjust your code as soon as possible. This version also includes a compatibility layer that allows calling them in the old location but that layer will be removed in 2.14.0. **PostgreSQL 12 support removal announcement** Following the deprecation announcement for PostgreSQL 12 in TimescaleDB 2.10, PostgreSQL 12 is not supported starting with TimescaleDB 2.12. Currently supported PostgreSQL major versions are 13, 14 and 15. PostgreSQL 16 support will be added with a following TimescaleDB release. **Features** * #5137 Insert into index during chunk compression * #5150 MERGE support on hypertables * #5515 Make hypertables support replica identity * #5586 Index scan support during UPDATE/DELETE on compressed hypertables * #5596 Support for partial aggregations at chunk level * #5599 Enable ChunkAppend for partially compressed chunks * #5655 Improve the number of parallel workers for decompression * #5758 Enable altering job schedule type through `alter_job` * #5805 Make logrepl markers for (partial) decompressions * #5809 Relax invalidation threshold table-level lock to row-level when refreshing a Continuous Aggregate * #5839 Support CAgg names in chunk_detailed_size * #5852 Make set_chunk_time_interval CAggs aware * #5868 Allow ALTER TABLE ... 
REPLICA IDENTITY (FULL|INDEX) on materialized hypertables (continuous aggregates) * #5875 Add job exit status and runtime to log * #5909 CREATE INDEX ONLY ON hypertable creates index on chunks **Bugfixes** * #5860 Fix interval calculation for hierarchical CAggs * #5894 Check unique indexes when enabling compression * #5951 _timescaledb_internal.create_compressed_chunk doesn't account for existing uncompressed rows * #5988 Move functions to _timescaledb_functions schema * #5788 Chunk_create must add an existing table or fail * #5872 Fix duplicates on partially compressed chunk reads * #5918 Fix crash in COPY from program returning error * #5990 Place data in first/last function in correct mctx * #5991 Call eq_func correctly in time_bucket_gapfill * #6015 Correct row count in EXPLAIN ANALYZE INSERT .. ON CONFLICT output * #6035 Fix server crash on UPDATE of compressed chunk * #6044 Fix server crash when using duplicate segmentby column * #6045 Fix segfault in set_integer_now_func * #6053 Fix approximate_row_count for CAggs * #6081 Improve compressed DML datatype handling * #6084 Propagate parameter changes to decompress child nodes * #6102 Schedule compression policy more often **Thanks** * @ajcanterbury for reporting a problem with lateral joins on compressed chunks * @alexanderlaw for reporting multiple server crashes * @lukaskirner for reporting a bug with monthly continuous aggregates * @mrksngl for reporting a bug with unusual user names * @willsbit for reporting a crash in time_bucket_gapfill

github-actions bot assigned jnidzwetzki Apr 20, 2023

jnidzwetzki force-pushed the partitionwise_aggregate branch 2 times, most recently from be248e9 to e1ea7ce Compare April 26, 2023 12:39

jnidzwetzki force-pushed the partitionwise_aggregate branch 3 times, most recently from c835fa8 to 7bde4fb Compare May 5, 2023 13:16

jnidzwetzki force-pushed the partitionwise_aggregate branch 3 times, most recently from 7454af4 to a1a17ef Compare May 22, 2023 06:55

jnidzwetzki force-pushed the partitionwise_aggregate branch from 8487617 to ec264b3 Compare May 23, 2023 09:19

jnidzwetzki changed the title ~~CI Test - partition-wise aggregation~~ Support for partial aggregations at chunk level Aug 23, 2023

jnidzwetzki force-pushed the partitionwise_aggregate branch 6 times, most recently from 3b63e30 to 7bd1a7c Compare August 24, 2023 20:44

jnidzwetzki force-pushed the partitionwise_aggregate branch 12 times, most recently from 09d0bb4 to a0db2f3 Compare August 29, 2023 13:52

jnidzwetzki requested a review from erimatnor September 8, 2023 10:06

svenklemm approved these changes Sep 11, 2023

jnidzwetzki mentioned this pull request Sep 12, 2023

[Bug]: Using now() selects way more chunks than using timestamp literals when enable_partitionwise_aggregate=on #6059

Closed

jnidzwetzki mentioned this pull request Sep 12, 2023

Don't build partition info for local hypertables #6065

Merged

svenklemm added this to the TimescaleDB 2.12 milestone Sep 12, 2023

jnidzwetzki force-pushed the partitionwise_aggregate branch 2 times, most recently from 54e57ce to d77f95a Compare September 12, 2023 13:47

erimatnor approved these changes Sep 13, 2023

jnidzwetzki force-pushed the partitionwise_aggregate branch 2 times, most recently from cf24afe to 2608e04 Compare September 14, 2023 06:32

jnidzwetzki enabled auto-merge (rebase) September 14, 2023 06:32

jnidzwetzki force-pushed the partitionwise_aggregate branch from 2608e04 to 844412c Compare September 14, 2023 06:53

jnidzwetzki merged commit ba9b818 into timescale:main Sep 14, 2023
34 checks passed

svenklemm mentioned this pull request Sep 20, 2023

Release 2.12.0 #6086

Merged


erimatnor left review comments Sep 13, 2023

jnidzwetzki commented Sep 14, 2023

alexanderlaw commented Sep 14, 2023

	Ensure(subpaths != NIL, "Unable to determine aggregation type");
	Ensure(subpaths != NIL, "unable to determine aggregation type");
