
Create compressed chunks at insert time #5849

Closed
antekresic wants to merge 1 commit into main from create-compressed-chunks

Conversation

antekresic
Contributor

To avoid lock contention during the compression operation, we move compressed chunk creation to insert time, together with uncompressed chunk creation. Compressed chunks now live as long as the uncompressed chunk is alive: we don't remove them during decompression but rather truncate them. This moves lock contention over the compressed hypertable to coincide with lock contention over the uncompressed hypertable.

@sb230132
Contributor

sb230132 commented Jul 3, 2023

Execution time of INSERT and compress_chunk() is provided below:

On create-compressed-chunks branch:

test=# \timing
Timing is on.
test=# SELECT '2022-10-10 14:33:44.1234+05:30' as start_date \gset
Time: 1.659 ms
test=# INSERT INTO metric_5m (time, series_id, value)
test-#     SELECT t, s,1 from generate_series(:'start_date'::timestamptz, :'start_date'::timestamptz + interval '1 day', '10s') t cross join generate_series(1,10, 1) s;
INSERT 0 86410
Time: 1195.149 ms (00:01.195)
test=# SELECT count(compress_chunk(c.schema_name|| '.' || c.table_name))
test-#     FROM _timescaledb_catalog.chunk c, _timescaledb_catalog.hypertable ht where
test-#     c.hypertable_id = ht.id and ht.table_name = 'metric_5m' and c.status & 1 = 0;
 count
-------
   289
(1 row)

Time: 453.254 ms

On main branch:

test=# \timing
Timing is on.
test=# SELECT '2022-10-10 14:33:44.1234+05:30' as start_date \gset
Time: 1.091 ms
test=# INSERT INTO metric_5m (time, series_id, value)
test-#     SELECT t, s,1 from generate_series(:'start_date'::timestamptz, :'start_date'::timestamptz + interval '1 day', '10s') t cross join generate_series(1,10, 1) s;
INSERT 0 86410
Time: 520.860 ms
test=# SELECT count(compress_chunk(c.schema_name|| '.' || c.table_name))
test-#     FROM _timescaledb_catalog.chunk c, _timescaledb_catalog.hypertable ht where
test-#     c.hypertable_id = ht.id and ht.table_name = 'metric_5m' and c.status & 1 = 0;
 count
-------
   289
(1 row)

Time: 940.169 ms

It is clearly visible that the locking contention has shifted to INSERT.

@sb230132 sb230132 force-pushed the create-compressed-chunks branch 2 times, most recently from 3f50b5e to da6e542 on July 4, 2023 04:11
@codecov

codecov bot commented Jul 4, 2023

Codecov Report

Merging #5849 (da6e542) into main (e66a400) will decrease coverage by 1.00%.
The diff coverage is 84.00%.

❗ Current head da6e542 differs from pull request most recent head 781b39f. Consider uploading reports for the commit 781b39f to get more accurate results

@@            Coverage Diff             @@
##             main    #5849      +/-   ##
==========================================
- Coverage   81.37%   80.37%   -1.00%     
==========================================
  Files         243      239       -4     
  Lines       55941    48782    -7159     
  Branches    12385    12210     -175     
==========================================
- Hits        45520    39208    -6312     
+ Misses       8095     4155    -3940     
- Partials     2326     5419    +3093     
Files Changed Coverage Δ
tsl/src/planner.c 91.42% <50.00%> (+8.38%) ⬆️
src/chunk.c 83.99% <79.41%> (-5.86%) ⬇️
tsl/src/compression/api.c 85.87% <100.00%> (-5.08%) ⬇️
tsl/src/compression/compression.c 83.98% <100.00%> (-7.53%) ⬇️
tsl/src/compression/create.c 86.48% <100.00%> (-7.00%) ⬇️

... and 227 files with indirect coverage changes


@antekresic antekresic force-pushed the create-compressed-chunks branch 10 times, most recently from 4f33a4f to 862f183 on August 24, 2023 13:39
@antekresic antekresic marked this pull request as ready for review August 24, 2023 13:59
@github-actions github-actions bot requested review from erimatnor and mkindahl August 24, 2023 13:59
@github-actions

@erimatnor, @mkindahl: please review this pull request.


Contributor

@sb230132 sb230132 left a comment

The title of the PR is not completely accurate. It should be:
Create compressed chunks when enabling compression or when creating new chunks during INSERTs.

src/chunk.c Outdated
Comment on lines 1516 to 1585
/* If we have a compressed hypertable set, lock up the compressed hypertable
* so we can create the compressed chunk too
*/
if (ht->fd.compressed_hypertable_id != 0)
{
compress_ht = ts_hypertable_get_by_id(ht->fd.compressed_hypertable_id);
Assert(compress_ht);
LockRelationOid(compress_ht->main_table_relid, ShareUpdateExclusiveLock);
}
Contributor

If I am not wrong, this code is hit during INSERT. If so, by this time we would have already created compressed chunks if compression is enabled on the hypertable.
Why do we need this code here?

Contributor

Looking further down the call chain, I get the following (chunk_create_from_point_after_lock@1574 -> chunk_create_from_hypercube_after_lock@1402):

(gdb) l
1216             * to reduce locking contention since we already use heavy locks to attach
1217             * chunks to the hypertable.
1218             */
1219            if (ht->fd.compressed_hypertable_id != 0)
1220            {
1221                    chunk->fd.compressed_chunk_id =
1222                            create_compressed_chunk(ht->fd.compressed_hypertable_id, schema_name);
1223            }
1224
1225            chunk_add_constraints(chunk);
(gdb) p chunk->fd.compressed_chunk_id 
$6 = 0

So it seems like it is not set.

Contributor Author

At this point, we are creating a new chunk so we need to create the compressed chunk for it as well since it did not exist when enabling compression on the hypertable.

Contributor

Hmm, I guess this is for new chunks that didn't exist when compression was enabled.

I do wonder, however, if we can delay grabbing the ShareUpdateExclusiveLock on the compressed hypertable until we know we actually needed the lock on the main hypertable. See check below for concurrent creation by other backend. In other words, if someone else created the chunk before we grabbed the lock, we need not take the lock on the compressed hypertable at all.
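
A minimal sketch of that reordering, under the assumption that the concurrent-creation check can be moved ahead of the lock; the chunk_find_existing helper and the exact argument lists are hypothetical, not code from this PR:

/* Re-check whether another backend already created the chunk; only lock
 * the compressed hypertable if we actually have to create one ourselves. */
chunk = chunk_find_existing(ht, point); /* hypothetical re-check helper */
if (chunk == NULL)
{
	if (ht->fd.compressed_hypertable_id != 0)
	{
		compress_ht = ts_hypertable_get_by_id(ht->fd.compressed_hypertable_id);
		Assert(compress_ht);
		LockRelationOid(compress_ht->main_table_relid, ShareUpdateExclusiveLock);
	}
	chunk = chunk_create_from_point_after_lock(ht, point, schema_name, prefix);
}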

src/chunk.c Outdated
NULL,
chunk_id);

compressed_chunk->constraints = ts_chunk_constraints_alloc(1, CurrentMemoryContext);
Contributor

Are you missing indexes on the compressed chunk here? Where are the indexes created on compressed chunks?

@@ -958,7 +958,7 @@ SET client_min_messages TO NOTICE;
SELECT status FROM _timescaledb_catalog.chunk where table_name = :'new_uncompressed_chunk_name';
status
--------
3
-1
Contributor

What is -1? When can the status be -1?

('2021-06-18 00:00:00-00', 'London', 27,4),
('2021-06-19 00:00:00-00', 'Moscow', 28,4);
ALTER TABLE conditions SET (timescaledb.compress);
SELECT * FROM _timescaledb_catalog.chunk;
Contributor

Please add ORDER BY 1, 2;
Sometimes the order of rows changes, which can cause the test to fail randomly.

1 | 1 | _timescaledb_internal | _hyper_1_1_chunk | | f | 0 | f
(4 rows)

SELECT _timescaledb_functions.create_compressed_chunks_for_hypertable('conditions');
Contributor

IMO this function should be internal and not exposed to customers.

@mkindahl
Contributor

Here is the lock graph when entering create_compressed_chunk (the unnamed relation is the new chunk).

[Image: mats_lock_graph.gv — lock graph]

Contributor

@mkindahl mkindahl left a comment

We should optimize for INSERT and SELECT since these are the cases that users are interested in. If these operations have unnecessary blocking, it will be a problem for normal operations. The trade-off in this case is that compression is instead blocking more than necessary, so this pull request seems to make a reasonable trade-off.

I have a first batch of comments here, if you want to handle those, but I am going to look more at the code.

src/chunk.c Outdated
@@ -1210,13 +1212,62 @@ chunk_create_from_hypercube_after_lock(const Hypertable *ht, Hypercube *cube,
prefix,
get_next_chunk_id());

/* If we have compressoin enabled, we create the compressed chunk here
Contributor

Suggested change
/* If we have compressoin enabled, we create the compressed chunk here
/* If we have compression enabled, we create the compressed chunk here

src/chunk.c Outdated
Comment on lines 1237 to 1250
NameData *compress_chunk_table_name = palloc0(sizeof(*compress_chunk_table_name));

int namelen = snprintf(NameStr(*compress_chunk_table_name),
NAMEDATALEN,
"compress%s_%d_chunk",
NameStr(compress_ht->fd.associated_table_prefix),
chunk_id);

if (namelen >= NAMEDATALEN)
ereport(ERROR,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("invalid name \"%s\" for compressed chunk",
NameStr(*compress_chunk_table_name)),
errdetail("The associated table prefix is too long.")));
Contributor

Nit: Magic constant and duplicated code here. You might want to factor out the code to create the chunk name as well so that you do not duplicate the code here and in create.c.
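A sketch of the suggested factoring, assuming a shared helper (the function name is illustrative) that both chunk.c and create.c could call instead of duplicating the snprintf and length check:

static void
compressed_chunk_table_name(const NameData *prefix, int32 chunk_id, NameData *name_out)
{
	/* NAMEDATALEN is the magic constant mentioned above */
	int namelen = snprintf(NameStr(*name_out),
						   NAMEDATALEN,
						   "compress%s_%d_chunk",
						   NameStr(*prefix),
						   chunk_id);

	if (namelen >= NAMEDATALEN)
		ereport(ERROR,
				(errcode(ERRCODE_INTERNAL_ERROR),
				 errmsg("invalid name \"%s\" for compressed chunk", NameStr(*name_out)),
				 errdetail("The associated table prefix is too long.")));
}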

src/chunk.c Outdated
chunk_add_constraints(chunk);
chunk_insert_into_metadata_after_lock(chunk);
chunk_create_table_constraints(ht, chunk);

return chunk;
}

static int32
create_compressed_chunk(int32 compressed_hypertable_id, const char *schema_name)
Contributor

You have a create_compress_chunk as well. Is there any overlap there? It is a little confusing to have two very similar functions.

Comment on lines 35 to 42
CREATE OR REPLACE FUNCTION _timescaledb_functions.create_compressed_chunks_for_hypertable(
hypertable REGCLASS
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;

Contributor

I wonder if you want to expose this to the user. If you want to make it available for testing purposes, you can add it to the debug_utils.sql file. Then you do not need to bother about handling upgrades or downgrades.

Comment on lines 132 to 134
CREATE OR REPLACE FUNCTION _timescaledb_functions.create_compressed_chunks_for_hypertable(
hypertable REGCLASS
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;
Contributor

@mkindahl mkindahl Aug 28, 2023

As noted above, if we make this available to users, we need to handle upgrades and downgrades, and the function at this point does not seem to be very useful for a user.


Comment on lines +481 to +484
if (!compress_ht_chunk)
{
elog(ERROR, "cannot compress chunk, missing compressed chunk table");
}
Contributor

Suggested change
if (!compress_ht_chunk)
{
elog(ERROR, "cannot compress chunk, missing compressed chunk table");
}
Ensure(compress_ht_chunk, "cannot compress chunk, missing compressed chunk table");

src/chunk.c Outdated
Comment on lines 1263 to 1265
chunk_add_constraints(compressed_chunk);
chunk_insert_into_metadata_after_lock(compressed_chunk);
chunk_create_table_constraints(compress_ht, compressed_chunk);
Contributor

These seem to be done both inside create_compressed_chunk and at one of the call sites. Do you need both? Is there a risk of constraints being added twice or metadata being added twice for the same chunk?

Contributor

@erimatnor erimatnor left a comment

Mostly nits and a few things that need to be fixed.

Overall, however, the changes look good.

I do wonder how we deal with upgrades. In other words, what happens for people who upgrade their database and have compression already enabled, but don't have compressed chunks for some of their currently uncompressed chunks?

@@ -32,6 +32,10 @@ CREATE OR REPLACE FUNCTION _timescaledb_internal.create_compressed_chunk(
numrows_post_compression BIGINT
) RETURNS REGCLASS AS '@MODULE_PATHNAME@', 'ts_create_compressed_chunk' LANGUAGE C STRICT VOLATILE;

CREATE OR REPLACE FUNCTION _timescaledb_functions.create_compressed_chunks_for_hypertable(
hypertable REGCLASS
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;
Contributor

Suggested change
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;
) RETURNS BOOL AS '@MODULE_PATHNAME@', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;

Contributor

Why do we need to expose this as a SQL function? I don't see it ever being called in the PR, although I might have missed it.

Is it for update scripts?

@@ -129,3 +129,6 @@ ALTER FUNCTION _timescaledb_internal.get_chunk_colstats(regclass) SET SCHEMA _ti

UPDATE _timescaledb_catalog.hypertable SET chunk_sizing_func_schema = '_timescaledb_functions' WHERE chunk_sizing_func_schema = '_timescaledb_internal' AND chunk_sizing_func_name = 'calculate_chunk_interval';

CREATE OR REPLACE FUNCTION _timescaledb_functions.create_compressed_chunks_for_hypertable(
hypertable REGCLASS
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;
Contributor

Suggested change
) RETURNS BOOL AS '$libdir/timescaledb-2.12.0-dev', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;
) RETURNS BOOL AS '@MODULE_PATHNAME@', 'ts_create_compressed_chunks_for_hypertable' LANGUAGE C STRICT VOLATILE;


ts_cache_release(hcache);

PG_RETURN_BOOL(true);
Contributor

Maybe this function should return VOID, given that it now always returns TRUE and I don't see what FALSE would mean. An elog error is raised in case of failure.

If we'd like a return value, then maybe we should return either a count of the number of chunks created, or a list of chunks created (OIDs or similar).

ts_hypertable_permissions_check(table_relid, GetUserId());
hypertable = ts_hypertable_cache_get_cache_and_entry(table_relid, CACHE_FLAG_NONE, &hcache);

if (!ts_hypertable_has_compression_table(hypertable))
Contributor

This seems to better match the error, although I don't know what the difference is or why we have two checks.

Suggested change
if (!ts_hypertable_has_compression_table(hypertable))
if (!TS_HYPERTABLE_HAS_COMPRESSION_ENABLED(hypertable))

Contributor

The first one checks if there is an actual compression table created; the second one just checks the compression state. In this case I think that using TS_HYPERTABLE_HAS_COMPRESSION_ENABLED is sufficient, but if you want to decide if the compressed table should be dropped (which it seems the other cases are for), it makes sense to use ts_hypertable_has_compression_table.

* as part of enabling compression
*/
ListCell *lc;
List *chunk_id_list = ts_chunk_get_chunk_ids_by_hypertable_id(hypertable->fd.id);
Contributor

We should probably try to understand the behavior here when you have, e.g., thousands of chunks. Sometimes we test only on a small number of chunks or metadata, but the real-world behavior is a problem when facing a lot of data (or chunks). We've seen users with thousands of chunks.

It might not be a problem at all, but it would be good to make sure.

We should check:

  • Is it slow?
  • Does it use a lot of memory? E.g., is the memory usage in this loop proportional to the number of chunks? Is that a problem with many chunks?


src/chunk.c Outdated
Comment on lines 1245 to 1250
if (namelen >= NAMEDATALEN)
ereport(ERROR,
(errcode(ERRCODE_INTERNAL_ERROR),
errmsg("invalid name \"%s\" for compressed chunk",
NameStr(*compress_chunk_table_name)),
errdetail("The associated table prefix is too long.")));
Contributor

Not clear why we have this error. First, I don't think that we will ever hit it. Second, since this is not a user-defined name, we should generate a name that is not too long. Truncating the name could be OK, as long as it is unique.

To summarize, I don't think we should ever throw an error because we cannot pick a name for the new chunk. We care only about not conflicting with existing names.

IMO, this should work like PostgreSQL's ChooseRelationName(), which picks a unique name without error. In fact, why don't we use that function?
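
For reference, a sketch of what using PostgreSQL's ChooseRelationName() could look like here; the compressed_schema_oid variable is assumed to be in scope, and ChooseRelationName truncates and uniquifies rather than erroring:

char *relname = ChooseRelationName(NameStr(compress_ht->fd.associated_table_prefix),
								   NULL,	/* optional second name part */
								   "chunk", /* label appended to the name */
								   compressed_schema_oid,
								   false);	/* not a constraint name */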

Comment on lines +3606 to +3674
CHUNK_STATUS_COMPRESSED | CHUNK_STATUS_COMPRESSED_UNORDERED |
CHUNK_STATUS_COMPRESSED_PARTIAL);
Contributor

Maybe we should just make a CHUNK_STATUS_ALL to cover all flags? E.g., 0xFFFFFFFF. Otherwise we need to update the "clear all" function every time we add a new flag.
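
A sketch of the suggested catch-all mask (the flag name is the reviewer's example; the clearing line is illustrative, not code from the PR):

#define CHUNK_STATUS_ALL 0xFFFFFFFF

/* The "clear all" path then stays correct when new flags are added: */
chunk->fd.status &= ~CHUNK_STATUS_ALL; /* clears every status flag */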

Contributor Author

I would leave this out of scope for this PR; there is potential to remove one of these statuses, so when we get to that, we can deal with this.


Contributor

@mkindahl mkindahl left a comment

A few more comments, most of them minor, and one request.

Since this is a change that can have a very different performance impact for different users, it makes sense to add a flag allowing the feature to be disabled (and enabled by default). This will allow us to handle differences between users by disabling the feature if necessary.

This was very convenient in #5902, which caused a performance regression in a rare case.
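
A sketch of such an off-switch as a GUC; the variable and GUC names are hypothetical, not from this PR:

static bool ts_guc_enable_compressed_chunk_creation = true;

DefineCustomBoolVariable("timescaledb.enable_compressed_chunk_creation",
						 "Enable creating compressed chunks at insert time",
						 NULL,
						 &ts_guc_enable_compressed_chunk_creation,
						 true,
						 PGC_USERSET,
						 0,
						 NULL,
						 NULL,
						 NULL);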


@@ -1443,6 +1443,7 @@ decompress_chunk(Oid in_table, Oid out_table)
FreeBulkInsertState(decompressor.bistate);
MemoryContextDelete(decompressor.per_compressed_row_ctx);
ts_catalog_close_indexes(decompressor.indexstate);
truncate_relation(in_table);
Contributor

This looks like a bug, or is it a result of changes in this PR? If it was a bug, you might want to have this as a separate PR so that it ends up in the changelog and is visible to users.

Contributor Author

It's not a bug; previously, decompressing a chunk would drop it at a later point. Now we truncate it instead of dropping it. I could move this to the place where we used to drop it, just for clarity.

Contributor

No reason to optimize the code for the reviewer, you can keep it where it is. It just looked like it was a bug.

Comment on lines 1045 to 1051
/* Need to drop the old compressed hypertable in case the segment by columns
* changed (and thus the column types of compressed hypertable need to change)
*
* We need to cascade the delete since chunks are now not removed during
* decompression.
* */
ts_hypertable_drop(compressed, DROP_CASCADE);
Contributor

... but this will cascade to other objects (potentially user-defined) that are dependent on the compressed table. A better option would be to collect all the objects you want to delete and use performMultipleDeletions, which will allow you to delete multiple objects with potential dependencies in one go.

It also avoids the warning that is currently printed, which is only going to be confusing to users.

Check chunk_collect_objects_for_deletion for an approach that avoids having to update the list of dependent objects all the time.
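
A minimal sketch of that approach, assuming the dependent objects have been collected first (the object list contents here are illustrative):

ObjectAddresses *objects = new_object_addresses();
ObjectAddress addr = {
	.classId = RelationRelationId,
	.objectId = compressed->main_table_relid,
	.objectSubId = 0,
};

add_exact_object_address(&addr, objects);
/* ... add the compressed chunks and any other dependent objects ... */
performMultipleDeletions(objects, DROP_RESTRICT, 0);
free_object_addresses(objects);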

Comment on lines 194 to 195
permutation "s1_compress_one" "s2_compress_one" "s2_commit" "s1_commit"
permutation "s2_compress_one" "s1_compress_one" "s1_commit" "s2_commit"
Contributor

@mkindahl mkindahl Aug 28, 2023

Since the sessions are symmetric, I think it is more meaningful to test these two:

Suggested change
permutation "s1_compress_one" "s2_compress_one" "s2_commit" "s1_commit"
permutation "s2_compress_one" "s1_compress_one" "s1_commit" "s2_commit"
permutation "s1_compress_one" "s2_compress_one" "s2_commit" "s1_commit"
permutation "s2_compress_one" "s1_compress_one" "s2_commit" "s1_commit"

SELECT _timescaledb_internal.test_merge_chunks_on_dimension('_timescaledb_internal._hyper_1_1_chunk','_timescaledb_internal._hyper_3_7_chunk', 1);
SELECT _timescaledb_internal.test_merge_chunks_on_dimension('_timescaledb_internal._hyper_1_1_chunk','_timescaledb_internal._hyper_3_13_chunk', 1);
Contributor

It seems odd that these need to be updated each time we change anything in the code. Can we make this generic?

Contributor Author

I don't think it's every time we change anything in the code; we have to use multiple specific chunks to test this functionality, and generalizing them would be more work than it's actually worth.

Changes here are infrequent enough to absorb the eventual cost of patching the test with new chunk names.

Contributor

This test only seems to verify that create_compressed_chunks_for_hypertable works, which makes sense if we add this to the API. If it is only an internal function, then there is no need to add a test for this particular function.

Is the intention that this should be a user-visible function?

Contributor Author

Updating to add upgrade/downgrade paths which will use this function.

c.hypertable_id = ht.id and ht.table_name = 'target' and c.compressed_chunk_id IS NULL;
c.hypertable_id = ht.id and ht.table_name = 'target' and c.status & 1 = 0;
Contributor

Since the concept of "is a chunk that can be compressed" has changed (which seems to be the reason why you changed all these expressions), it might make sense to add a function for this purpose. That would avoid having to change these expressions each time we change the notion of what it means for a chunk to be "compressible".

Contributor Author

That check is redundant in most cases anyway, like here where it's all new uncompressed chunks. I'd rather opt for removing it altogether.

@antekresic antekresic force-pushed the create-compressed-chunks branch 6 times, most recently from e806996 to e35d68a on September 4, 2023 07:13
To avoid lock contention during compression operation, we are moving
compressed chunk creation together with uncompressed chunk creation
during insert time. Now compressed chunks live while the uncompressed
chunk is alive, we don't remove them during decompression but rather
truncate them. This moves lock contention over compressed hypertable
to coincide with lock contention over uncompressed hypertable.
@antekresic antekresic force-pushed the create-compressed-chunks branch from 250d59c to 781b39f on September 4, 2023 10:41
@@ -1486,6 +1608,9 @@ ts_chunk_create_for_point(const Hypertable *ht, const Point *p, bool *found, con
return chunk;
}

/* Lock compressed hypertable before resurrecting */
if (compress_ht)
Member

Using compress_ to mean compressed_ is slightly confusing, let's use compressed_ everywhere?

@antekresic
Contributor Author

This one is going to be superseded by the per-chunk compression settings change that will be implemented in the near future.

@antekresic antekresic closed this Dec 18, 2023
@antekresic antekresic deleted the create-compressed-chunks branch December 18, 2023 10:14