Collect relation-level stats during compression #7520

erimatnor · 2024-12-05T16:54:22Z

During compression, column min/max stats are collected on a per-segment basis for orderby columns and for columns that have indexes.

This change uses the same mechanism to collect relation-level min/max stats for use by chunk skipping. In the worst case, this avoids an extra full table scan to gather these chunk column stats.

For simplicity, stats gathering is enabled for all columns that can support it, even though a column might use neither segment-level stats nor relation-level (chunk column) stats. The overhead of collecting min/max values should be negligible.
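To illustrate the idea, here is a minimal Python sketch of collecting per-segment and relation-level min/max in the same pass over a column's values. It is not the code from this PR (which is written in C); the names `MinMax` and `collect_stats`, the fixed `segment_size`, and the treatment of `None` as NULL are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List, Optional, Tuple


@dataclass
class MinMax:
    """Running min/max tracker; None (NULL) values are ignored."""
    lo: Optional[Any] = None
    hi: Optional[Any] = None

    def update(self, value: Any) -> None:
        if value is None:
            return
        if self.lo is None or value < self.lo:
            self.lo = value
        if self.hi is None or value > self.hi:
            self.hi = value

    def merge(self, other: "MinMax") -> None:
        # Fold another tracker's range into this one.
        self.update(other.lo)
        self.update(other.hi)


def collect_stats(values: Iterable[Any], segment_size: int) -> Tuple[List[MinMax], MinMax]:
    """Return (per-segment stats, relation-level stats) from one pass."""
    relation = MinMax()
    segments: List[MinMax] = []
    current = MinMax()
    count = 0
    for value in values:
        current.update(value)
        count += 1
        if count == segment_size:
            # Segment is full: record it and fold its range into the
            # relation-level range, so no second scan is needed.
            segments.append(current)
            relation.merge(current)
            current = MinMax()
            count = 0
    if count > 0:
        # Flush the final, partially filled segment.
        segments.append(current)
        relation.merge(current)
    return segments, relation
```

Each value costs only a pair of comparisons, and the relation-level range is derived from the segment ranges already being tracked, which is consistent with the expectation that the overhead is negligible.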

Disable-check: force-changelog-file


codecov · 2024-12-05T17:20:27Z

Codecov Report

Attention: Patch coverage is 89.36170% with 10 lines in your changes missing coverage. Please review.

Project coverage is 82.17%. Comparing base (59f50f2) to head (25d61ff).
Report is 641 commits behind head on main.

Files with missing lines	Patch %	Lines
src/ts_catalog/chunk_column_stats.c	66.66%	1 Missing and 3 partials ⚠️
tsl/src/compression/compression.c	87.50%	0 Missing and 4 partials ⚠️
tsl/src/compression/api.c	75.00%	0 Missing and 1 partial ⚠️
tsl/src/compression/segment_meta.c	97.82%	0 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #7520      +/-   ##
==========================================
+ Coverage   80.06%   82.17%   +2.11%     
==========================================
  Files         190      230      +40     
  Lines       37181    43183    +6002     
  Branches     9450    10854    +1404     
==========================================
+ Hits        29770    35487    +5717     
- Misses       2997     3369     +372     
+ Partials     4414     4327      -87


erimatnor force-pushed the compression-minmax-columnstats branch from 0200340 to 1111d62 Compare December 5, 2024 16:54

erimatnor added compression chunk-skipping labels Dec 5, 2024

erimatnor force-pushed the compression-minmax-columnstats branch 3 times, most recently from d31700e to 6e50217 Compare December 5, 2024 16:59

erimatnor force-pushed the compression-minmax-columnstats branch from 6e50217 to 25d61ff Compare December 5, 2024 17:01

fabriziomello assigned erimatnor Dec 6, 2024
