Implement vectorized filters #5915

akuzm · 2023-08-01T09:51:57Z

Start with supporting "vector ? const" predicates for several arithmetic type pairs.

Disable-check: force-changelog-file
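
As a rough illustration of what a "vector ? const" predicate looks like, here is a minimal sketch for one type pair (an int32 column compared with an int64 constant using `>`). The function name, the 64-bit result bitmap layout, and the convention of ANDing into an existing result are assumptions made for this example, not the code from this PR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch of a "vector > const" predicate for an int32 column
 * and an int64 constant. Row i passes when bit (i % 64) of result[i / 64]
 * is still set afterwards. The result is ANDed in place, so several
 * predicates can be applied to the same bitmap one after another.
 */
void
predicate_int32_gt_int64_const(const int32_t *values, size_t n_rows,
							   int64_t constant, uint64_t *result)
{
	assert(values != NULL && result != NULL);

	const size_t n_words = (n_rows + 63) / 64;

	for (size_t word = 0; word < n_words; word++)
	{
		const size_t start = word * 64;
		const size_t end = (start + 64 < n_rows) ? start + 64 : n_rows;
		uint64_t word_result = 0;

		/* Branch-free inner loop that a compiler may auto-vectorize. */
		for (size_t row = start; row < end; row++)
		{
			const uint64_t pass = (int64_t) values[row] > constant;
			word_result |= pass << (row - start);
		}

		/* Bits past n_rows stay zero in word_result and are cleared here. */
		result[word] &= word_result;
	}
}
```

A caller would start from an all-ones bitmap and run one such function per qualifying filter; generating the same loop for every supported type pair and comparison operator is what makes the naming of the "arithmetic type pair" files plausible.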

github-actions · 2023-08-01T11:28:03Z

@erimatnor, @gayyappan: please review this pull request.

Powered by pull-review

tsl/src/nodes/decompress_chunk/pred_vector_const_numeric_all.c

svenklemm · 2023-08-15T11:28:26Z

Since this is all about integer expressions, the numeric naming might be a bit misleading since there is also a numeric datatype for which none of this applies.

akuzm · 2023-08-15T11:53:48Z

> Since this is all about integer expressions, the numeric naming might be a bit misleading since there is also a numeric datatype for which none of this applies.

Floats as well. What options do we have, "numerical" or maybe "arithmetic"? The postgres numeric probably won't be vectorized because its data structure is not practical (varlena header, another header, two representations, several special values, and it's stored as decimal digits). In other computation engines they use fixed-point, fixed-width decimals stored as binary, which are easily vectorized.

svenklemm · 2023-08-23T11:00:06Z

I'd like to see some more tests (even if we don't initially support them):

- queries without aggregation
- constraints on columns not selected
- constraints on multiple columns
- ORed constraints
- ANDed constraints

jnidzwetzki · 2023-08-30T08:46:56Z

@akuzm Do we have some benchmark data for this PR?

konskov · 2023-08-30T09:09:32Z

tsl/src/nodes/decompress_chunk/planner.c

+		if (OidIsValid(commutator_opno))
+		{
+			o->opno = commutator_opno;
+			o->opfuncid = InvalidOid;


Why does this need to be InvalidOid instead of get_opcode(commutator_opno)?

jnidzwetzki · 2023-09-01

I am also not sure about the opfuncid. However, in CommuteOpExpr the opfuncid is also set to InvalidOid.

akuzm · 2023-09-01

Right, I think there's no particular reason; it just repeats what CommuteOpExpr does. I added a comment about this.
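
For readers unfamiliar with the commutator step discussed here, the following is a toy model of the idea, presumably what the planner change does: a "const op column" qual is rewritten as "column commutator(op) const" so that it matches the "vector ? const" shape. All types and function names below are hypothetical and do not use PostgreSQL's OpExpr or catalog lookups. In PostgreSQL itself, an opfuncid of InvalidOid means "not looked up yet", and set_opfuncid() fills it in from the operator on demand, which is why either choice works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these types are illustrative, not PostgreSQL's OpExpr. */
typedef enum { OP_LT, OP_LE, OP_EQ, OP_GE, OP_GT, OP_NE } ToyOperator;

typedef enum { ARG_COLUMN, ARG_CONST } ToyArgKind;

typedef struct
{
	ToyOperator op;
	ToyArgKind left;
	ToyArgKind right;
} ToyPredicate;

/* "a op b" is equivalent to "b commutator(op) a". */
ToyOperator
toy_commutator(ToyOperator op)
{
	switch (op)
	{
		case OP_LT:
			return OP_GT;
		case OP_LE:
			return OP_GE;
		case OP_GE:
			return OP_LE;
		case OP_GT:
			return OP_LT;
		case OP_EQ:
		case OP_NE:
			return op; /* symmetric operators commute with themselves */
	}
	return op;
}

/*
 * Normalize a predicate to the "column op const" shape. Returns false when
 * the predicate has a different shape and is left for the regular executor.
 */
bool
toy_normalize_to_vector_const(ToyPredicate *pred)
{
	assert(pred != NULL);

	if (pred->left == ARG_COLUMN && pred->right == ARG_CONST)
		return true;

	if (pred->left == ARG_CONST && pred->right == ARG_COLUMN)
	{
		pred->op = toy_commutator(pred->op);
		pred->left = ARG_COLUMN;
		pred->right = ARG_CONST;
		return true;
	}

	return false;
}
```

With this model, `5 < col` becomes `col > 5`, while a comparison between two columns is rejected and left unvectorized.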

akuzm · 2023-08-30T13:58:14Z

> @akuzm Do we have some benchmark data for this PR?

I'll rerun it, because there were some changes in main and now the comparison looks weird.

codecov · 2023-08-30T14:56:17Z

Codecov Report

Merging #5915 (c066d3f) into main (bd9d09e) will increase coverage by 0.02%.
The diff coverage is 87.66%.

@@            Coverage Diff             @@
##             main    #5915      +/-   ##
==========================================
+ Coverage   81.34%   81.37%   +0.02%     
==========================================
  Files         243      246       +3     
  Lines       55971    56193     +222     
  Branches    12395    12457      +62     
==========================================
+ Hits        45532    45726     +194     
  Misses       8095     8095              
- Partials     2344     2372      +28

| Files Changed | Coverage Δ | |
|---|---|---|
| tsl/src/import/ts_explain.c | `78.57% <78.57%> (ø)` | |
| tsl/src/nodes/decompress_chunk/vector_predicates.c | `80.00% <80.00%> (ø)` | |
| tsl/src/nodes/decompress_chunk/compressed_batch.c | `89.96% <83.67%> (-2.61%)` | ⬇️ |
| tsl/src/nodes/decompress_chunk/exec.c | `94.16% <90.00%> (+0.25%)` | ⬆️ |
| tsl/src/nodes/decompress_chunk/planner.c | `87.46% <94.64%> (+1.74%)` | ⬆️ |
| ...mpress_chunk/pred_vector_const_arithmetic_single.c | `96.42% <96.42%> (ø)` | |
| src/guc.c | `96.66% <100.00%> (+0.03%)` | ⬆️ |
| tsl/src/nodes/decompress_chunk/batch_array.c | `92.98% <100.00%> (+0.12%)` | ⬆️ |

... and 41 files with indirect coverage changes

akuzm · 2023-08-30T19:42:15Z

> > @akuzm Do we have some benchmark data for this PR?
>
> I'll rerun it, because there were some changes in main and now the comparison looks weird.

https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=2683&var-run2=2684&var-threshold=0.05&var-use_historical_thresholds=false&chunkNotFound

There's a 10-70% speedup on some analytical queries. The huge regression seems to be a flaky parallel plan.

akuzm · 2023-08-31T16:36:10Z

> I'd like to see some more tests (even if we don't initially support them):
>
> - queries without aggregation
> - constraints on columns not selected
> - constraints on multiple columns
> - ORed constraints
> - ANDed constraints

Added some simple tests for these cases to decompress_vector_qual, although I think they are already mostly present in other files.

jnidzwetzki · 2023-09-01T12:22:25Z

tsl/src/nodes/decompress_chunk/pred_vector_const_arithmetic_all.c

+
+#include "pred_vector_const_arithmetic_type_pair.c"
+
+/* int4. functions. */


Nit:

Suggested change

-/* int4. functions. */
+/* int4... functions. */

akuzm · 2023-09-01

I removed all these semicolons.

jnidzwetzki · 2023-09-01T12:41:32Z

tsl/src/nodes/decompress_chunk/planner.c

+
+	OpExpr *o = castNode(OpExpr, qual);
+
+	if (list_length(o->args) != 2)


Could we have a test case for this case?

akuzm · 2023-09-01

Added a case with a unary operator.

jnidzwetzki

Added a few minor comments, but overall it looks good.

svenklemm · 2023-09-11T13:05:33Z

tsl/test/sql/decompress_vector_qual.sql

@@ -0,0 +1,68 @@
+-- This file and its contents are licensed under the Timescale License.


Can we have EXPLAIN output here? Currently it is not visible from the test output whether vectorized filters are actually used. Also, can we have some tests with NULL, e.g. col < NULL, col <> NULL, col IS NULL, ...

akuzm · 2023-09-11

EXPLAIN would require versioned references. To avoid this, I use the GUC debug_require_vector_qual to check whether the vectorized quals are used.

NullTest is not vectorized currently; I added cases for IS NULL and IS NOT NULL.

For null values in the column itself, I have tests with the metric4 column.

I'm not sure how to test col < NULL, because all the functions we vectorize are strict, and we don't vectorize the initplan parameters yet.
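
To make the point about strict functions concrete, here is a small sketch of how NULL handling could sit on top of a predicate result bitmap. The helper name, the 64-bit bitmap layout, and the separate validity bitmap (one bit per row, set when the value is not NULL) are assumptions for illustration and are not taken from this PR. A strict operator returns NULL whenever any argument is NULL, and a filter treats NULL as "row does not pass".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: apply SQL NULL semantics of a strict operator to a
 * predicate result bitmap.
 *
 * result        - one bit per row, set when the comparison itself passed
 * validity      - one bit per row, set when the column value is not NULL;
 *                 may be NULL when the column contains no NULL values
 * const_is_null - whether the constant argument of the predicate is NULL
 */
void
apply_strict_null_semantics(uint64_t *result, const uint64_t *validity,
							size_t n_words, bool const_is_null)
{
	assert(result != NULL);

	if (const_is_null)
	{
		/* "col < NULL" is NULL for every row, so no row passes the filter. */
		memset(result, 0, n_words * sizeof(uint64_t));
		return;
	}

	if (validity == NULL)
		return; /* no NULL values in the column, nothing to mask */

	/* A NULL column value also makes the strict operator return NULL. */
	for (size_t i = 0; i < n_words; i++)
		result[i] &= validity[i];
}
```

Under these assumptions a NULL constant never needs a vectorized comparison at all, since the whole batch can be rejected up front.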

konskov · 2023-09-12T12:13:35Z

src/guc.c

@@ -725,6 +736,22 @@ _guc_init(void)
 							   /* assign_hook= */ NULL,
 							   /* show_hook= */ NULL);

+	DefineCustomEnumVariable(/* name= */ "timescaledb.debug_require_vector_qual",
+							 /* short_desc= */
+							 "ensure that non-vectorized filters are used in DecompressChunk node",


Should this be vectorized instead of non-vectorized?

akuzm · 2023-09-12

It does both, depending on the setting. I'll make it say vectorized or non-vectorized.

github-actions bot assigned akuzm Aug 1, 2023

akuzm force-pushed the vectorized-filters branch from 182d44b to ca60216 Compare August 1, 2023 09:53

akuzm marked this pull request as ready for review August 1, 2023 11:27

github-actions bot requested review from erimatnor and gayyappan August 1, 2023 11:28

svenklemm reviewed Aug 15, 2023

konskov self-requested a review August 29, 2023 13:24

konskov reviewed Aug 30, 2023

jnidzwetzki reviewed Aug 30, 2023

akuzm mentioned this pull request Aug 31, 2023

experiments with vectorized filters #5810

Closed

5 tasks

jnidzwetzki reviewed Sep 1, 2023

jnidzwetzki approved these changes Sep 1, 2023

svenklemm reviewed Sep 11, 2023

svenklemm added this to the TimescaleDB 2.12 milestone Sep 12, 2023

konskov reviewed Sep 12, 2023

svenklemm approved these changes Sep 12, 2023

Implement vectorized filters

c066d3f

Start with supporting "vector ? const" predicates for several arithmetic type pairs.

akuzm force-pushed the vectorized-filters branch from 6ef942d to c066d3f Compare September 13, 2023 20:04

akuzm enabled auto-merge (rebase) September 13, 2023 20:06

akuzm merged commit 23b51c9 into timescale:main Sep 13, 2023
34 checks passed

akuzm deleted the vectorized-filters branch September 13, 2023 21:07
