-
Notifications
You must be signed in to change notification settings - Fork 896
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vectorize text equality and LIKE #6189
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #6189 +/- ##
==========================================
+ Coverage 80.06% 80.95% +0.88%
==========================================
Files 190 193 +3
Lines 37181 36664 -517
Branches 9450 9583 +133
==========================================
- Hits 29770 29680 -90
- Misses 2997 3168 +171
+ Partials 4414 3816 -598 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -47,7 +47,7 @@ vector_array_predicate_impl(VectorPredicate *vector_const_predicate, bool is_or, | |||
char typalign; | |||
get_typlenbyvalalign(ARR_ELEMTYPE(arr), &typlen, &typbyval, &typalign); | |||
|
|||
const char *array_data = (const char *) ARR_DATA_PTR(arr); | |||
const char *restrict array_data = (const char *) ARR_DATA_PTR(arr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment on which compiler optimizations we are getting from the restrict
type qualifier?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just add restrict
to every piece of data that is used in hot loops and lives on the heap. The general idea is to tell the compiler that they point to different chunks of memory, so that more efficient code can be generated. It's important for the data you write, otherwise technically everything has to be re-read for a new loop iteration, because it might have been modified by a write. I think it's not really important for read-only variables, but I find it simpler to stick it on everything to avoid the need of deeper analysis which can be error-prone. So this is the general idea, I think spelling this out in the code comments each time would be too verbose and not very useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's not really important for read-only variables, but I find it simpler to stick it on everything to avoid the need of deeper analysis which can be error-prone.
Maybe I should use const
for read-only objects and restrict
for read/write.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static ArrowArray * | ||
make_single_value_arrow(Oid pgtype, Datum datum, bool isnull) | ||
make_single_value_arrow_arithmetic(Oid arithmetic_type, Datum datum, bool isnull) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment about the high-level difference between make_single_value_arrow_arithmetic
and make_single_value_arrow_text
? It might help understand the code better without comparing the details for the functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are different just because one creates arrow array for arithmetic types, and the other for text types, and the layouts of these arrays are different. I can write "this function creates arrow array for arithmetic types", but this is already clear from the name, so tell me if you have better ideas for a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if you add a comment and just mention that the array layouts are different would be enough for the reader to know at this point.
{ \ | ||
(p)++; \ | ||
(plen)--; \ | ||
} while ((plen) > 0 && (*(p) &0xC0) == 0x80) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a comment on why the *(p) &0xC0) == 0x80
part is needed and what the 0x80
constant represents?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This follows the PG code as well, I'll add more details to the comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's just skipping the UTF8-encoded characters, and this check is checking for the UTF8 prefix. I try to avoid adding comments to the copied PG code, because it makes comparing with the original more difficult.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is true. I compared it with the PG code and my function looked a bit different. So, I was not sure if you modified it or if this is upstream code from another version. If we have a comment about the PG version of the function, this should be fine.
This PR adds vectorized computation of text equality for deterministic collations, and case-sensitive LIKE for UTF-8 database encoding.
Benchmarking shows up to 3x speedups: https://grafana.ops.savannah-dev.timescale.com/d/fasYic_4z/compare-akuzm?orgId=1&var-branch=All&var-run1=3171&var-run2=3178&var-threshold=0&var-use_historical_thresholds=true&var-threshold_expression=2.5%20%2A%20percentile_cont%280.90%29&var-exact_suite_version=false
Prerequisites:
Disable-check: force-changelog-file