Merge new find_with_marked intp perf v2 #7385

Merged

nicola-cab
Member

No description provided.

Contributor

@finnschiermer finnschiermer left a comment

Also faster on my machine, feel free to merge.

@@ -791,7 +790,11 @@ constexpr uint32_t inverse_width[65] = {

inline int first_field_marked(int width, uint64_t vector)
{
#if REALM_WINDOWS
int lz = (int)_tzcnt_u64(vector); // TODO: not clear if this is ok on all platforms
Member Author

@finnschiermer this is a temporary fix, just to keep the builders happy.
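
The TODO in the hunk above asks whether `_tzcnt_u64` is acceptable on all platforms. It is an x86-specific intrinsic, so one possible direction is a small wrapper that picks a trailing-zero count per compiler. The sketch below is only an illustration and not the code in this PR: the names `count_trailing_zeros64` and `first_field_marked_sketch` are hypothetical, and it assumes the marker bit is the most significant bit of each field, so that dividing the bit position by the width yields the field index.

```cpp
#include <cassert>
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>
#endif

// Hypothetical helper (not from the PR): number of trailing zero bits,
// defined to return 64 for a zero input on every platform.
inline int count_trailing_zeros64(std::uint64_t v)
{
    if (v == 0)
        return 64;
#if defined(_MSC_VER)
    // _BitScanForward64 is available on 64-bit MSVC targets.
    unsigned long index;
    _BitScanForward64(&index, v);
    return static_cast<int>(index);
#elif defined(__GNUC__) || defined(__clang__)
    // __builtin_ctzll is undefined for 0, which is handled above.
    return __builtin_ctzll(v);
#else
    // Plain fallback: shift until the lowest set bit is reached.
    int n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
#endif
}

// Hypothetical sketch: index of the first field whose marker bit is set.
// Assumes the marker is the most significant bit of each `width`-bit field.
inline int first_field_marked_sketch(int width, std::uint64_t vector)
{
    return count_trailing_zeros64(vector) / width;
}
```

With 8-bit fields, a marker at bit 23 belongs to field 2, so `first_field_marked_sketch(8, 0x800000)` yields 2. Where C++20 is available, `std::countr_zero` from `<bit>` could replace the whole wrapper.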

@nicola-cab
Member Author

There is a small bug to address in the object store for frozen objects, but we can merge this and I will fix it later. Performance-wise we are closer to master (worst path ~+50%, best path -20%), but we are not using a valid data set, nor are we implementing any heuristic.

@nicola-cab nicola-cab merged commit e84499d into nc/perf_work_v2_int_array Feb 27, 2024
4 of 17 checks passed
@nicola-cab nicola-cab deleted the fsa/experimental-find-first-optimization branch February 27, 2024 19:10
nicola-cab added a commit that referenced this pull request Mar 1, 2024
* idea: subword parallel search

* better subword search

* better naming

* new methods for reading unaligned word from array of bitfields

* perf work on array with find based on parallel values comparison

* major cleanup of bitfield scanning

* de-templatified bit field search

* more tests and code generalization

* more tests

* new iterator optimized for linear scan

* eliminated last use of templates in subword parallel search

* optimization of some subword search methods

* working EQ cmp with parallel subword check

* fix in all_fields_NE

* make populate handle negative values

* commented out bypass which disabled subword search

* fix in fix of populate()

* bugfix and direct methods for signed GT and GE

* fix for GT condition

* enabled array perf tests (outside debug mode)

* fixed inner search loop

* made some perf tests non concurrent and silenced warnings

* moved call to match() into inner loop in subword parallel search

* Perf v2, find_with_marked for packed integer arrays (#7385)

* made find_first_marked() branch free

* various optimizations of find_first_marked, best one selected

* for some reason this is much better

* no warnings

* made search method selection more explicit and clear

* bunch of fixes..

* restore subword loop

* fix object store tests + use subword cmp always (which is faster on my machine)

---------

Co-authored-by: Finn Schiermer Andersen <[email protected]>

* Perf work for array flex (still missing timestamps) (#7397)

* WIP perf work for array flex

* more small stuff, nothing important

* parallel subword for eq and neq

* move find parallel inside loop for eq and neq

* LT parallel subword cmp

* GT find for array flex

* Int equality as good as Packed

* code review

---------

Co-authored-by: Finn Schiermer Andersen <[email protected]>
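
The commit list above refers repeatedly to "subword parallel search", which compares several packed fields against a value in one 64-bit operation. As a rough illustration of the general idea only, the sketch below handles the fixed case of eight 8-bit fields and an equality test. It is not the Realm implementation (which supports arbitrary widths and other comparisons), and the names `mark_equal_bytes` and `find_first_equal_byte` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kLow7 = 0x7F7F7F7F7F7F7F7FULL; // low 7 bits of every byte
constexpr std::uint64_t kOnes = 0x0101010101010101ULL; // 0x01 in every byte

// Hypothetical helper: returns a word with 0x80 set in every byte position
// where `word` holds `value`, and 0 everywhere else.
inline std::uint64_t mark_equal_bytes(std::uint64_t word, std::uint8_t value)
{
    // Broadcast `value` to all bytes and XOR: matching bytes become zero.
    std::uint64_t x = word ^ (kOnes * value);
    // Per byte, bit 7 of t is set exactly when the byte of x is non-zero.
    // Adding kLow7 to the masked low bits cannot carry into the next byte.
    std::uint64_t t = ((x & kLow7) + kLow7) | x | kLow7;
    // Invert: bit 7 is now set exactly for the zero (matching) bytes.
    return ~t;
}

// Hypothetical helper: index (0 = least significant byte) of the first byte
// equal to `value`, or -1 if no byte matches.
inline int find_first_equal_byte(std::uint64_t word, std::uint8_t value)
{
    std::uint64_t marked = mark_equal_bytes(word, value);
    if (marked == 0)
        return -1;
    int index = 0;
    while ((marked & 0x80) == 0) {
        marked >>= 8;
        ++index;
    }
    return index;
}
```

The marked word is the kind of input a function like `first_field_marked` consumes: a trailing-zero count divided by the field width would replace the loop in `find_first_equal_byte`.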
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 28, 2024