Load/Store, masked set and counting operations #2430

mazimkhan
Contributor

Introduces:

  • Variants of load and store operations, including masked variants (MaskedLoadU, LoadHigher, StoreTruncated).
  • Counting functions that report information about the data in each lane of a vector (MaskedLeadingZeroCountOrZero, AllOnes, AllZeros).
  • Masked vector instantiation operations (SetOr, SetOrZero).
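
As a rough illustration of the counting operations, here is a scalar model in plain C++. It is not the Highway API: the `Model*` names, the fixed four-lane `Vec`/`Mask` types, and the argument order are all hypothetical. It also assumes that AllOnes/AllZeros test whether every bit of every lane is set or clear, and that a zero lane has a leading-zero count equal to the lane width.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for a vector and a mask; not Highway types.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::uint32_t, kLanes>;
using Mask = std::array<bool, kLanes>;

// Number of leading zero bits in x. Assumption: returns 32 when x == 0.
inline std::uint32_t LeadingZeros32(std::uint32_t x) {
  std::uint32_t n = 0;
  for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1) {
    ++n;
  }
  return n;
}

// Model of a masked leading-zero count with "OrZero" behaviour:
// lanes whose mask entry is false produce 0.
inline Vec ModelMaskedLeadingZeroCountOrZero(const Mask& m, const Vec& v) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    out[i] = m[i] ? LeadingZeros32(v[i]) : 0u;
  }
  return out;
}

// Model of AllOnes: true only if every bit of every lane is set (assumed).
inline bool ModelAllOnes(const Vec& v) {
  for (std::uint32_t x : v) {
    if (x != 0xFFFFFFFFu) return false;
  }
  return true;
}

// Model of AllZeros: true only if every bit of every lane is clear (assumed).
inline bool ModelAllZeros(const Vec& v) {
  for (std::uint32_t x : v) {
    if (x != 0u) return false;
  }
  return true;
}
```

The real operations work on vectors whose lane count depends on the target, so this model only conveys the per-lane semantics.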

"OrZero" operations return zero in lanes where the mask is false, whereas standard masked operations return the corresponding lane of a passed-in vector.
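
The difference between the two masking conventions can be sketched with a scalar model in plain C++. This is not the Highway API: `ModelMaskedSetOr`, `ModelMaskedSetOrZero`, the four-lane types, and the argument order are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar stand-ins for a vector and a mask; not Highway types.
constexpr std::size_t kLanes = 4;
using Vec = std::array<std::int32_t, kLanes>;
using Mask = std::array<bool, kLanes>;

// Standard masking: lanes where the mask is true receive `value`;
// lanes where it is false keep the corresponding lane of `no`.
inline Vec ModelMaskedSetOr(const Vec& no, const Mask& m, std::int32_t value) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    out[i] = m[i] ? value : no[i];
  }
  return out;
}

// "OrZero" masking: lanes where the mask is false are set to zero,
// so no fallback vector is needed.
inline Vec ModelMaskedSetOrZero(const Mask& m, std::int32_t value) {
  Vec out{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    out[i] = m[i] ? value : 0;
  }
  return out;
}
```

With `no = {10, 20, 30, 40}`, mask `{true, false, true, false}`, and value 7, the first function yields `{7, 20, 7, 40}` and the second yields `{7, 0, 7, 0}`.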

All introduced operations are implemented in generic_ops-inl.h, and additionally in arm_sve-inl.h where a target-specific version offers a performance gain. Tests cover all new operations.

google-cla bot commented Jan 6, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Resolved review threads (outdated):

  • g3doc/quick_reference.md (5 threads)
  • hwy/ops/arm_sve-inl.h (1 thread)
  • hwy/ops/generic_ops-inl.h (2 threads)

mazimkhan and others added 2 commits January 30, 2025 11:33
Rename SetOr* ops for consistency
Rename AllOnes/AllZeros to AllBits1/0
Remove MaskedLoadU, this is covered by MaskedLoad
Rename LowerHigher to InsertIntoUpper
Rework StoreTruncated, rename to TruncateStore
Rename macro arg
Avoid full-length load in LoadHigher
Optimise AllBits1
wbb-ccl force-pushed the cc_up_set_load_store_count_operations branch from c1f7768 to 6b90d90 on January 30, 2025 15:19
Resolved review threads (outdated):

  • g3doc/quick_reference.md (2 threads)

copybara-service bot merged commit e96d4d3 into google:master on Jan 30, 2025
39 of 40 checks passed
3 participants