-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Various masked operations #2428
base: master
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
g3doc/quick_reference.md
Outdated
@@ -1050,6 +1050,9 @@ types, and on SVE/RVV. | |||
|
|||
* <code>V **AndNot**(V a, V b)</code>: returns `~a[i] & b[i]`. | |||
|
|||
* <code>V **MaskedOrOrZero**(M m, V a, V b)</code>: returns `a[i] || b[i]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a different naming convention here which might be a bit more natural?
There is also a MaskedLoad which returns 0 as the default, as opposed to MaskedLoadOr, which has the explicit default value. If we apply that here, we can just call it MaskedOr(m, a b), what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That works great.
g3doc/quick_reference.md
Outdated
@@ -1050,6 +1050,9 @@ types, and on SVE/RVV. | |||
|
|||
* <code>V **AndNot**(V a, V b)</code>: returns `~a[i] & b[i]`. | |||
|
|||
* <code>V **MaskedOrOrZero**(M m, V a, V b)</code>: returns `a[i] || b[i]` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we mean a[i] | b[i]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do :)
g3doc/quick_reference.md
Outdated
@@ -2237,6 +2240,22 @@ The following `ReverseN` must not be called if `Lanes(D()) < N`: | |||
must be in the range `[0, 2 * Lanes(d))` but need not be unique. The index | |||
type `TI` must be an integer of the same size as `TFromD<D>`. | |||
|
|||
* <code>V **TableLookupLanesOr**(M m, V a, V b, unspecified)</code> returns the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we don't yet have an optimized version of these op, and it's just a convenience wrapper over IfThenElse. Would it be an option to move this into a utility function within your codebase? It's not clear whether this provides enough value to be a documented op that all readers must know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've removed these for now, we'll add them in a future PR when we have optimised versions.
hwy/ops/arm_sve-inl.h
Outdated
#define HWY_NATIVE_MASKED_REDUCE_SCALAR | ||
#endif | ||
|
||
template <class D, class M> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a TODO here that we can remove the SumOfLanesM in favor of using MaskedReduceSum directly. This entails adding the D arg to HWY_SVE_REDUCE_ADD
as done in HWY_SVE_FIRSTN
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have done.
hwy/ops/arm_sve-inl.h
Outdated
@@ -219,6 +219,15 @@ HWY_SVE_FOREACH_BF16_UNCONDITIONAL(HWY_SPECIALIZE, _, _) | |||
HWY_API HWY_SVE_V(BASE, BITS) NAME(HWY_SVE_V(BASE, BITS) v) { \ | |||
return sv##OP##_##CHAR##BITS(v); \ | |||
} | |||
#define HWY_SVE_RETV_ARGMV_M(BASE, CHAR, BITS, HALF, NAME, OP) \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: we have the naming convention P for predicate, for example in HWY_SVE_RETV_ARGPVV
. I'm fine with either P or M, but let's please be consistent, feel free to pick one.
This might actually replace the existing HWY_SVE_RETV_ARGPV
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell, it looks like P has been used previously where the intrinsic takes a predicate but this is fixed as a true mask and M has been used where a user-specified mask is passed as an argument.
hwy/ops/generic_ops-inl.h
Outdated
} | ||
template <class D, class M> | ||
HWY_API TFromD<D> MaskedReduceMin(D d, M m, VFromD<D> v) { | ||
return ReduceMin(d, IfThenElse(m, v, MaxOfLanes(d, v))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems unnecessarily expensive, how about we replace MaxOfLanes with Set(d, hwy::HighestValue)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I've fixed that now.
hwy/ops/generic_ops-inl.h
Outdated
} | ||
template <class D, class M> | ||
HWY_API TFromD<D> MaskedReduceMax(D d, M m, VFromD<D> v) { | ||
return ReduceMax(d, IfThenElseZero(m, v)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can get into trouble for signed values. If all values are negative, the presence of mask=false elements changes the result. Can similarly use hwy::LowestValue here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I've fixed that now.
Remove OrZero suffix and fix MaskedOr docs Update naming of masked table lookups to follow convention Optimise MaskedReduceMin/Max Add TODOs Remove the masked table lookups To be added alongside the platform specialisations Remove unused macros Rename HWY_SVE_RETV_ARGMVVZ to follow convention
504b82d
to
d76ec00
Compare
Introduces:
a[i] || b[i]
orzero
ifm[i]
is false.TwoTablesLookupLanes(V a, V b, unspecified)
wherem[i]
is true, anda[i]
wherem[i]
is false.TwoTablesLookupLanes(V a, V b, unspecified)
wherem[i]
is true, and zero wherem[i]
is false.m[i]
istrue
.m[i]
istrue
.m[i]
istrue
.mask[i] < 0 ? (-v[i]) : ((mask[i] > 0) ? v[i] : impl_defined_val)
, whereimpl_defined_val
is an implementation-defined value that is equal to either 0 orv[i]
. SVE included only.Testing is performed for all new operations.