Simplified user interface for tensorOps functions #227
I think this is cleaner: it doesn't require detection and only uses one extra function instead of two. I'm not convinced though. If there was a way to do it without requiring the extra function to reorder, i.e. for matrix multiplication:

```c++
template< int ISIZE=Size0< DST_MATRIX >, int JSIZE=Size1< DST_MATRIX >, int KSIZE=Size2< MATRIX_A >, ... >
void Rij_eq_AikBkj( DST_MATRIX && dstMatrix, MATRIX_A const & matrixA, MATRIX_B const & matrixB );
```

you only get anything if
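A minimal, self-contained sketch of that default-template-argument idea, assuming stand-in `SizeTraits`/`Size0`/`Size1` helpers and plain C-arrays rather than the real LvArray traits (the names below are hypothetical, not the library's actual interface):

```c++
#include <cstddef>
#include <type_traits>

// Stand-in size traits: read the extents of a 2D C-array at compile time.
template< typename T >
struct SizeTraits;

template< typename T, std::size_t M, std::size_t N >
struct SizeTraits< T[ M ][ N ] >
{
  static constexpr int size0 = static_cast< int >( M );
  static constexpr int size1 = static_cast< int >( N );
};

template< typename T >
constexpr int Size0 = SizeTraits< std::remove_reference_t< T > >::size0;

template< typename T >
constexpr int Size1 = SizeTraits< std::remove_reference_t< T > >::size1;

// The sizes default to whatever the operand types report, so calls on
// fixed-extent types need no explicit template arguments.
template< typename DST_MATRIX,
          typename MATRIX_A,
          typename MATRIX_B,
          int ISIZE = Size0< DST_MATRIX >,
          int JSIZE = Size1< DST_MATRIX >,
          int KSIZE = Size1< MATRIX_A > >
void Rij_eq_AikBkj( DST_MATRIX && dstMatrix,
                    MATRIX_A const & matrixA,
                    MATRIX_B const & matrixB )
{
  for( int i = 0; i < ISIZE; ++i )
  {
    for( int j = 0; j < JSIZE; ++j )
    {
      dstMatrix[ i ][ j ] = 0;
      for( int k = 0; k < KSIZE; ++k )
      {
        dstMatrix[ i ][ j ] += matrixA[ i ][ k ] * matrixB[ k ][ j ];
      }
    }
  }
}

int main()
{
  double A[ 2 ][ 3 ] = { { 1, 2, 3 }, { 4, 5, 6 } };
  double B[ 3 ][ 2 ] = { { 1, 0 }, { 0, 1 }, { 1, 1 } };
  double R[ 2 ][ 2 ];
  Rij_eq_AikBkj( R, A, B );   // ISIZE, JSIZE, KSIZE all deduced from the array types
  return 0;
}
```

Note that because default template arguments can only reference parameters declared before them, the type parameters have to come first in this sketch, which is a slight reordering of the signature quoted above.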
@corbett5 Odd... I couldn't get the overload to work when I tried it... but now it works fine. Must be my cognitive decline coming into play. I am looking at this from a usage perspective. It is nice to not have to specify the sizes if they can be deduced. In the case of an ArraySlice/ArrayView/C-array/R1Tensor, they can be deduced. The cost is that we have to:
Is this a fair set of statements?
ArraySlice/ArrayView cannot be deduced at compile time.
Duh. I guess we would have to wait until we have compile-time dims added.
Did you ever put together a proposal for compile time dims, or are we waiting for RAJA to do it first?
No, I could do it but it would take a fair bit of time. You could do it in steps though, the first of which would be to get rid of the strides.
If you keep the strides you get the advantage of deducing and not having to resize, and it would be much quicker. But the indexing math would be unchanged.
I still think it is worth doing and just adding to the deducible types when they become available. I just really dislike having to specify the dims for something that is fixed-dim at compile time and can be deduced.
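For illustration, here is a rough sketch of the "keep the strides" variant discussed above: compile-time extents become template parameters (so they can be deduced), while the strides stay as runtime data, leaving the indexing math unchanged. The `StaticSlice2d` type and `scale` function are hypothetical stand-ins, not part of LvArray or RAJA:

```c++
#include <cstddef>

// Compile-time extents as template parameters (deducible, checkable at compile
// time); strides stay as runtime data, so the indexing math is unchanged.
template< typename T, int M, int N >
struct StaticSlice2d
{
  T * data;                     // pointer into some parent allocation
  std::ptrdiff_t strides[ 2 ];  // runtime strides, exactly as with a dynamic slice

  T & operator()( int i, int j ) const
  {
    return data[ i * strides[ 0 ] + j * strides[ 1 ] ];
  }
};

// A tensorOps-style function can now pull M and N from the type instead of
// asking the caller to repeat them as explicit template arguments.
template< typename T, int M, int N >
void scale( StaticSlice2d< T, M, N > const & slice, T const factor )
{
  for( int i = 0; i < M; ++i )
  {
    for( int j = 0; j < N; ++j )
    {
      slice( i, j ) *= factor;
    }
  }
}

int main()
{
  double storage[ 6 ] = { 1, 2, 3, 4, 5, 6 };
  StaticSlice2d< double, 2, 3 > slice{ storage, { 3, 1 } };  // row-major 2x3 view
  scale( slice, 2.0 );  // extents deduced from the slice type
  return 0;
}
```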
The tensor ops functions typically require a single/combination of integer arguments to set the bounds for the function. Take for example:
The call looks something like:
where the `N` (`3`) is always required, and the type `VECTOR` is deduced. This is a little bit clunky imo. We could do something where we deduce both a size and type. Like so:
Here it is in a compiler explorer:
https://godbolt.org/z/K1P4T9qdE
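A rough sketch of the size-plus-type deduction being proposed, in the spirit of the Compiler Explorer link above (the actual snippet there may differ; `CompileTimeSize` and `fill` are hypothetical stand-ins, not the real tensorOps interface):

```c++
#include <cstddef>

// A standardized way for a type to report its compile-time size.
template< typename T >
struct CompileTimeSize;

template< typename T, std::size_t N >
struct CompileTimeSize< T[ N ] >
{
  static constexpr std::ptrdiff_t value = static_cast< std::ptrdiff_t >( N );
};

// Deduced-size overload: no explicit template argument needed at the call site.
template< typename VECTOR,
          std::ptrdiff_t N = CompileTimeSize< VECTOR >::value >
void fill( VECTOR & vector, double const value )
{
  for( std::ptrdiff_t i = 0; i < N; ++i )
  {
    vector[ i ] = value;
  }
}

// Raw-pointer fallback: the size still has to be spelled out.
template< std::ptrdiff_t N >
void fill( double * const vector, double const value )
{
  for( std::ptrdiff_t i = 0; i < N; ++i )
  {
    vector[ i ] = value;
  }
}

int main()
{
  double v[ 3 ];
  fill( v, 1.0 );        // size and type both deduced
  double * p = v;
  fill< 3 >( p, 2.0 );   // raw pointer: size must still be given
  return 0;
}
```

With this shape, fixed-size types need no explicit template argument, while the raw-pointer overload still takes `N` explicitly, which matches the trade-off described below.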
If we were to standardize the way we return compile time sizes, this works fairly well. With a raw pointer you would still require the specification of `N`. With multi-dimensions this would be more complicated... but I don't think it is too prohibitive, and the usability is certainly nicer.
@corbett5 @klevzoff Thoughts?