YAKL_INLINE and Calling Class Methods from Kernels
The YAKL_INLINE prefix should be applied to all functions you plan to call from inside a parallel_for.
YAKL_INLINE float sum(float x, float y) { return x+y; }
parallel_for( nx , YAKL_LAMBDA (int i) { c(i) = sum( a(i) , b(i) ); });
If you want to create a function that only runs on the device, you can use the YAKL_DEVICE_INLINE prefix.
If you have a class method prefixed with YAKL_INLINE, meaning you intend to potentially call it from a parallel_for kernel, you must declare it as static. This is not a firm requirement in CUDA, but it is in HIP and likely SYCL as well. It also makes sense from a C++ perspective: static member functions belong to the class itself, not to any particular object of that class, so they do not use the this-> pointer. This idea is moot if you plan on inlining the function; however, strictly speaking, you shouldn't open yourself up to the possibility of calling a function via the this-> pointer. Therefore the function should be static. If, instead, you choose to rely on the C++17 feature that passes *this to a lambda, please understand that this might not be portable to all hardware backends.
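To illustrate the static-method advice, here is a minimal host-only sketch. It does not use YAKL itself: the YAKL_INLINE definition below is a plain `inline` stand-in, serial_for is a hypothetical serial replacement for parallel_for, and the Physics class and its members are invented for this example. The lambda captures only raw pointers and a local copy of the member value, so nothing in the kernel body depends on the this-> pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for illustration only (assumption): the real YAKL_INLINE comes from
// YAKL and adds backend-specific qualifiers on GPU builds.
#ifndef YAKL_INLINE
#define YAKL_INLINE inline
#endif

// Hypothetical serial stand-in for a parallel_for launch: calls f(i) for i in [0, n).
template <class F>
void serial_for(int n, F const &f) {
  for (int i = 0; i < n; i++) { f(i); }
}

// Hypothetical class used only for this sketch.
struct Physics {
  float scale = 2.0f;  // per-object state

  // static: everything the function needs arrives as an argument,
  // so it never touches the this-> pointer.
  YAKL_INLINE static float scaled_sum(float x, float y, float s) {
    return s * (x + y);
  }

  std::vector<float> apply(std::vector<float> const &a,
                           std::vector<float> const &b) const {
    assert(a.size() == b.size());
    std::vector<float> c(a.size());
    // Copy what the kernel needs into locals so the lambda captures
    // values and raw pointers instead of the enclosing object.
    float s = scale;
    float const *pa = a.data();
    float const *pb = b.data();
    float *pc = c.data();
    serial_for(static_cast<int>(a.size()), [pa, pb, pc, s](int i) {
      pc[i] = Physics::scaled_sum(pa[i], pb[i], s);
    });
    return c;
  }
};
```

Calling Physics{}.apply(a, b) fills each output element with scale * (a[i] + b[i]). Copying scale into the local s before the launch is the design choice that matters here: the lambda's capture list contains no reference to the object, which is the property the static requirement is meant to protect.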