YAKL_INLINE and Calling Class Methods from Kernels

Matt Norman edited this page Dec 11, 2021 · 1 revision

The YAKL_INLINE prefix should be applied to all functions you plan to call from inside a parallel_for.

YAKL_INLINE float sum(float x, float y) { return x+y; }
parallel_for( nx , YAKL_LAMBDA (int i) { c(i) = sum( a(i) , b(i) ); });

If you want to create a function that only runs on the device, you can use the YAKL_DEVICE_INLINE prefix.

Calling class methods from parallel_for kernels

If a class method is prefixed with YAKL_INLINE, meaning you may call it from a parallel_for kernel, you must declare it as static. This is not a firm requirement in CUDA, but it is in HIP and likely in SYCL as well. It also makes sense from a C++ perspective: static member functions belong to the class itself, not to any particular object of that class, so they do not use the this-> pointer. The distinction is moot if the function actually gets inlined; strictly speaking, however, you shouldn't open yourself up to the possibility of calling a function via the this-> pointer inside a kernel. Therefore the function should be static. If you instead choose to rely on the C++17 feature that captures *this by value in a lambda, please understand that this might not be portable to all hardware backends.
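As a rough illustration of the static-method pattern, here is a minimal sketch. It is not taken from YAKL itself: the class name Arith and the helper add_arrays are hypothetical, the fallback definition of YAKL_INLINE exists only so the snippet compiles without YAKL's headers, and a plain serial loop stands in for parallel_for.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: stand-in so this sketch builds without YAKL.
// In a real build the macro comes from YAKL's own headers.
#ifndef YAKL_INLINE
#define YAKL_INLINE inline
#endif

// Hypothetical class, for illustration only.
class Arith {
public:
  // static: the method has no this-> pointer, so a kernel can call it
  // without capturing or dereferencing any object of the class.
  YAKL_INLINE static float sum(float x, float y) { return x + y; }
};

// Hypothetical helper. The serial loop below stands in for something like
//   parallel_for( nx , YAKL_LAMBDA (int i) { c(i) = Arith::sum( a(i) , b(i) ); });
inline std::vector<float> add_arrays(std::vector<float> const &a,
                                     std::vector<float> const &b) {
  assert(a.size() == b.size());
  std::vector<float> c(a.size());
  for (std::size_t i = 0; i < a.size(); i++) {
    c[i] = Arith::sum(a[i], b[i]);  // called through the class, not an object
  }
  return c;
}
```

Because the call is written as Arith::sum(...), nothing in the loop body depends on an object instance, which is the property the static requirement is meant to guarantee.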
