Parallel Reductions
Matt Norman edited this page Apr 29, 2022 · 2 revisions
YAKL provides convenient parallel reductions via routines that mimic the Fortran intrinsics: sum(), minval(), maxval(), and product(). To perform an efficient parallel reduction on an Array, SArray, or FSArray object using vendor libraries, simply use:
yakl::intrinsics::sum(array);
yakl::intrinsics::minval(array);
yakl::intrinsics::maxval(array);
yakl::intrinsics::product(array);
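As a rough illustration of what each of the four calls above returns, here is a minimal host-only sketch in plain standard C++. It does not use YAKL: the helper names (sum_ref, product_ref, minval_ref, maxval_ref) are hypothetical stand-ins, and std::vector takes the place of a YAKL array. It only shows the arithmetic of each reduction, not how YAKL implements it.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical reference helpers (NOT part of YAKL). Each one collapses
// all elements of the array into a single scalar, like the Fortran intrinsic
// of the same name.

// Sum of all elements.
double sum_ref(const std::vector<double>& a) {
  return std::accumulate(a.begin(), a.end(), 0.0);
}

// Product of all elements.
double product_ref(const std::vector<double>& a) {
  return std::accumulate(a.begin(), a.end(), 1.0, std::multiplies<double>());
}

// Smallest element; the array must be non-empty.
double minval_ref(const std::vector<double>& a) {
  assert(!a.empty());
  return *std::min_element(a.begin(), a.end());
}

// Largest element; the array must be non-empty.
double maxval_ref(const std::vector<double>& a) {
  assert(!a.empty());
  return *std::max_element(a.begin(), a.end());
}
```

For the input {3, 1, 4, 2}, these helpers give a sum of 10, a product of 24, a minimum of 1, and a maximum of 4.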
If performed on an Array object, the reduction launches a kernel and therefore cannot be called from inside another kernel. If performed on an SArray or FSArray object, the reduction can be called either within a kernel or on the host.
In OpenACC, OpenMP offload, and Kokkos, code often computes a reduction in the same loops as other computations. With YAKL, however, the usual approach is to store the values to an intermediate array and then reduce that array in a separate kernel.
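The store-then-reduce pattern can be sketched in plain standard C++ as follows. This is an illustration only and does not use YAKL: the function max_abs_diff and the temporary tmp are hypothetical names, and the two phases run serially on the host. In a YAKL code, the first phase would presumably be the body of a kernel writing into an Array, and the second phase would be a single call such as yakl::intrinsics::maxval(tmp) made outside that kernel.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical example (NOT YAKL code): largest absolute difference
// between two equally sized, non-empty arrays, computed in two phases.
double max_abs_diff(const std::vector<double>& a,
                    const std::vector<double>& b) {
  assert(a.size() == b.size() && !a.empty());

  // Phase 1: element-wise work. Each iteration is independent and stores
  // its result in an intermediate array instead of updating a shared scalar.
  std::vector<double> tmp(a.size());
  for (std::size_t i = 0; i < a.size(); ++i) {
    tmp[i] = std::abs(a[i] - b[i]);
  }

  // Phase 2: a separate reduction over the intermediate array.
  return *std::max_element(tmp.begin(), tmp.end());
}
```

Keeping the reduction out of the first loop means that loop has no cross-iteration dependence, at the cost of one extra array and one extra pass over the data.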