Parallel Reductions

Matt Norman edited this page Apr 29, 2022 · 2 revisions

YAKL provides convenient parallel reductions via routines that mimic the Fortran intrinsics: sum(), minval(), maxval(), and product(). To perform an efficient parallel reduction on an Array, SArray, or FSArray object using vendor libraries, simply use:

  • yakl::intrinsics::sum(array);
  • yakl::intrinsics::minval(array);
  • yakl::intrinsics::maxval(array);
  • yakl::intrinsics::product(array);
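As a rough guide to what each routine returns, the serial standard-C++ functions below compute the same four quantities over a `std::vector`. The `*_ref` names are hypothetical helpers introduced only for illustration; they are not part of YAKL and say nothing about how YAKL implements its reductions.

```cpp
#include <algorithm>
#include <functional>
#include <numeric>
#include <vector>

// Serial stand-ins for the four reductions (hypothetical helpers, not YAKL code).
// minval_ref and maxval_ref assume a non-empty input.

// Sum of all elements, starting from the additive identity 0.
double sum_ref(const std::vector<double>& a) {
  return std::accumulate(a.begin(), a.end(), 0.0);
}

// Product of all elements, starting from the multiplicative identity 1.
double product_ref(const std::vector<double>& a) {
  return std::accumulate(a.begin(), a.end(), 1.0, std::multiplies<double>());
}

// Smallest element.
double minval_ref(const std::vector<double>& a) {
  return *std::min_element(a.begin(), a.end());
}

// Largest element.
double maxval_ref(const std::vector<double>& a) {
  return *std::max_element(a.begin(), a.end());
}
```

For the input `{3, 1, 4, 2}`, these return 10, 24, 1, and 4 respectively.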

If performed on an Array object, the reduction launches a kernel and therefore cannot be called from inside another kernel. If performed on an SArray or FSArray object, the reduction can be called either within a kernel or on the host.

Often, in OpenACC, OpenMP offload, and Kokkos, code computes a reduction in the same loops as other computations. With YAKL, however, the usual approach is to store the per-element values to an intermediate array in one kernel and then perform a separate reduction on that array in another kernel.
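The two-pass pattern described above can be sketched in plain serial C++ as follows. This is a stand-in under stated assumptions, not YAKL code: `cell_speeds` and `max_speed` are hypothetical names, `std::vector` stands in for a YAKL Array, and the quantity being reduced (a per-cell wave speed) is just an example. In a YAKL program, the first loop would be the body of a `parallel_for` kernel and the second step would be a call to `yakl::intrinsics::maxval` on the intermediate array.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Pass 1: an elementwise loop writes one value per cell into an
// intermediate array. No reduction happens here.
// (Hypothetical helper; in YAKL this would be a parallel_for kernel.)
std::vector<double> cell_speeds(const std::vector<double>& u,
                                const std::vector<double>& c) {
  std::vector<double> speed(u.size());
  for (std::size_t i = 0; i < u.size(); ++i) {
    speed[i] = std::abs(u[i]) + c[i];
  }
  return speed;
}

// Pass 2: a separate reduction over the intermediate array.
// (Hypothetical helper; in YAKL this would be yakl::intrinsics::maxval(speed).)
// Assumes a non-empty input.
double max_speed(const std::vector<double>& speed) {
  return *std::max_element(speed.begin(), speed.end());
}
```

Keeping the reduction out of the compute loop costs one extra array, but the loop body stays a pure elementwise kernel and the reduction itself is left to the library routine.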
