Home

YAKL (Yet Another Kernel Launcher)

A Simple C++ Framework for Performance Portability and Fortran Code Porting

Author: Matt Norman (Oak Ridge National Laboratory) - mrnorman.github.io

Contributors:

Matt Norman (Oak Ridge National Laboratory)
Isaac Lyngaas (Oak Ridge National Laboratory)
Abhishek Bagusetty (Argonne National Laboratory)
Mark Berrill (Oak Ridge National Laboratory)

Overview

YAKL (like Kokkos and RAJA) is a portable C++ library that lets developers target different hardware backends such as CUDA, HIP, and SYCL from a single source. YAKL, Kokkos, and RAJA are all just C++ libraries, and code written with them is pure C++ without any language extensions. For more information about portable C++ libraries, particularly from the perspective of using directives, please read this article.

The YAKL API is similar to Kokkos in many ways, but it is considerably simpler and has much stronger Fortran-like behavior in its arrays and parallel loops. YAKL currently has backends for:

CPUs (serial)
CPU OpenMP threading
CUDA
HIP
SYCL
OpenMP offload (in progress)

What does YAKL provide?

Multi-dimensional dynamically allocated arrays in Fortran and C styles
Multi-dimensional statically defined arrays in Fortran and C styles
Kernel launchers to launch code in parallel over threads on different hardware backends
Various methods of transferring data between host and device memory spaces
Basic atomic operations (add, min, and max) using hardware atomics when available
Efficient reductions via convenient syntax patterned after Fortran's sum(), minval(), and maxval() using vendor libraries
Synchronization via a fence() function
Pool allocator that is automatically turned on for all device allocations in separate memory address spaces
Fortran bindings for YAKL allocators and YAKL init and finalize
Limited Fortran intrinsics library
Classes to handle scalars that need to be read after being written to in a parallel kernel
NetCDF and Parallel NetCDF I/O routines using YAKL's multi-dimensional Arrays
Automated timers for YAKL's parallel_for calls using the General Purpose Timing Library (GPTL)
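To illustrate why atomic operations appear in the list above: when many threads update one shared value, a plain read-modify-write can lose updates. The following is a minimal sketch in standard C++ only; it does not use or reproduce YAKL's API, and the names `atomic_min_sketch`, `atomic_add_sketch`, and `fold_min_and_sum` are hypothetical. It assumes integer data, for which `std::atomic` provides a hardware-backed `fetch_add`, and builds the min from a compare-and-swap retry loop.

```cpp
#include <atomic>
#include <cassert>
#include <vector>

// Hypothetical helper (not part of YAKL): atomically lower `target` to `value`.
// On failure compare_exchange_weak reloads `current`, so the loop retries until
// the store succeeds or another thread has already stored a smaller value.
inline void atomic_min_sketch(std::atomic<int> &target, int value) {
  int current = target.load();
  while (value < current && !target.compare_exchange_weak(current, value)) {
    // retry with the refreshed `current`
  }
}

// Hypothetical helper: for integers, atomic add maps directly onto fetch_add.
inline void atomic_add_sketch(std::atomic<long> &target, long value) {
  target.fetch_add(value);
}

struct MinAndSum { int min; long sum; };

// Fold every element of `data` into a shared minimum and a shared sum.
// This driver runs on one thread; the two helper calls in the loop body
// would remain correct if many threads executed them concurrently.
inline MinAndSum fold_min_and_sum(const std::vector<int> &data) {
  assert(!data.empty());
  std::atomic<int>  shared_min(data[0]);
  std::atomic<long> shared_sum(0);
  for (int v : data) {
    atomic_min_sketch(shared_min, v);
    atomic_add_sketch(shared_sum, v);
  }
  return MinAndSum{ shared_min.load(), shared_sum.load() };
}
```

An atomic max follows the same pattern with the comparison reversed.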

Example YAKL Code

The following shows the same section of code in Fortran + OpenACC, in parallel YAKL C++ with Fortran-style arrays, and in parallel YAKL C++ with C-style arrays:

OpenACC Fortran Code

real stateTend      (nx  ,ny,nz,numState)
real stateFluxLimits(nx+1,ny,nz,numState)

!$acc parallel loop collapse(4)
do l = 1 , numState
  do k = 1 , nz
    do j = 1 , ny
      do i = 1 , nx
        stateTend(i,j,k,l) = - ( stateFluxLimits(i+1,j,k,l) - &
                                 stateFluxLimits(i  ,j,k,l) ) / dx
      enddo
    enddo
  enddo
enddo

Portable C++ Code (Fortran-style YAKL `Array`s)

typedef yakl::Array<float,4,yakl::memDevice,yakl::styleFortran> real4d;
using yakl::fortran::parallel_for;
using yakl::fortran::Bounds;

real4d stateTend      ("stateTend"      ,nx  ,ny,nz,numState);
real4d stateFluxLimits("stateFluxLimits",nx+1,ny,nz,numState);

// do l = 1 , numState
//   do k = 1 , nz
//     do j = 1 , ny
//       do i = 1 , nx
parallel_for( Bounds<4>(numState,nz,ny,nx) ,
              YAKL_LAMBDA(int l, int k, int j, int i) { 
  stateTend(i,j,k,l) = - ( stateFluxLimits(i+1,j,k,l) -
                           stateFluxLimits(i  ,j,k,l) ) / dx;
});
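The commented-out `do` loops above suggest how the single `parallel_for` call relates to the original loop nest. As a rough illustration, here is a serial sketch in standard C++ of what a launcher for a collapsed four-level loop could do: iterate one flat index and decompose it into four 1-based loop indices, with the last bound varying fastest. This is not YAKL's implementation; `Bounds4Sketch` and `parallel_for_sketch` are hypothetical names, and a GPU backend would assign each flat index to its own thread rather than loop over them.

```cpp
#include <cassert>

// Hypothetical stand-in for a 4-D Fortran-style bounds object: loop d runs 1..n_d.
struct Bounds4Sketch {
  int n0, n1, n2, n3;   // slowest-varying loop first, fastest-varying loop last
  long size() const { return static_cast<long>(n0) * n1 * n2 * n3; }
};

// Serial sketch of a kernel launcher: a single flat loop over all iterations,
// each flat index decomposed into four 1-based indices before calling `f`.
template <class F>
void parallel_for_sketch(const Bounds4Sketch &b, F &&f) {
  assert(b.n0 >= 0 && b.n1 >= 0 && b.n2 >= 0 && b.n3 >= 0);
  const long total = b.size();
  for (long flat = 0; flat < total; flat++) {
    int i3 = static_cast<int>( flat                % b.n3) + 1;
    int i2 = static_cast<int>((flat / b.n3)        % b.n2) + 1;
    int i1 = static_cast<int>((flat / b.n3 / b.n2) % b.n1) + 1;
    int i0 = static_cast<int>( flat / b.n3 / b.n2 / b.n1 ) + 1;
    f(i0, i1, i2, i3);   // same argument order as the bounds: slowest first
  }
}
```

Because every iteration receives its indices independently of the others, the body must not rely on iterations running in any particular order, which is also what the `collapse(4)` clause assumes in the OpenACC version.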

Portable C++ Code (C-style YAKL `Array`s)

typedef yakl::Array<float,4,yakl::memDevice,yakl::styleC> real4d;
using yakl::c::parallel_for;
using yakl::c::Bounds;

real4d stateTend      ("stateTend"      ,numState,nz,ny,nx  );
real4d stateFluxLimits("stateFluxLimits",numState,nz,ny,nx+1);

// for (int l=0; l < numState; l++) {
//   for (int k=0; k < nz; k++) {
//     for (int j=0; j < ny; j++) {
//       for (int i=0; i < nx; i++) {
parallel_for( Bounds<4>(numState,nz,ny,nx) ,
              YAKL_LAMBDA(int l, int k, int j, int i) { 
  stateTend(l,k,j,i) = - ( stateFluxLimits(l,k,j,i+1) -
                           stateFluxLimits(l,k,j,i  ) ) / dx;
});
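The two YAKL versions differ only in index base and index order: the Fortran-style arrays are declared as `(nx,ny,nz,numState)` and indexed `(i,j,k,l)` from 1, while the C-style arrays are declared as `(numState,nz,ny,nx)` and indexed `(l,k,j,i)` from 0. The sketch below shows why the reversal keeps `i` contiguous in memory in both cases. It assumes column-major, 1-based storage for the Fortran style and row-major, 0-based storage for the C style; `offset_fortran` and `offset_c` are hypothetical helpers written in standard C++, not YAKL functions.

```cpp
#include <cassert>
#include <cstddef>

// Fortran-style offset: 1-based indices, leftmost index contiguous (column-major).
inline std::size_t offset_fortran(int i, int j, int k, int l,
                                  int nx, int ny, int nz, int numState) {
  assert(i >= 1 && i <= nx && j >= 1 && j <= ny);
  assert(k >= 1 && k <= nz && l >= 1 && l <= numState);
  const std::size_t i0 = i - 1, j0 = j - 1, k0 = k - 1, l0 = l - 1;
  return i0 + std::size_t(nx) * (j0 + std::size_t(ny) * (k0 + std::size_t(nz) * l0));
}

// C-style offset: 0-based indices, rightmost index contiguous (row-major).
inline std::size_t offset_c(int l, int k, int j, int i,
                            int numState, int nz, int ny, int nx) {
  assert(l >= 0 && l < numState && k >= 0 && k < nz);
  assert(j >= 0 && j < ny && i >= 0 && i < nx);
  return ((std::size_t(l) * nz + k) * ny + j) * nx + i;
}
```

Under these assumptions, `offset_fortran(i,j,k,l, ...)` and `offset_c(l-1,k-1,j-1,i-1, ...)` give the same linear offset, so a Fortran loop nest can be ported to either style without changing its memory access pattern.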
