
tpetra_kokkos

Glen Hansen edited this page Feb 11, 2019 · 2 revisions

The tpetra_kokkos branch was created in 2014 and is a branch of Albany based on the new Kokkos-based implementation of the Phalanx package. The goal is to give Albany performance portability across multicore and manycore architectures (e.g., Intel Phi, Nvidia GPUs, multicore CPUs).

More information about Kokkos can be found here.

Building Trilinos for this branch requires some additional CMake flags. Please use the configure script below as a reference.

 TRILINOS_HOME=TrilinosDir/Trilinos
 BOOSTDIR=/path_to_boost/boost_1_55
 NETCDF=/path_to_netcdf/netcdf
 HDFDIR=/path_to_hdf5/hdf5
 HWLOC_PATH="/path_to_hwloc/"
 export BOOST_ROOT=$BOOSTDIR
 EXTRA_ARGS=$@

 rm -rf CMakeFiles CMakeCache.txt

 cmake \
    -D Trilinos_DISABLE_ENABLED_FORWARD_DEP_PACKAGES=ON \
    -D CMAKE_INSTALL_PREFIX:PATH=/home/ipdemes/TrilinosDir/BuildTrilinos_Albany/install_OpenMP \
    -D CMAKE_BUILD_TYPE:STRING="NONE"  \
    -D BUILD_SHARED_LIBS:BOOL=ON \
    -D Trilinos_ENABLE_ALL_PACKAGES:BOOL=OFF \
    -D Trilinos_ENABLE_EXPLICIT_INSTANTIATION:BOOL=OFF \
    -D Trilinos_WARNINGS_AS_ERRORS_FLAGS:STRING="" \
    -D Trilinos_ENABLE_Teuchos:BOOL=ON \
    -D Trilinos_ENABLE_Shards:BOOL=ON \
    -D Trilinos_ENABLE_Epetra:BOOL=ON \
    -D Trilinos_ENABLE_Tpetra:BOOL=ON \
    -D Tpetra_ENABLE_Kokkos_Refactor:BOOL=ON \
    -D Trilinos_ENABLE_EpetraExt:BOOL=ON \
    -D Trilinos_ENABLE_Ifpack:BOOL=ON \
    -D Trilinos_ENABLE_AztecOO:BOOL=ON \
    -D Trilinos_ENABLE_Amesos:BOOL=ON \
    -D Trilinos_ENABLE_Anasazi:BOOL=ON \
    -D Trilinos_ENABLE_Belos:BOOL=ON \
    -D Trilinos_ENABLE_ML:BOOL=ON \
    -D Trilinos_ENABLE_Phalanx:BOOL=ON \
    -D Phalanx_ENABLE_EXAMPLES:BOOL=ON \
    -D Phalanx_ENABLE_TESTS:BOOL=OFF \
    -D Phalanx_ENABLE_COMPILETIME_ARRAY_CHECK:BOOL=ON \
    -D Phalanx_KOKKOS_DEVICE_TYPE:STRING="OPENMP" \
    -D Phalanx_INDEX_SIZE_TYPE="UINT" \
    -D Boost_INCLUDE_DIRS:PATH=$BOOSTDIR/include \
    -D TPL_Boost_LIBRARY_DIRS:FILEPATH=$BOOSTDIR/lib \
    -D TPL_ENABLE_BoostLib:BOOL=ON \
    -D BoostLib_INCLUDE_DIRS:FILEPATH="$BOOSTDIR/include" \
    -D BoostLib_LIBRARY_DIRS:FILEPATH="$BOOSTDIR/lib" \
    -D Trilinos_ENABLE_Intrepid:BOOL=ON \
    -D Intrepid_ENABLE_TESTS:BOOL=OFF \
    -D Intrepid_ENABLE_EXAMPLES:BOOL=OFF \
    -D HAVE_INTREPID_KOKKOSCORE:BOOL=ON \
    -D Trilinos_ENABLE_NOX:BOOL=ON \
    -D Trilinos_ENABLE_Stratimikos:BOOL=ON \
    -D Trilinos_ENABLE_Thyra:BOOL=ON \
    -D Trilinos_ENABLE_ThyraTpetraAdapters:BOOL=ON \
    -D Trilinos_ENABLE_Rythmos:BOOL=ON \
    -D Trilinos_ENABLE_MOOCHO:BOOL=OFF \
    -D Trilinos_ENABLE_Stokhos:BOOL=ON \
    -D Trilinos_ENABLE_Piro:BOOL=ON \
    \
    -D Trilinos_ENABLE_STKIO:BOOL=ON \
    -D Trilinos_ENABLE_STKMesh:BOOL=ON \
    -D Trilinos_ENABLE_Teko:BOOL=ON \
    -D Trilinos_ENABLE_SEACASIoss:BOOL=ON \
    -D Trilinos_ENABLE_SEACASExodus:BOOL=ON \
    -D SEACASExodus_PARALLEL_AWARE:BOOL=OFF \
    -D Trilinos_ENABLE_TriKota:BOOL=OFF \
    -D TriKota_ENABLE_DakotaCMake:BOOL=OFF \
    -D Trilinos_ENABLE_OpenMP:BOOL=ON \
    -D TPL_ENABLE_OpenMP:BOOL=ON \
    -D Trilinos_ENABLE_Kokkos:BOOL=ON \
    -D Trilinos_ENABLE_KokkosClassic:BOOL=ON \
    -D Trilinos_ENABLE_KokkosCore:BOOL=ON \
    -D Trilinos_ENABLE_KokkosContainers:BOOL=ON \
    -D Trilinos_ENABLE_KokkosCompat:BOOL=ON \
    -D Trilinos_ENABLE_KokkosTPL:BOOL=ON \
    -D Trilinos_ENABLE_KokkosLinAlg:BOOL=ON \
    -D Trilinos_ENABLE_KokkosAlgorithms:BOOL=ON \
    -D Trilinos_ENABLE_KokkosMpiComm:BOOL=ON \
    -D Trilinos_ENABLE_KokkosExample:BOOL=OFF \
    -D KokkosClassic_DefaultNode:STRING="Kokkos::Compat::KokkosOpenMPWrapperNode" \
    -D TPL_HWLOC_LIBRARIES:PATHNAME="${HWLOC_PATH}/lib/libhwloc.so" \
    -D TPL_HWLOC_INCLUDE_DIRS:PATHNAME="${HWLOC_PATH}/include" \
    -D TPL_ENABLE_HWLOC:STRING=ON \
    -D Trilinos_ENABLE_Sacado:BOOL=ON \
    -D DAKOTA_ENABLE_TESTS:BOOL=OFF \
    -D TPL_ENABLE_Boost:BOOL=ON \
    -D TPL_ENABLE_Netcdf:BOOL=ON \
    -D Netcdf_INCLUDE_DIRS:PATH="$NETCDF/include" \
    -D Netcdf_LIBRARY_DIRS:PATH="$NETCDF/lib" \
    -D TPL_ENABLE_HDF5:BOOL=OFF \
    -D HDF5_INCLUDE_DIRS:PATH="$HDFDIR/include" \
    -D HDF5_LIBRARY_DIRS:PATH="$HDFDIR/lib" \
    -D Trilinos_ENABLE_Mesquite:BOOL=OFF \
    -D Trilinos_ENABLE_EXAMPLES:BOOL=OFF \
    -D Trilinos_ENABLE_TESTS:BOOL=OFF \
    -D Piro_ENABLE_TESTS:BOOL=OFF \
    -D TPL_ENABLE_BinUtils:BOOL=OFF \
    -D TPL_ENABLE_MPI:BOOL=ON \
    -D CMAKE_C_COMPILER="mpicc" \
    -D CMAKE_CXX_COMPILER="mpicxx" \
    -D CMAKE_Fortran_COMPILER="mpif90" \
    -D TPL_ENABLE_CUSPARSE:STRING=OFF \
    -D Kokkos_ENABLE_EXAMPLES:BOOL=OFF \
    -D Kokkos_ENABLE_TESTS:BOOL=OFF \
    -D Kokkos_ENABLE_CUDA:BOOL=OFF \
    -D Kokkos_ENABLE_OpenMP:BOOL=ON \
    -D Kokkos_ENABLE_Thrust=OFF \
    -D TPL_ENABLE_CUDA:STRING=OFF \
    -D CUDA_VERBOSE_BUILD:BOOL=OFF \
    -D CUDA_PROPAGATE_HOST_FLAGS:BOOL=OFF \
    -D CUDA_NVCC_FLAGS:STRING=${CUDA_NVCC_FLAGS} \
    -D SEACASExodus_PARALLEL_AWARE:BOOL=OFF \
    -D CMAKE_VERBOSE_MAKEFILE:BOOL=OFF \
    -D CMAKE_CXX_FLAGS:STRING="-g -O3 -fno-var-tracking" \
    -D BLAS_LIBRARY_NAMES:STRING="libf77blas.so.3" \
    -D BLAS_LIBRARY_DIRS:PATH="/usr/lib64/atlas" \
    -D LAPACK_LIBRARY_NAMES:STRING="liblapack.so.3" \
    -D LAPACK_LIBRARY_DIRS:PATH="/usr/lib64/atlas" \
    -D Trilinos_ENABLE_Panzer:BOOL=OFF \
    -D Panzer_ENABLE_TESTS:BOOL=OFF \
    -D Panzer_ENABLE_EXAMPLES:BOOL=OFF \
    -D Panzer_ENABLE_EXPLICIT_INSTANTIATION:BOOL=ON \
    -D Panzer_ENABLE_FADTYPE:STRING="Sacado::Fad::DFad<RealType>" \
    -D Trilinos_ENABLE_CXX11:BOOL=ON \
    -D Kokkos_ENABLE_CXX11:BOOL=ON \
    -D Amesos2_ENABLE_KLU2:BOOL=ON \
    -D ENABLE_64BIT_INT:BOOL=ON \
    -D Trilinos_ENABLE_Export_Makefiles:BOOL=ON \
    \
   $EXTRA_ARGS \
   ${TRILINOS_HOME}

Major changes to the Albany code required by the Kokkos refactoring:

  1. Replace PHX::TypeString<EvalT>::value with PHX::typeAsString<EvalT>().

    Example: this->setName("Gather Solution"+PHX::TypeString<EvalT>::value);

    should be replaced with

    this->setName("Gather Solution"+PHX::typeAsString<EvalT>());

  2. There is no operator[] in Kokkos::View, so any use of operator[] on an MDField must be replaced with something else.

    Example: for (std::size_t i=0; i < Residual.size(); ++i) Residual[i]=0.0;

    should be replaced with

    Residual.deep_copy(0.0);

    or

    for (int i=0; i<Residual.dimension(0); i++) for (int j=0; j<Residual.dimension(1); j++) Residual(i,j)=0.0;

  3. Pointers cannot be used with Kokkos data types, so pointer usage must be replaced with direct access.

    Example 1:

    MeshScalarT* X = &coordVec(cell,qp,0);
    f[0] =  40.0*muqp*(2.0*X[1]*X[1] - 3.0*X[1]+1.0)*X[1]*(6.0*X[0]*X[0] -6.0*X[0] + 1.0)
           + 120*muqp*(X[0]-1.0)*(X[0]-1.0)*X[0]*X[0]*(2.0*X[1]-1.0);

could be replaced with

    typename PHAL::Ref<MeshScalarT>::type X0 = coordVec(cell,qp,0);
    typename PHAL::Ref<MeshScalarT>::type X1 = coordVec(cell,qp,1);
    force(cell,qp,0) =  40.0*muqp*(2.0*X1*X1 - 3.0*X1+1.0)*X1*(6.0*X0*X0 -6.0*X0 + 1.0)
                     + 120*muqp*(X0-1.0)*(X0-1.0)*X0*X0*(2.0*X1-1.0);


    Example 2:

    if (this->tensorRank == 2)       valptr = this->valTensor[0](cell,node,eq/numDim,eq%numDim);
    else if (this->tensorRank == 1)  valptr = this->valVec[0](cell,node,eq);
    else                             valptr = this->val[eq](cell,node);

could be replaced with

    typename PHAL::Ref<ScalarT>::type valptr =
       (this->tensorRank == 2) ? this->valTensor[0](cell,node,eq/numDim,eq%numDim) :
       (this->tensorRank == 1) ? this->valVec[0](cell,node,eq) :
                                 this->val[eq](cell,node);

In these examples, typename PHAL::Ref<ScalarT>::type evaluates to a type that behaves like a reference. If ScalarT is a POD type, it is a true reference; if ScalarT is a FadType, for example, it is a Kokkos::View.
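
The idea behind such a reference-like type can be sketched with a small trait. The code below is only an illustration under stated assumptions: RefSketch, MockFad, MockFadView, MockField, set_entry and value_of are hypothetical names invented for this sketch. They are not the Albany, Phalanx, Sacado or Kokkos types, and the real PHAL::Ref may be implemented differently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical derivative-carrying scalar (a stand-in, not Sacado's Fad).
struct MockFad {
  double val;
  double dx;
};

// Hypothetical view-like handle onto a MockFad; plays the role that a
// Kokkos::View plays for Fad types in the text above.
struct MockFadView {
  MockFad* p;
  MockFadView(MockFad& f) : p(&f) {}
  MockFadView& operator=(double v) { p->val = v; p->dx = 0.0; return *this; }
  double val() const { return p->val; }
};

// Hypothetical trait in the spirit of PHAL::Ref:
// plain scalars map to T&, MockFad maps to the view-like handle.
template <typename T>
struct RefSketch { typedef T& type; };

template <>
struct RefSketch<MockFad> { typedef MockFadView type; };

// Hypothetical 1D field whose element access returns the reference-like type.
template <typename T>
class MockField {
 public:
  explicit MockField(std::size_t n) : data_(n) {}
  typename RefSketch<T>::type operator()(std::size_t i) { return data_[i]; }
 private:
  std::vector<T> data_;
};

// Generic code names the trait type instead of taking a pointer.
template <typename T>
void set_entry(MockField<T>& f, std::size_t i, double v) {
  typename RefSketch<T>::type x = f(i);
  x = v;
}

// Helpers to read a value back from either kind of reference-like type.
inline double value_of(double x) { return x; }
inline double value_of(const MockFadView& v) { return v.val(); }
```

With this sketch, set_entry(field, i, v) compiles unchanged for MockField<double> and MockField<MockFad>, which is the property the pointer-based code lacked.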

  4. MiniTensor changes:

    Tensor B(*A(i,j,0,0)); B.fill (*C(i,j,0,0));

    should be replaced with:

    Tensor B(*A,i,j,0,0); B.fill (*C,i,j,0,0);

  5. Teuchos::reduceAll calls on MDFields should be changed. For example, this code:

    Teuchos::RCP< Teuchos::ValueTypeSerializer<int,ScalarT> > serializer =
      workset.serializerManager.template getValue<EvalT>();
    Teuchos::reduceAll(
      *workset.comm, *serializer, Teuchos::REDUCE_SUM,
      this->global_response.size(), &this->global_response[0],
      &this->global_response[0]);

    becomes

    PHAL::reduceAll<ScalarT>(*workset.comm, Teuchos::REDUCE_SUM,
                             this->global_response);

  6. To be continued.
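
As an illustration of the operator[] replacement described above (zeroing Residual), the following sketch shows the two replacement forms on a toy rank-2 field. MockField2D, zero_by_loops and sum_entries are hypothetical names made up for this sketch. The class only mimics the operator(), dimension() and deep_copy() calls used in the example and is not PHX::MDField or Kokkos::View.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-2 field; a stand-in for illustration only.
class MockField2D {
 public:
  MockField2D(std::size_t n0, std::size_t n1, double init)
      : n0_(n0), n1_(n1), data_(n0 * n1, init) {}

  // Multi-index access takes the place of the flat operator[] in the old code.
  double& operator()(std::size_t i, std::size_t j) { return data_[i * n1_ + j]; }

  // Extent of rank r (0 or 1).
  std::size_t dimension(int r) const { return r == 0 ? n0_ : n1_; }

  // Set every entry to the same value.
  void deep_copy(double v) { data_.assign(data_.size(), v); }

 private:
  std::size_t n0_, n1_;
  std::vector<double> data_;
};

// Loop form: visit every (i,j) pair explicitly.
inline void zero_by_loops(MockField2D& f) {
  for (std::size_t i = 0; i < f.dimension(0); ++i)
    for (std::size_t j = 0; j < f.dimension(1); ++j)
      f(i, j) = 0.0;
}

// Sum of all entries, used here only to check the result.
inline double sum_entries(MockField2D& f) {
  double s = 0.0;
  for (std::size_t i = 0; i < f.dimension(0); ++i)
    for (std::size_t j = 0; j < f.dimension(1); ++j)
      s += f(i, j);
  return s;
}
```

Both zero_by_loops(f) and f.deep_copy(0.0) leave every entry at zero. The deep_copy form is shorter and does not depend on the rank of the field.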