Parallel heterogeneous CPU/GPU computing on the diffusion equation with OpenMP, CUDA, Thrust, OpenACC, and TBB
This project can use multiple GPUs interconnected over PCIe or NVLink through peer-to-peer (P2P) access.
Install the NVIDIA HPC SDK:
https://developer.nvidia.com/hpc-sdk
Then set CUDA_PATH to the CUDA directory inside the HPC SDK installation.
Example:
export CUDA_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/22.2/cuda
- nvcc is required for CUDA and Thrust
- nvc++ is required for OpenACC
sudo apt install libtbb-dev
or
Follow this guide: https://www.intel.com/content/www/us/en/developer/articles/guide/get-started-with-tbb.html
Any compatible compiler, such as GNU g++ or LLVM clang++
Each subproject can be built separately as a library:
- OpenACC in directory acc
- OpenMP in directory omp
- Thrust CPU (TBB/OpenMP) in directory thrust_cpu
- Thrust GPU (CUDA) in directory thrust_gpu
All computation models and libraries can also be built together in a single CMake build. In that case every dependency is required, and the build produces an executable binary.
CMake selects the required compiler for each subproject (g++, clang++, nvcc, nvc++).
C++17 is used because the code relies on Thrust templates and on SFINAE-style template techniques.
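For readers unfamiliar with SFINAE, the following is a minimal C++17 sketch of the general technique, not code taken from this project. The names has_data, Path and select_path are hypothetical and chosen only for illustration: a detection trait built on std::void_t picks one of two overloads at compile time through std::enable_if_t.

```cpp
#include <array>
#include <list>
#include <type_traits>
#include <utility>

// Detection trait (hypothetical): true when T has a callable .data() member.
// The primary template is the fallback used when the specialization below
// is discarded by substitution failure.
template <typename T, typename = void>
struct has_data : std::false_type {};

template <typename T>
struct has_data<T, std::void_t<decltype(std::declval<const T&>().data())>>
    : std::true_type {};

enum class Path { Contiguous, Generic };

// Participates in overload resolution only for types exposing .data().
template <typename C, std::enable_if_t<has_data<C>::value, int> = 0>
constexpr Path select_path(const C&) { return Path::Contiguous; }

// Participates in overload resolution for every other type.
template <typename C, std::enable_if_t<!has_data<C>::value, int> = 0>
constexpr Path select_path(const C&) { return Path::Generic; }
```

With these definitions, select_path(std::array<int, 4>{}) resolves to the first overload, while select_path(42) or a call with a std::list<int> resolves to the second. std::void_t is one of the C++17 library additions that make this pattern compact.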
If clang++ is configured to compile CUDA code, you can replace
-DCMAKE_CUDA_COMPILER=nvcc
with
-DCMAKE_CUDA_COMPILER=clang++
cd c++ && \
cmake \
-DCMAKE_BUILD_TYPE=RelWithDebInfo \
-DCMAKE_CUDA_COMPILER=nvcc \
-B build -S . && \
cmake --build build
./build/bin/stencil 10000 10000
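As background on the kind of computation involved, here is a minimal sketch of one explicit time step of the 2D diffusion equation with a five-point stencil. It is an illustration under stated assumptions, not the project's own kernel: the function name diffusion_step, the row-major layout, the fixed boundary values and the parameter alpha (standing for D*dt/h^2) are all hypothetical choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One explicit (forward Euler) step of the 2D diffusion equation on an
// nx-by-ny grid stored row-major. alpha stands for D*dt/h^2 and should not
// exceed 0.25 for this scheme to remain stable. Boundary cells keep their
// values (fixed Dirichlet boundary). 'in' and 'out' must be distinct vectors.
void diffusion_step(const std::vector<double>& in, std::vector<double>& out,
                    std::size_t nx, std::size_t ny, double alpha) {
    assert(in.size() == nx * ny);
    out = in;                      // also copies the boundary values
    if (nx < 3 || ny < 3) return;  // no interior cells to update

    // Rows are independent, so the outer loop can be shared among threads.
    // Without OpenMP enabled, the pragma is ignored and the loop runs serially.
    #pragma omp parallel for
    for (std::size_t j = 1; j < ny - 1; ++j) {
        for (std::size_t i = 1; i < nx - 1; ++i) {
            const std::size_t c = j * nx + i;
            const double neighbours =
                in[c - 1] + in[c + 1] + in[c - nx] + in[c + nx];
            out[c] = in[c] + alpha * (neighbours - 4.0 * in[c]);
        }
    }
}
```

Calling the function repeatedly while swapping the two buffers advances the solution in time. The same loop body maps naturally onto the other programming models named above, for example one GPU thread per interior cell.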