k-Wave is an open-source MATLAB toolbox designed for the time-domain simulation of propagating acoustic waves in 1D, 2D, or 3D. The toolbox has a wide range of functionality, but at its heart is an advanced numerical model that can account for both linear and nonlinear wave propagation, an arbitrary distribution of heterogeneous material parameters, and power-law acoustic absorption. See the k-Wave website (http://www.k-wave.org).
This project is part of the k-Wave toolbox. It accelerates 2D/3D simulations using an optimized CUDA/C++ implementation to run small to moderate grid sizes (e.g., 128x128 to 10,000x10,000 in 2D, or 64x64x64 to 512x512x512 in 3D) on systems with a single NVIDIA GPU. Axisymmetric coordinate systems are not supported.
.
+--Containers - Matrix and output stream containers
+--Data - Small test data
+--GetoptWin64 - Windows version of the getopt routine
+--Hdf5 - HDF5 classes (file access)
+--KSpaceSolver - Solver classes with all the kernels
+--Logger - Logger class for reporting progress and errors
+--MatrixClasses - Matrix classes holding simulation data
+--OutputStreams - Output streams for sampling data
+--Parameters - Parameters of the simulation
+--Utils - Utility routines
Changelog.md - Change log
License.md - License file
Makefile - GNU Makefile
Readme.md - Read me
Doxyfile - Doxygen configuration file
header_bg.png - Doxygen logo
main.cpp - Main file of the project
The source code of kspaceFirstOrder-CUDA is written in the C++11 standard and uses the NVIDIA CUDA 10.x and HDF5 1.10.x libraries. Optionally, the code can be compiled with support for OpenMP 4.0, however, only on Linux systems.
There are a variety of C++ compilers that can be used to compile the source code. The minimum requirements are the GNU C++ compiler 6.0 or the Intel C++ compiler 2018. However, we recommend using either the GNU C++ compiler version 8.3, the Intel C++ compiler version 2019, or the Visual Studio C++ 2017 compiler. Please note that the Visual Studio compilers do not support the OpenMP 4.0 standard, so multithreading has to be disabled. Also be aware that the supported compiler versions may be further limited by the CUDA toolkit. The code can be compiled on 64-bit Linux and Windows. 32-bit systems are not supported due to the memory requirements of even small simulations.
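Before going further, it may be worth confirming what is already installed on your system. Assuming the tools are on your PATH, the versions can be checked with:
g++ --version      # GNU C++ compiler (6.0 or newer required)
nvcc --version     # CUDA toolkit version
h5cc -showconfig   # HDF5 build configuration, if the HDF5 command-line tools are installed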
This section describes the compilation procedure using GNU and Intel compilers on Linux. Windows users are encouraged to download the Visual Studio 2017 project.
Before compiling the code, it is necessary to install a C++ compiler and the CUDA and HDF5 libraries. The GNU compiler is usually part of Linux distributions and is distributed as open source. It can be downloaded from http://gcc.gnu.org/ if necessary.
The Intel compiler can be downloaded from https://software.intel.com/en-us/parallel-studio-xe. The Intel compiler is only free for non-commercial and open-source use.
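On common Linux distributions, the GNU toolchain (and often a packaged HDF5) can also be installed through the system package manager. The package names below are illustrative and differ between distributions:
sudo apt install build-essential libhdf5-dev     # Debian/Ubuntu
sudo yum install gcc-c++ hdf5-devel              # RHEL/CentOS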
The CUDA toolkit can be downloaded from https://developer.nvidia.com/cuda-toolkit-archive. The only supported versions are 9.0 - 10.2; the code is expected to work with the upcoming CUDA 11.0, but we cannot guarantee that.
- Download the 64-bit HDF5 library (https://www.hdfgroup.org/downloads/hdf5/source-code/). Please keep in mind that versions 1.10.x may not be fully compatible with older versions of MATLAB, especially when compression is enabled. In such a case, please download version 1.8.x (https://portal.hdfgroup.org/display/support/HDF5+1.8.21).
- Configure the HDF5 distribution. Enable the high-level library and specify an installation folder by typing:
./configure --enable-hl --prefix=folder_to_install
- Make the HDF5 library by typing:
make -j
- Install the HDF5 library by typing:
make install
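For orientation, the whole HDF5 build might look as follows. The archive version and installation prefix are only examples:
tar -xzf hdf5-1.10.5.tar.gz
cd hdf5-1.10.5
./configure --enable-hl --prefix=$HOME/hdf5
make -j
make install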
- Download CUDA version 10.2 (https://developer.nvidia.com/cuda-toolkit-archive).
- Follow the NVIDIA official installation guide for Windows (http://docs.nvidia.com/cuda/cuda-installation-guide-microsoft-windows/index.html) and Linux (http://docs.nvidia.com/cuda/cuda-installation-guide-linux/).
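After the installation, you can verify that the toolkit and the GPU driver are visible. Both commands are standard parts of a CUDA installation:
nvcc --version    # reports the installed CUDA toolkit version
nvidia-smi        # lists the visible NVIDIA GPUs and the driver version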
After the libraries and the compiler have been installed, you are ready to compile the kspaceFirstOrder-CUDA code.
- Open Makefile.
- The Makefile supports compilation under the GNU compiler or the Intel compiler. Uncomment the desired compiler by removing the # character. The GNU compiler is the default due to Intel 2018's compatibility issues with Ubuntu 18.04.
COMPILER = GNU
#COMPILER = Intel
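Since GNU make gives variables set on the command line precedence over assignments in the Makefile, the compiler can also be selected without editing the file, e.g.:
make COMPILER=Intel -j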
- Select how to link the libraries. Static linking is preferred as it may be a bit faster and does not require CUDA to be installed on the target system; however, the CUDA driver has to be loaded and initialized every time the code is invoked. On some systems, e.g., HPC clusters, it may be better to use dynamic linking and use the system-specific libraries at runtime. SEMI linking provides as fast a start-up as dynamic linking but does not require other libraries to be installed on the target system.
# Everything will be linked statically, may not work on all GPUs
#LINKING = STATIC
# Everything will be linked dynamically
#LINKING = DYNAMIC
# Everything but CUDA will be linked statically
LINKING = SEMI
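Once the binary has been built, the standard ldd tool can be used to check which shared libraries it actually expects at runtime; with SEMI linking, the CUDA libraries should be the only non-system dynamic dependencies:
ldd kspaceFirstOrder-CUDA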
- Set installation paths of the libraries (an example is shown below). Zlib and Szip may be required if the compression is switched on. If using EasyBuild and Lmod (Environment Module System) to manage your software, please load the appropriate modules before running make. The Makefile will then set the paths automatically.
CUDA_DIR = $(CUDA_HOME)
HDF5_DIR = $(EBROOTHDF5)
ZLIB_DIR = $(EBROOTZLIB)
SZIP_DIR = $(EBROOTSZIP)
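On a cluster with an EasyBuild/Lmod software stack, this typically reduces to loading the modules before compiling. The module names and versions below are site-specific examples:
module load CUDA/10.2.89
module load HDF5/1.10.5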
- Select the instruction set and the CPU architecture. For users who will only run the binary on the same machine it was compiled on, the best choice is CPU_ARCH = native. If you are about to run the same binary on different machines or you want to cross-compile the code, you are free to use any of the possible choices, where AVX is the most general but slowest and AVX512 is the most recent instruction set and (most likely) the fastest. Since the vast majority of the computation is done on the GPU, the benefit of using an instruction set newer than AVX is disputable.
CPU_ARCH = native
#CPU_ARCH = AVX
#CPU_ARCH = AVX2
#CPU_ARCH = AVX512
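If you are unsure which of these instruction sets the host CPU supports, on Linux they can be read from /proc/cpuinfo:
grep -o 'avx[0-9a-z_]*' /proc/cpuinfo | sort -u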
- If using a version of CUDA other than 10.x, it may be necessary to remove or add CUDA GPU architectures so that the most recent GPUs are supported.
# What CUDA GPU architectures to include in the binary
CUDA_ARCH = --generate-code arch=compute_30,code=sm_30 \
            --generate-code arch=compute_32,code=sm_32 \
            --generate-code arch=compute_35,code=sm_35 \
            --generate-code arch=compute_37,code=sm_37 \
            --generate-code arch=compute_50,code=sm_50 \
            --generate-code arch=compute_52,code=sm_52 \
            --generate-code arch=compute_53,code=sm_53 \
            --generate-code arch=compute_60,code=sm_60 \
            --generate-code arch=compute_61,code=sm_61 \
            --generate-code arch=compute_62,code=sm_62 \
            --generate-code arch=compute_70,code=sm_70 \
            --generate-code arch=compute_72,code=sm_72 \
            --generate-code arch=compute_75,code=sm_75
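For example, CUDA 11 removes the Kepler sm_30 and sm_32 targets, so these lines would have to be deleted. Because make command-line variables override the Makefile, a trimmed architecture list (here, hypothetically, Volta and Turing only) can also be passed without editing the file:
make -j CUDA_ARCH='--generate-code arch=compute_70,code=sm_70 --generate-code arch=compute_75,code=sm_75'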
- Close the Makefile and compile the source code by typing:
make -j
If you want to clean the distribution, type:
make clean
The CUDA code offers a number of command-line parameters and output flags. For more information, please type:
./kspaceFirstOrder-CUDA --help
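A typical invocation reads the simulation input from an HDF5 file generated by the k-Wave MATLAB toolbox and writes the sampled quantities to an output HDF5 file. The file names below are placeholders:
./kspaceFirstOrder-CUDA -i input.h5 -o output.h5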