This sample demonstrates how to optimize sparse vector - dense vector scaling and sum (cusparseAxpby) by exploiting the Hardware Memory Compression feature of the NVIDIA Ampere architecture.
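For reference, the operation that cusparseAxpby performs, Y = alpha * X + beta * Y with a sparse X and a dense Y, can be sketched on the CPU as below. This is a minimal illustration and not code from the sample: the helper name axpby_reference is hypothetical, and it assumes zero-based, non-repeating indices in X.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CPU reference for Y = alpha * X + beta * Y.
// X is sparse (parallel index/value arrays), Y is dense.
// Hypothetical helper for illustration only; it is not part of cuSPARSE.
// Assumes zero-based indices with no repeated entries.
void axpby_reference(float alpha,
                     const std::vector<int>& x_indices,
                     const std::vector<float>& x_values,
                     float beta,
                     std::vector<float>& y) {
    assert(x_indices.size() == x_values.size());

    // Scale every dense entry by beta.
    for (float& v : y) {
        v *= beta;
    }

    // Add the scaled nonzeros of X at their positions in Y.
    for (std::size_t k = 0; k < x_indices.size(); ++k) {
        const int idx = x_indices[k];
        assert(idx >= 0 && static_cast<std::size_t>(idx) < y.size());
        y[static_cast<std::size_t>(idx)] += alpha * x_values[k];
    }
}
```

Every element of Y is read and written once while very little arithmetic is done per element, so the operation tends to be limited by memory bandwidth. That is the reason reducing DRAM traffic through compression can pay off here.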
Nsight Compute can be used to understand the effect of the memory compression:
nv-nsight-cu-cli --metrics lts__gcomp_input_sectors_compression_achieved_algo_sdc4to1.sum,lts__gcomp_input_sectors_compression_achieved_algo_sdc4to2.sum,fbpa__dram_read_sectors.sum,fbpa__dram_write_sectors.sum,lts__average_gcomp_input_sector_compression_rate.pct ./compression_example
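The DRAM sector counters reported by the command above can be converted to bytes to make two profiling runs easier to compare. The sketch below assumes the 32-byte sector size that Nsight Compute documents. The helper names are hypothetical, and the inputs are whatever values a profiling run reports for fbpa__dram_read_sectors.sum and fbpa__dram_write_sectors.sum.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: one sector is an aligned 32-byte chunk of memory,
// as described in the Nsight Compute documentation.
constexpr std::uint64_t kBytesPerSector = 32;

// Convert a reported sector count to bytes.
constexpr std::uint64_t sectors_to_bytes(std::uint64_t sectors) {
    return sectors * kBytesPerSector;
}

// Hypothetical helper: ratio of total DRAM traffic (reads + writes)
// between a baseline run and a run using compressible memory.
// A value above 1.0 means the second run moved fewer bytes.
double dram_traffic_ratio(std::uint64_t baseline_read_sectors,
                          std::uint64_t baseline_write_sectors,
                          std::uint64_t compressed_read_sectors,
                          std::uint64_t compressed_write_sectors) {
    const std::uint64_t baseline_bytes =
        sectors_to_bytes(baseline_read_sectors + baseline_write_sectors);
    const std::uint64_t compressed_bytes =
        sectors_to_bytes(compressed_read_sectors + compressed_write_sectors);
    assert(compressed_bytes > 0);  // avoid dividing by zero
    return static_cast<double>(baseline_bytes) /
           static_cast<double>(compressed_bytes);
}
```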
- Command line
nvcc -I<cuda_toolkit_path>/include compression_example.c -o compression_example -lcusparse -lcuda
- Linux
make
- Windows/Linux
mkdir build
cd build
cmake ..
make
On Windows, instead of running the last build step, open the Visual Studio Solution that was created and build.
- Supported SM Architectures: SM 8.0, SM 8.6, SM 8.9, SM 9.0
- Supported OSes: Linux, Windows, QNX, Android
- Supported CPU Architectures: x86_64, ppc64le, arm64
- Supported Compilers: gcc, clang, Intel icc, IBM xlc, Microsoft msvc, NVIDIA HPC SDK nvc
- Language: C++14
- CUDA 11.0 toolkit (or above) and compatible driver (see CUDA Driver Release Notes).
- CMake 3.9 or above on Windows