Implementation of Cannon's algorithm using MPI and CUDA to explore the use of compression in CUDA-aware MPI applications. The compression relies on ndzip, which is included in this repository as a submodule.
- CMake >= 3.18
- GCC >= 10.3.0
- Linux (tested on x86_64)
- Boost >= 1.66
- CUDA >= 11.3
- CUDA-aware MPI implementation (tested with OpenMPI v4.1.1)
```
mkdir build
cmake -B build .
cmake --build build -j
```
Two executables will be built:
- Cannons_Algorithm
- Cannons_Algorithm_Comp
Both perform distributed matrix multiplication using Cannon's algorithm. Cannons_Algorithm_Comp additionally compresses the subblocks with ndzip before sending them to another node; the receiving node then decompresses the data.
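To illustrate the compressed exchange, the sketch below shows one shift step of Cannon's algorithm: the local sub-block is compressed, the compressed length is exchanged first (since the compressed size varies), then the payload is sent and decompressed on the receiving side. The `compress_block`/`decompress_block` helpers here are identity placeholders standing in for the ndzip (de)compressor used in this repository, and the buffers are shown in host memory even though the real application uses CUDA-aware MPI with device buffers.

```cpp
#include <mpi.h>
#include <cstddef>
#include <vector>

// Placeholder for the ndzip-based compressor used in this repository.
// Here it is an identity copy; the real implementation produces a
// (usually shorter) compressed buffer.
std::vector<double> compress_block(const std::vector<double>& block) {
    return block;
}

std::vector<double> decompress_block(const std::vector<double>& payload,
                                     std::size_t element_count) {
    std::vector<double> block(payload.begin(), payload.end());
    block.resize(element_count);
    return block;
}

// One shift step of Cannon's algorithm with compression: send the local
// sub-block to `dest` and receive the next sub-block from `src`.
std::vector<double> shift_compressed(const std::vector<double>& block,
                                     int dest, int src, MPI_Comm comm) {
    std::vector<double> packed = compress_block(block);
    int send_count = static_cast<int>(packed.size());
    int recv_count = 0;

    // Exchange the compressed sizes first, since the compressed length varies.
    MPI_Sendrecv(&send_count, 1, MPI_INT, dest, 0,
                 &recv_count, 1, MPI_INT, src, 0,
                 comm, MPI_STATUS_IGNORE);

    std::vector<double> incoming(recv_count);
    // With a CUDA-aware MPI, these buffers could reside in device memory.
    MPI_Sendrecv(packed.data(),   send_count, MPI_DOUBLE, dest, 1,
                 incoming.data(), recv_count, MPI_DOUBLE, src, 1,
                 comm, MPI_STATUS_IGNORE);

    return decompress_block(incoming, block.size());
}
```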
The application takes two parameters:
- size of the matrices
- path to a file with test data
The first parameter sets the size of the matrices. For example, given an argument of 8192, the program executes a distributed multiplication of two matrices, each of size 8192 x 8192.
The second parameter expects the path to a binary file containing double-precision floating-point data, which is used to populate the matrices. The layout of this file has to match that of the double-precision floating-point datasets provided by Martin Burtscher [Link].
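As a rough sketch of the expected file layout, the snippet below reads raw 64-bit floating-point values from such a file. The helper name and the handling of files shorter than the requested matrix are illustrative assumptions; how the values are then mapped onto the distributed sub-blocks is up to the actual implementation.

```cpp
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Reads up to `count` raw double-precision values from a binary file, e.g.
// count = n * n to fill an n x n matrix. Hypothetical helper for illustration.
std::vector<double> read_test_data(const std::string& path, std::size_t count) {
    std::ifstream in(path, std::ios::binary);
    if (!in)
        throw std::runtime_error("cannot open " + path);

    std::vector<double> values(count);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(count * sizeof(double)));

    // If the file holds fewer values than requested, keep only what was read.
    values.resize(static_cast<std::size_t>(in.gcount()) / sizeof(double));
    return values;
}
```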
This implementation has been tested on the HPC system JUSUF at Forschungszentrum Jülich.
An example SLURM jobscript, which was used for testing on JUSUF, is provided.