Contributors: Kyriafinis Vasilis, Koro Erika
Winter Semester 2021 - 2022
Table of Contents
The objective of this assignment is to calculate the number of triangles in a given graph. The graphs given are large graphs with unweighted edges. Their matrices are sparse and very large to the point they don't fit in memory. To overcome this problem we use the CSR (Compressed Sparse Row) representation. The graphs are provided in matrix market format and their CSR vectors must be computed. After that, we square the initial sparse matrix and from the computed matrix we keep only the elements in the places that are non zero in the initial matrix.
The first part of the project is to calculate the number of triangles with serial programming and confirm that the proccess works. Afterwards, the workload must be computed using parallel programming for optimal performance using pthreads, openMP and Cilk.
To setup this repository on your local machine run the following commands on the terminal:
git clone https://github.com/Billkyriaf/pds_assignment_1.git
Or alternatively download and extract the zip file of the repository
This project uses make utilities to build and run the excecutables. Alternatively you can use Cmake but openCilk is not supported in the build script.
Make sure you have an OpenMP capable compiler
You can install OpenCilk for linux operating systems by running the following commands (some commands may require sudo
):
- Download the compressed file from GitHub
wget https://github.com/OpenCilk/opencilk-project/releases/download/opencilk%2Fbeta3/OpenCilk-9.0.1-Linux.tar.gz tar xvzf OpenCilk-9.0.1-Linux.tar.gz mv OpenCilk-9.0.1-Linux/ /usr/local/
t is recommended to use the directory specified above otherwise you will have to edit the Makefile.
- Update the permissions
chmod og+xr /usr/local/OpenCilk-9.0.1-Linux/ -R
- (Optional) Remove the compressed file
rm -f OpenCilk-9.0.1-Linux.tar.gz
The ANSI C library for Matrix Market I/O library is used for reading matrix market files that contain the graphs. The project already contains the files requiered for compiling and linking the files in the ext_libraries directory. The Matrix Market standard can be found here. In general coordinate pattern symmetric matrices are used. If the graphs represented in those matrices are Unweighted the weight for each edge is concidered as 1.
.
├─ Graphs # Graph Data are stored here (.mtx .txt files)
├─ PythonDiagrams # Pycharm Project to create plots
├─ TriangleCalculator # Main Project
└─ README.md
.
├── ...
├── TrangleCalculator
| ├── .idea # Idea folder for Clion project
| ├── Julia # Julia Code for the serial implementation of the algorithm
| ├── src
| | ├── OpenCilk # OpenCilk related source files
| | ├── OpenMP # OpenMP related source files
| | ├── Pthreads # Pthread related source files
| | ├── Serial # Serial algorithm source files
| | ├── ext_libraries # Extrernal dependencies source files
| | ├── libraries # Utility source files
| | ├── opencilk_main.c
| | ├── openmp_main.c
| | ├── pthread_main.c
| | └── serial_main.c
| ├── CMakeLists.txt # Cmake build script for Clion
| └── Makefile # Makefile for linux base distros
|
└─ ...
Simply run make all
in the TriangleCalculator directory. The executable files will be created in the linux_build
directory along with the object
files. To run an executable cd
in the linux_build
directory and run the command ./executable_name arg1 arg2
. Run make clean
to clear all the object and executable files. For more information on the command line arguments read bellow
The code was developed on windows and specificaly in CLion. The easiest way to run the source code is to import the project to a Clion project. The CMakeLists.txt provides targets for executing every module of the program. The only extra step required is to create the configurations and add the command line arguments.
- Serial executable
argv[1] = 'Path to the .mtx file of the praph'
- Pthread executable
argv[1] = 'Path to the .mtx file of the praph' argv[2] = number of threads to spawn and divide the work (be reasonable max number of threads is not specified)
- Serial executable
argv[1] = 'Path to the .mtx file of the praph' argv[2] = number of threads to request from the OpenMP API
- Serial executable
argv[1] = 'Path to the .mtx file of the praph' argv[2] = number of threads to request from the OpenCilk API
Note: Relative file paths are not supported on Windows.
The input of the program is a graph.mtx
file. This is a matrix market
file that contains the relations between the nodes oth the graphs. IMPORTANT note: All the graphs must be unweighted and undirected. This is crusial because only then the Sparse matrix of the graph is symmetrical. The code provided in the libraries directory takes this assumption for granted and does not check if it is true. During the development phase the graphs bellow were used to test the code, along with some small graphs created for debugging purposes. All the files can be found in the Graphs directory.
The proceeding table showcases the graphs and creates a reference performance. (The times displayed were given with the assignment)
Name | Excecution time |
---|---|
DIMACS10: Belgiumosm | 2.8s |
SNAP: com-Youtube | 6.5s |
Mycielski: mycielskian13 | 0.36s |
LAW | 0.38s |
DIMACS10: NACA0015 | 2.46s |
Note: The excecution time for the above graphs is measured on a two core laptop.
Along with the graph.mtx
for most of the graphs graph.csr
files are provided. Those files were created for debbuging purposes and store the Sparse matrices ins CSR format together with information regarding the size of the matrix and the number of non zero elements. Fianly the text files in the directory store performance data for executions with all the implementations (Serial and Parallel) for every graph.