Distributed Text Search - INF560 Project

In this project, we developed a program that quickly searches a large corpus for approximate matches, using the Levenshtein distance as the similarity measure. We focus on parallelizing the search by splitting the data and the workload across nodes with MPI, and across CPU and GPU threads with OpenMP and CUDA. We run experiments to compare different strategies and to tune the hyperparameters of our program. Our final program scales well with the number of nodes: when both the number of nodes and the number of patterns are multiplied by 8, the run time increases by only 25%.
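As an illustration of the kind of search described above, here is a minimal sketch in C. It is not the project's code: the function names, the `MAX_PATTERN` cap, and the choice to compare the pattern against every full-length window of the text are all assumptions made for the example. It computes the Levenshtein distance with a single-row dynamic program and counts the windows whose distance to the pattern is at most `max_dist`, sharing the loop over start positions between OpenMP threads when OpenMP is enabled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical cap so the DP row fits on the stack of each thread. */
#define MAX_PATTERN 255

static int min3(int a, int b, int c) {
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Levenshtein distance between a[0..la) and b[0..lb), single-row DP. */
int levenshtein(const char *a, size_t la, const char *b, size_t lb) {
    assert(lb <= MAX_PATTERN);
    int row[MAX_PATTERN + 1];
    for (size_t j = 0; j <= lb; j++) row[j] = (int)j;
    for (size_t i = 1; i <= la; i++) {
        int diag = row[0];          /* D[i-1][j-1] */
        row[0] = (int)i;
        for (size_t j = 1; j <= lb; j++) {
            int up = row[j];        /* D[i-1][j] */
            int cost = (a[i - 1] == b[j - 1]) ? 0 : 1;
            row[j] = min3(up + 1, row[j - 1] + 1, diag + cost);
            diag = up;
        }
    }
    return row[lb];
}

/* Count start positions i such that the window text[i..i+m) is within
 * max_dist edits of pattern (m = pattern length). Assumption: only
 * full-length windows are considered. */
long count_matches(const char *text, size_t n, const char *pattern, int max_dist) {
    size_t m = strlen(pattern);
    if (m == 0 || m > n) return 0;
    long last = (long)(n - m);
    long count = 0;
#ifdef _OPENMP
#pragma omp parallel for reduction(+:count) schedule(static)
#endif
    for (long i = 0; i <= last; i++) {
        if (levenshtein(text + i, m, pattern, m) <= max_dist) count++;
    }
    return count;
}
```

For example, `count_matches("abcdefabcxef", 12, "abcdef", 1)` counts the exact occurrence at position 0 and the one-substitution occurrence at position 6. In a distributed setting, one plausible arrangement is for each MPI rank to call such a function on its own share of the patterns or of the text and for the counts to be combined afterwards; the sketch leaves that part out.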

Repository contents:

dna
results
src
.gitignore
INF560_Project.pdf
Makefile
README.md
generate_results.py
plots.py
script.sh

FabienRoger/Distributed-Text-Search
