In this project, we developed a program that quickly searches a large corpus for approximate matches using the Levenshtein distance. We focus on parallelizing the search by splitting the data and the workload across nodes using MPI, and across CPU and GPU threads using OpenMP and CUDA. We run experiments to compare different strategies and to tune the hyperparameters of our program. Our final program scales well with the number of nodes: when both the number of nodes and the number of patterns are multiplied by 8, the running time increases by only 25%.
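To make the search criterion concrete, here is a minimal single-threaded Python sketch of a brute-force approximate match search based on the Levenshtein distance. It is not the project's implementation: the function names `levenshtein` and `approximate_matches`, the `max_distance` parameter, and the choice to compare the pattern against fixed-length windows of the text are all assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance between a[:i] and the empty string
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def approximate_matches(text: str, pattern: str, max_distance: int) -> list[int]:
    """Start offsets where a window of len(pattern) is within max_distance.

    Hypothetical helper: the windowing scheme is an assumption, chosen
    because it keeps the brute-force loop easy to split across workers.
    """
    m = len(pattern)
    return [start
            for start in range(len(text) - m + 1)
            if levenshtein(pattern, text[start:start + m]) <= max_distance]
```

Each start offset and each pattern is evaluated independently, which is what makes this kind of search a natural fit for distribution across nodes and threads.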
FabienRoger/Distributed-Text-Search
Brute-force approximate match search, parallelized using MPI, OpenMP, and CUDA.