Introduction
DNA is a polymer composed of two polynucleotide chains that coil around each other to form a double helix. The two DNA strands are known as polynucleotides as they are composed of simpler monomeric units called nucleotides.
Each nucleotide is composed of one of four nitrogen-containing nucleobases (cytosine [C], guanine [G], adenine [A] or thymine [T]), a sugar called deoxyribose, and a phosphate group. The nucleotides are joined to one another in a chain by covalent bonds (known as the phosphodiester linkage) between the sugar of one nucleotide and the phosphate of the next, resulting in an alternating sugar-phosphate backbone.
The algorithms that we will be discussing in this paper includes,
- Brute force (Naive Algorithms),
- Knuth Morris Pattern (KMP)
- Rabin Karp
- Boyer-Moore
We will have a thorough discussion on these, including their application, Pseudo code, importance and problems with the algorithms. We will also discuss why KMP is also called a DNA sequencing algorithm and how it gives a match in less time.
Some Advanced topics from the Literature Survey were also studied in terms of the algoritms performace