
repeated_DNA_sequences

 
 
 
 

Problem

All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

For example,

Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",

Return: ["AAAAACCCCC", "CCCCCAAAAA"].

Solution

Encode each 10-character sequence as a 20-bit integer: there are only four nucleotides, so each character fits in 2 bits.
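As a rough illustration of the encoding step, the sketch below packs a window into an integer at 2 bits per letter. The name `encode` and the particular letter-to-code table are assumptions made for this example; any one-to-one mapping of the four letters would work equally well.

```python
# Assumed 2-bit codes for the four nucleotides (illustrative choice).
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def encode(window: str) -> int:
    """Pack a string of nucleotides into an int, 2 bits per letter."""
    value = 0
    for ch in window:
        # Shift the bits collected so far and append the next letter's code.
        value = (value << 2) | CODE[ch]
    return value
```

With 10 letters the result always fits in 20 bits, so distinct windows map to distinct integers below 2^20.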

Use hash sets to track which encoded sequences have been seen and which have already been found to repeat.
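Putting the two ideas together, a minimal sketch might look like the following. It is one possible reading of the approach above, not necessarily the repository's own code: the function name `find_repeated_dna_sequences`, the `length` parameter, and the letter-to-code table are all assumptions. The key is updated incrementally as the window slides, and a 20-bit mask drops the letter that leaves the window.

```python
from typing import List


def find_repeated_dna_sequences(s: str, length: int = 10) -> List[str]:
    """Return every `length`-letter substring that occurs more than once in s."""
    code = {"A": 0, "C": 1, "G": 2, "T": 3}  # assumed 2-bit mapping
    mask = (1 << (2 * length)) - 1           # keeps the low 2*length bits
    seen = set()       # keys of windows encountered so far
    repeated = set()   # keys already reported as duplicates
    result = []
    key = 0
    for i, ch in enumerate(s):
        # Slide the window: append the new letter, drop the oldest one.
        key = ((key << 2) | code[ch]) & mask
        if i < length - 1:
            continue  # the first full window is not complete yet
        if key in seen:
            if key not in repeated:
                repeated.add(key)
                result.append(s[i - length + 1 : i + 1])
        else:
            seen.add(key)
    return result


print(find_repeated_dna_sequences("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"))
# ['AAAAACCCCC', 'CCCCCAAAAA']
```

Storing integers rather than 10-character strings keeps the sets compact, and each repeated sequence is reported once, in the order its second occurrence is found.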