A simple extension to the Biostrings R package
Provides simple functions that ease %GC calculation and plotting. %GC can be calculated for a single DNAString object or for a DNAStringSet object. For DNAString objects, %GC can also be calculated with a sliding-window approach, and the resulting %GC distribution can be plotted with ggplot.
The package can also find the most frequent pattern of a given length (k) in the provided sequence, and plot the positions of a given pattern in the sequence.
Splits a DNAString object using a sliding-window approach. The window parameter defines the size of the window, and the overlap parameter defines the overlap between consecutive windows.
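The relationship between window and overlap can be illustrated with a minimal, language-neutral sketch in plain Python. It is not the package's R code: the function name `sliding_windows` is hypothetical, and two behaviours are assumptions that the description above does not confirm, namely that each window starts `window - overlap` positions after the previous one and that a trailing chunk shorter than `window` is dropped.

```python
def sliding_windows(seq, window, overlap=0):
    """Split seq into chunks of length `window`.

    Consecutive chunks share `overlap` characters, so each chunk
    starts `window - overlap` positions after the previous one.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap
    # Assumption: only full-length windows are kept; a shorter tail is dropped.
    return [seq[i:i + window] for i in range(0, len(seq) - window + 1, step)]


chunks = sliding_windows("ATGCGCATTA", window=4, overlap=2)
print(chunks)  # ['ATGC', 'GCGC', 'GCAT', 'ATTA']
```

With overlap set to 0 the chunks simply tile the sequence without sharing any positions.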
Returns %GC for the provided DNAString object (x).
Returns %GC for the provided DNAString or DNAStringSet object (x). For a DNAStringSet, a vector containing the %GC of each sequence in the set is returned.
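As a rough illustration of the calculation itself, the plain-Python sketch below computes %GC for one sequence and for a collection of sequences. The names `gc_percent` and `gc_percent_set` are hypothetical and are not the package's functions. The sketch also assumes that %GC is reported on a 0 to 100 scale and that the denominator is the full sequence length, so ambiguity codes such as N count as non-GC; the package may handle either point differently.

```python
def gc_percent(seq):
    """Return %GC (0-100) of a single sequence; 0.0 for an empty one."""
    seq = seq.upper()
    if not seq:
        return 0.0
    # Assumption: divide by the full length, so N and other
    # ambiguity codes are treated as non-GC positions.
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_percent_set(seqs):
    """Return one %GC value per sequence, in input order."""
    return [gc_percent(s) for s in seqs]


print(gc_percent("ATGC"))                        # 50.0
print(gc_percent_set(["GGCC", "ATAT", "ATGC"]))  # [100.0, 0.0, 50.0]
```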
Calculates %GC of the provided DNAString object (x) using a sliding-window approach. The window parameter defines the size of the window, and the overlap parameter defines the overlap between consecutive windows. Returns a numeric vector containing the %GC of each chunk.
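The sliding-window calculation can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the package's implementation: the name `gc_sliding` is hypothetical, windows are assumed to advance by `window - overlap` positions, an incomplete trailing window is assumed to be dropped, and %GC is assumed to be on a 0 to 100 scale.

```python
def gc_sliding(seq, window, overlap=0):
    """Return %GC (0-100) for each full window along seq."""
    if window <= 0:
        raise ValueError("window must be positive")
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    seq = seq.upper()
    step = window - overlap
    values = []
    # Assumption: a trailing chunk shorter than `window` is ignored.
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        values.append(100.0 * gc / window)
    return values


profile = gc_sliding("ATGCGCATTA", window=4, overlap=2)
print(profile)  # [50.0, 100.0, 50.0, 0.0]
```

A larger overlap produces more windows and a smoother profile, at the cost of neighbouring values being more strongly correlated.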
Calculates %GC of the provided DNAString object (x) using a sliding-window approach and plots the result. The window parameter defines the size of the window, and the overlap parameter defines the overlap between consecutive windows.
Finds the most frequently occurring pattern of size k in the DNAString object (x).
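One straightforward way to do this is to count every overlapping substring of length k and keep the most common one. The plain-Python sketch below shows that idea together with a helper that lists where a pattern occurs. Both function names are hypothetical, and the package may use a different method. The sketch assumes that ties go to the pattern seen first, and it reports positions as 0-based offsets, whereas R indexing is 1-based.

```python
from collections import Counter


def most_frequent_kmer(seq, k):
    """Return (pattern, count) for the most frequent substring of length k."""
    if k <= 0 or k > len(seq):
        raise ValueError("k must satisfy 0 < k <= len(seq)")
    # Count all overlapping substrings of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Assumption: on a tie, the pattern encountered first wins
    # (max returns the first maximal key in insertion order).
    best = max(counts, key=counts.get)
    return best, counts[best]


def kmer_positions(seq, pattern):
    """Return 0-based start offsets of pattern, overlapping matches included."""
    k = len(pattern)
    return [i for i in range(len(seq) - k + 1) if seq[i:i + k] == pattern]


pattern, count = most_frequent_kmer("ATATATGC", k=2)
print(pattern, count)                        # AT 3
print(kmer_positions("ATATATGC", pattern))   # [0, 2, 4]
```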