ngs_RMDUP
Stephen Fisher edited this page Aug 7, 2014 · 3 revisions
This module will remove duplicate reads.
Usage:
    ngs.sh rmdup [-i inputDir] [-se] sampleID
Input:
    sampleID/inputDir/unaligned_1.fq
    sampleID/inputDir/unaligned_2.fq (paired-end reads)
Output:
    sampleID/rmdup/unaligned_1.fq
    sampleID/rmdup/unaligned_2.fq
    sampleID/rmdup/sampleID.rmdup.stats.txt
Requires:
    removeDuplicates.py
Options:
    -i inputDir - location of source files (default: init).
    -se - single-end reads (default: paired-end)
Remove duplicate reads. Two reads are considered duplicates if they match exactly. For paired-end reads, both mates of a pair must match exactly for the pair to be considered a duplicate. This step is very RAM intensive and can require up to three times the input file size in RAM (e.g. if your fastq files total 20 GB, then up to 60 GB of RAM may be used when removing duplicates).
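The exact-match rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the code in removeDuplicates.py: the function names `read_fastq` and `remove_duplicates` are hypothetical, and the sketch assumes that "exactly match" means the sequence lines are identical (read names and quality strings are ignored). Keeping every distinct sequence in an in-memory set is one plausible reason a step like this needs RAM on the order of the input size.

```python
from itertools import islice, repeat


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from an iterable of lines."""
    lines = iter(handle)
    while True:
        record = [line.rstrip("\n") for line in islice(lines, 4)]
        if len(record) < 4:
            # End of input (or a truncated trailing record): stop.
            return
        yield tuple(record)


def remove_duplicates(reads1, reads2=None):
    """Return (unique_records, total_count), keeping the first copy of each read.

    Assumption: duplicates are detected by comparing sequence lines only.
    For paired-end data, a pair is a duplicate only when BOTH mate
    sequences have been seen together before.
    """
    seen = set()      # every distinct sequence (or sequence pair) is held in RAM
    unique = []
    total = 0
    mates = reads2 if reads2 is not None else repeat(None)
    for r1, r2 in zip(reads1, mates):
        total += 1
        key = (r1[1], r2[1] if r2 is not None else None)
        if key in seen:
            continue
        seen.add(key)
        unique.append((r1, r2))
    return unique, total
```

For paired-end input, pass one record stream per mate file, e.g. `remove_duplicates(read_fastq(f1), read_fastq(f2))` with two open file handles; for single-end input, omit the second argument. The difference between the returned total and the number of unique records is the number of duplicates removed.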