Nextflow workflow used to run kneaddata

FredHutch/nf-kneaddata

nf-kneaddata

Nextflow workflow used to run kneaddata

KneadData (BioBaker)

The KneadData utility is used to preprocess metagenomic data for microbiome analysis by removing contaminating host sequences and running quality trimming.

A selection of reference databases is provided so that the appropriate host genome can be used for decontamination.

Outputs

Four types of output files will be created (where $INPUTNAME is the basename of $INPUT):

  1. The final file of filtered sequences after trimming
  • $OUTPUT_DIR/$INPUTNAME_kneaddata.fastq
  2. The contaminant sequences from testing against a database
  • $OUTPUT_DIR/$INPUTNAME_kneaddata_$DATABASE_$SOFTWARE_contam.fastq
  3. The log file from the run
  • $OUTPUT_DIR/$INPUTNAME_kneaddata.log
  4. The FASTQ file of trimmed sequences
  • $OUTPUT_DIR/$INPUTNAME_kneaddata.trimmed.fastq
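
The naming pattern above can be sketched in Python as a quick way to check which files to expect from a run. This is only an illustration, not part of the workflow: the function names, the dictionary keys, and the example values `hg37` and `bowtie2` are hypothetical, and stripping `.fastq`/`.fq` (optionally gzipped) suffixes to obtain $INPUTNAME is an assumption about what "basename" means here.

```python
from pathlib import Path

# Assumption: $INPUTNAME is the file name with a FASTQ suffix removed.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz", ".fastq", ".fq")


def input_name(input_path):
    """Return the basename of the input file without its FASTQ suffix."""
    name = Path(input_path).name
    for suffix in FASTQ_SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def expected_outputs(input_path, output_dir, database, software):
    """Build the four output paths described in the list above."""
    base = input_name(input_path)
    out = Path(output_dir)
    return {
        "filtered": out / f"{base}_kneaddata.fastq",
        "contaminants": out / f"{base}_kneaddata_{database}_{software}_contam.fastq",
        "log": out / f"{base}_kneaddata.log",
        "trimmed": out / f"{base}_kneaddata.trimmed.fastq",
    }


if __name__ == "__main__":
    # Hypothetical inputs, used only to show the resulting file names.
    for label, path in expected_outputs(
        "reads/sample1.fastq.gz", "results", "hg37", "bowtie2"
    ).items():
        print(f"{label}: {path}")
```

A helper like this could be used after a run to confirm that all four files exist before passing them to downstream steps.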

  • By default, Trimmomatic is run with the arguments “SLIDINGWINDOW:4:20 MINLEN:70”, where the minimum length (MINLEN) is computed as 70 percent of the length of the input reads.
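
To make the minimum-length rule concrete, the following Python sketch builds the Trimmomatic argument string for a given read length. It is a reading of the sentence above, not KneadData's own code: the function name and parameters are hypothetical, and rounding the result down to a whole number of bases is an assumption.

```python
def trimmomatic_args(read_length, window=4, quality=20, min_percent=70):
    """Build the default-style Trimmomatic arguments for one read length.

    MINLEN is taken as min_percent of the read length. Integer arithmetic
    is used so the result is rounded down (an assumption, see lead-in).
    """
    min_len = read_length * min_percent // 100
    return f"SLIDINGWINDOW:{window}:{quality} MINLEN:{min_len}"


if __name__ == "__main__":
    # Print the argument string for a few common read lengths.
    for length in (100, 150, 250):
        print(length, "->", trimmomatic_args(length))
```

Under this reading, 100-base reads give a MINLEN of 70, while longer reads give a proportionally larger cutoff.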
