RRBS_nf

Nextflow pipeline for analyzing reduced representation bisulfite sequencing (RRBS) data from NuGen Ovation libraries.

Analysis steps

FastQC before trimming (FastQC)
Trimming (trim_galore)
- For the NuGen Ovation kit, use the trimming procedure provided by the manufacturer.
- For the Illumina Epic TruSeq kit, use trim_galore's automatic adapter detection.
FastQC after trimming (FastQC)
Bismark alignment (Bismark)
Bismark methylation extraction (Bismark)
Generate bigWig files for visualization (UCSC utilities)
Generate summary statistics based on group conditions

Command line options

--genomedir Directory containing the genome file. The file can be in fa, fasta, fa.gz or fasta.gz format.
--library Library kit used, either nugen or epic. Default is nugen. This determines how trimming is performed.
--single_end Use this flag if the library is single-end. Off by default.
--reads FASTQ files in fq, fastq, fq.gz or fastq.gz format. If library is single-end, please use --single_end --reads '*.fq.gz'. If library is paired-end, please use --reads '*{1,2}.fq.gz'.
--outdir Output directory. Default is results.
--aligner Aligner used by Bismark, can be hisat2 or bowtie2. Default is hisat2.
--species Species to use in summary statistics, can be mm10 or hg38.
--samplesheet CSV sample sheet containing sample grouping information. The file has no header row; the first column is the FASTQ file name (for paired-end data, use only the R1 file name) and the second column is the group. Currently requires more than 2 samples in each group.
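
The sample sheet layout described above can be sketched as follows. This is only an illustration: the file names and group labels are made up, and the group-size check is a standalone helper written for this example, not part of the pipeline.

```shell
set -eu

# Hypothetical sample sheet (file and group names are invented for illustration).
# No header row: column 1 = FASTQ file name (R1 only for paired-end data),
# column 2 = group label.
sheet=$(mktemp)
cat > "$sheet" <<'EOF'
ctrl_1_R1.fq.gz,control
ctrl_2_R1.fq.gz,control
ctrl_3_R1.fq.gz,control
treat_1_R1.fq.gz,treated
treat_2_R1.fq.gz,treated
treat_3_R1.fq.gz,treated
EOF

# Count samples per group; sort so the report order is stable.
group_counts=$(awk -F',' '{ n[$2]++ } END { for (g in n) print g "," n[g] }' "$sheet" | sort)
echo "$group_counts"

# List any group that does not have more than 2 samples.
small_groups=$(echo "$group_counts" | awk -F',' '$2 <= 2 { print $1 }')

rm -f "$sheet"

if [ -n "$small_groups" ]; then
    echo "Groups with too few samples: $small_groups" >&2
    exit 1
fi
echo "All groups have more than 2 samples"
```

Running a check like this before launching the pipeline can catch an undersized group early, rather than at the summary statistics step.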

Optional arguments related to Nextflow

-resume Resume previous analysis.
-profile Profile to use. Profiles can be edited in the nextflow.config file. Use -profile slurm to run on the SLURM HPC scheduler.

Inputs

This pipeline works with FASTQ files (fastq, fq, fastq.gz, fq.gz)

If library is single-end, please use --single_end --reads '*.fq.gz'.
If library is paired-end, please use --reads '*{1,2}.fq.gz'.
Always quote input file patterns that contain wildcards, so that the shell passes the pattern to the pipeline unexpanded.
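
The effect of quoting can be seen in a small sketch. The file names and the count_args helper below are hypothetical and exist only for this demonstration. The single-end pattern is used here because it behaves the same in any POSIX shell; in bash, an unquoted brace pattern such as *{1,2}.fq.gz is expanded by the shell in a similar way.

```shell
set -eu

# Scratch directory with hypothetical single-end FASTQ file names.
workdir=$(mktemp -d)
touch "$workdir/s1.fq.gz" "$workdir/s2.fq.gz" "$workdir/s3.fq.gz"

# Helper (not part of the pipeline): report how many arguments it received.
count_args() { echo "$#"; }

# Unquoted: the shell expands the wildcard into one argument per matching file.
unquoted_count=$(count_args "$workdir"/*.fq.gz)

# Quoted: the pattern is passed through as a single literal argument.
quoted_count=$(count_args "$workdir/*.fq.gz")

echo "unquoted: $unquoted_count argument(s)"   # unquoted: 3 argument(s)
echo "quoted:   $quoted_count argument(s)"     # quoted:   1 argument(s)

rm -rf "$workdir"
```

With the unquoted form, only the first file name would be taken as the value of --reads and the remaining names would be passed as stray arguments.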

Bismark parameters

Bismark can run with either hisat2 (default) or bowtie2 as the aligner; select one with --aligner.

Quick start

nextflow main.nf --reads 'data/*R{1,2}.fq.gz' --genomedir data/ref/ -profile slurm --samplesheet samplesheet.csv --species mm10

nextflow main.nf --single_end --reads 'data/*.fq.gz' --genomedir data/ref/ -profile slurm --samplesheet samplesheet.csv --species mm10

-profile slurm enables the SLURM resource manager. The relevant settings for specific tasks can be changed in the nextflow.config file.

Updates:

Adapted from nf-core methyl-seq pipeline (https://github.com/nf-core/methylseq)
Simplified, with a NuGen-specific trimming process added (https://github.com/nugentechnologies/NuMetRRBS)

