Skip to content

Zymo-Research/aladdin-bisulfitebolt

Repository files navigation

nxf-bisulfitebolt Nextflow Pipeline

This is a bioinformatics analysis pipeline used for analyzing bisulfite sequencing data. The pipeline is built with Nextflow and BS Bolt suite.

Analysis components

The pipeline includes the following analyes:

  1. Trimming input sequencing reads using Cutadapt.

  2. Evaluating reads quality after trimming using FastQC.

  3. Aligning trimmed reads and create methylation matrix using BS Bolt.

  4. Generating a MultiQC report with all important metrics.

Generating Index

We generate the index before running the pipelie. Here are the parameters: RRBS Index, MSPI Cut Format, 40bp Lower Fragment Bound, and 400bp Upper Fragment Bound.

python3 -m bsbolt Index -G hg38_selected.fa -DB DB_indexFull_test -rrbs -rrbs-cut-format C-CGG -rrbs-lower 40 -rrbs-upper 400

Quick Start

To run the pipeline, clone the dev branch and run the following command:

nextflow run main.nf --index /path/to/index/genome/ \
                        --project project_name \
                        --design /path/to/design.csv \
                        2&>1 | tee headnode.log

The design CSV file must have the following format.

group,sample,read_1,read_2
Control,Sample1,s3://mybucket/this_is_s1_R1.fastq.gz,s3://mybucket/this_is_s1_R2.fastq.gz
Control,Sample2,s3://mybucket/this_is_s2_R1.fastq.gz,s3://mybucket/this_is_s2_R2.fastq.gz
Experiment,Sample3,s3://mybucket/that_is_s3_R1.fastq.gz,
Experiment,Sample4,s3://mybucket/that_be_s4_R1.fastq.gz,