This Bash script automates fastq alignment, sorting of SAM files, and BAM to FASTA conversion using minimap2 and samtools for easy analysis of sequencing data.

rajithadp/NanoDengue

NanoDengue

Introduction

nanoDengue is a Bash script that performs two operations on Nanopore sequencing fastq data and generates a consensus FASTA file. The first operation generates quality control plots using NanoPlot. The second operation aligns the fastq files to a reference genome using Minimap2, sorts the resulting SAM file into a BAM file using Samtools, and generates a consensus sequence from that BAM file using Samtools.

Requirements

The following software tools are required to run nanoDengue:

  • NanoPlot
  • Minimap2
  • Samtools
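Before running the script, it can help to confirm that the three tools are reachable on your PATH. The following is a minimal sketch, not part of nanoDengue itself: check_tools is a hypothetical helper name, and the command names passed to it (NanoPlot, minimap2, samtools) are the names these tools usually install under, which may differ on your system.

```shell
# Hypothetical helper, not part of nanoDengue.
# Prints each command name that is not found on PATH, one per line,
# and returns 1 if at least one is missing, 0 otherwise.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            missing=1
        fi
    done
    return "$missing"
}

# Example call (assumed command names; adjust if your installation differs):
# check_tools NanoPlot minimap2 samtools || echo "Install the tools listed above" >&2
```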

Usage

  1. Download the nanoDengue script to your local machine.
  2. Move the script to the directory containing the fastq_pass folder.
  3. Open the terminal and navigate to the directory containing the script.
  4. Run the script using the following command:
  bash nanoDengue.sh

Script Overview

The nanoDengue script contains two main operations that are performed sequentially on each fastq file found in the fastq_pass folder.

Operation 1: NanoPlot

The first operation generates quality control plots using NanoPlot. The following steps are performed:

  1. Loop through all fastq.gz files in the fastq_pass folder.
  2. Create a new output directory for each fastq.gz file in the format "Nano//".
  3. Run NanoPlot on each fastq.gz file using the following parameters:
    • --fastq: specify the input fastq.gz file.
    • --plots: specify the type of plots to generate (kde, hex, dot).
    • --outdir: specify the output directory to save the plots.
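The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the script's actual code: run_nanoplot_qc is a hypothetical function name, and the per-file output layout (one directory per file under an output root, named after the file) is an assumption, since the exact directory format is not spelled out here. The plot types accepted by --plots can also vary between NanoPlot versions.

```shell
# Sketch of Operation 1. run_nanoplot_qc and its arguments are hypothetical.
#   $1: folder holding the *.fastq.gz files (default: fastq_pass)
#   $2: root folder for the per-file output directories (assumed default: Nano)
run_nanoplot_qc() {
    local in_dir="${1:-fastq_pass}" out_root="${2:-Nano}"
    local fq sample out_dir
    for fq in "$in_dir"/*.fastq.gz; do
        [ -e "$fq" ] || continue               # skip when the glob matched nothing
        sample="$(basename "$fq" .fastq.gz)"   # file name without the .fastq.gz suffix
        out_dir="$out_root/$sample"            # assumed layout, not confirmed by the README
        mkdir -p "$out_dir"
        NanoPlot --fastq "$fq" --plots kde hex dot --outdir "$out_dir"
    done
}

# Example call from the directory that contains fastq_pass:
# run_nanoplot_qc fastq_pass Nano
```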

Operation 2: Minimap2 and Samtools

The second operation aligns fastq files to a reference genome using Minimap2, sorts the resulting SAM file using Samtools, and generates a consensus sequence from the resulting BAM file using Samtools. The following steps are performed:

  1. Set the path to the reference genome file.
  2. Gunzip all fastq.gz files in the fastq_pass folder.
  3. Loop through all fastq files in the fastq_pass folder.
  4. Generate an output SAM file name based on the input fastq file name.
  5. Run Minimap2 to align the fastq file to the reference genome and output the resulting SAM file.
  6. Generate an output BAM file name based on the input SAM file name.
  7. Sort the SAM file and output the resulting BAM file.
  8. Generate an index file for the BAM file.
  9. Generate an output consensus fasta file name based on the input BAM file name.
  10. Generate a consensus fasta file from the BAM file using Samtools.
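As a rough sketch, the ten steps above could look like the function below. Several details are assumptions rather than the script's confirmed behavior: align_and_call_consensus is a hypothetical name, the reference path is passed as an argument instead of being set inside the script, the Minimap2 preset -x map-ont is a common choice for Nanopore reads, and the consensus step assumes the samtools consensus subcommand, which is only available in samtools 1.16 and later.

```shell
# Sketch of Operation 2. Function name and arguments are hypothetical.
#   $1: path to the reference genome FASTA
#   $2: folder holding the fastq(.gz) files (default: fastq_pass)
align_and_call_consensus() {
    local ref="$1" in_dir="${2:-fastq_pass}"
    local gz fq stem sam bam fasta

    # Step 2: decompress every fastq.gz file in place.
    for gz in "$in_dir"/*.fastq.gz; do
        if [ -e "$gz" ]; then gunzip "$gz"; fi
    done

    # Steps 3-10: process each fastq file in turn.
    for fq in "$in_dir"/*.fastq; do
        [ -e "$fq" ] || continue
        stem="${fq%.fastq}"                       # path without the .fastq suffix
        sam="${stem}_minimap2.sam"
        bam="${stem}_minimap2_sorted.bam"
        fasta="${stem}_minimap2_sorted.consensus.fasta"

        minimap2 -ax map-ont "$ref" "$fq" > "$sam"        # assumed preset for Nanopore reads
        samtools sort -o "$bam" "$sam"                    # coordinate-sorted BAM
        samtools index "$bam"                             # writes "$bam.bai"
        samtools consensus -f fasta -o "$fasta" "$bam"    # assumes samtools >= 1.16
    done
}

# Example call (the reference path is a placeholder):
# align_and_call_consensus /path/to/reference.fasta fastq_pass
```

Writing all outputs next to the input fastq file keeps each sample's files together, which matches the output locations described for the script.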

Output

The following output files are generated for each fastq file processed by the script:

  • Three quality control plots in PNG format located in the "Nano//" folder.
  • One SAM file located in the same folder as the input fastq file.
  • One BAM file located in the same folder as the input fastq file.
  • One BAM index (.bai) file located in the same folder as the input fastq file.
  • One consensus fasta file located in the same folder as the input fastq file.

Example Output

If the input fastq file name is "sample.fastq", the output files generated by the script will be:

  • "sample_minimap2.sam"
  • "sample_minimap2_sorted.bam"
  • "sample_minimap2_sorted.bam.bai"
  • "sample_minimap2_sorted.consensus.fasta"
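The naming pattern in this example can be reproduced with plain shell suffix removal. The snippet below is only an illustration of that pattern: output_names is a hypothetical helper, not a function from the script.

```shell
# Hypothetical helper: print the output file names expected for one fastq file.
output_names() {
    local stem="${1%.fastq}"    # strip the trailing .fastq suffix
    echo "${stem}_minimap2.sam"
    echo "${stem}_minimap2_sorted.bam"
    echo "${stem}_minimap2_sorted.bam.bai"
    echo "${stem}_minimap2_sorted.consensus.fasta"
}

output_names sample.fastq
# prints:
#   sample_minimap2.sam
#   sample_minimap2_sorted.bam
#   sample_minimap2_sorted.bam.bai
#   sample_minimap2_sorted.consensus.fasta
```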
