Bioinformatics pipeline for the extraction of all tumor features necessary for the selection of clonal mutations from whole-exome sequencing (WES) data in a tumor-normal setting

maxanes/Feature-Extraction-Workflow-for-Target-Mutation-Selection

Feature-Extraction-Workflow-for-Target-Mutation-Selection

A bioinformatics pipeline that extracts the tumor features needed to select clonal mutations from whole-exome sequencing (WES) data in a tumor-normal setting. The pipeline uses two publicly available tools: Mutect2, for detecting single nucleotide variants (SNVs) and insertion and deletion (indel) variants (available at https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2), and PureCN, for estimating tumor features such as cancer cell fraction (CCF), mutation multiplicity, tumor sample purity, and ploidy (available at https://bioconductor.org/packages/devel/bioc/vignettes/PureCN/inst/doc/Quick.html).
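To make the relationship between these tumor features concrete, the sketch below applies the commonly used point estimate that links a variant allele fraction (VAF) to CCF, given purity, local tumor copy number, and multiplicity. It is not PureCN's implementation (PureCN fits a likelihood model). The function name, the diploid-normal default, and the cap at 1.0 are assumptions made for illustration.

```python
def cancer_cell_fraction(vaf, purity, tumor_copy_number, multiplicity,
                         normal_copy_number=2):
    """Point estimate of CCF from an observed variant allele fraction.

    Assumes the expected VAF of a somatic mutation is
        purity * multiplicity * CCF
        / (purity * tumor_copy_number + (1 - purity) * normal_copy_number)
    and solves that expression for CCF.
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    if multiplicity < 1:
        raise ValueError("multiplicity must be at least 1")

    # Average number of copies of the locus per cell in the mixed sample.
    total_copies = (purity * tumor_copy_number
                    + (1 - purity) * normal_copy_number)
    ccf = vaf * total_copies / (purity * multiplicity)

    # Sampling noise can push the raw estimate above 1; cap it (assumption).
    return min(ccf, 1.0)


if __name__ == "__main__":
    # Hypothetical mutation: VAF 0.15 in a 60% pure sample, diploid locus,
    # mutation present on one copy.
    print(cancer_cell_fraction(vaf=0.15, purity=0.6,
                               tumor_copy_number=2, multiplicity=1))
```

A mutation with an estimate close to 1 is present in nearly all cancer cells, which is what makes it a candidate clonal mutation; the cut-off used for that call is not specified here.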

A link to the paper will be added once it is available.

Requirements

A working conda installation (installation instructions: https://conda.io/projects/conda/en/latest/user-guide/install/index.html)

Input files

Tumor and normal BAM files for the queried sample
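GATK tools generally expect indexed BAM files, so it can help to check the tumor-normal pair before starting. The following is a minimal sketch of such a check; the function names and the example paths are hypothetical and are not part of this repository.

```python
from pathlib import Path


def find_bam_index(bam_path):
    """Return the index file next to a BAM, or None if none is found."""
    bam = Path(bam_path)
    candidates = [
        Path(str(bam) + ".bai"),   # sample.bam.bai
        bam.with_suffix(".bai"),   # sample.bai
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def check_bam_pair(tumor_bam, normal_bam):
    """Return a list of human-readable problems; empty if the pair looks fine."""
    problems = []
    for label, path in (("tumor", tumor_bam), ("normal", normal_bam)):
        bam = Path(path)
        if bam.suffix != ".bam":
            problems.append(f"{label}: {bam} does not end in .bam")
        elif not bam.is_file():
            problems.append(f"{label}: {bam} not found")
        elif find_bam_index(bam) is None:
            problems.append(f"{label}: no .bai index found for {bam}")
    return problems


if __name__ == "__main__":
    # Hypothetical paths; replace with the BAM files of the queried sample.
    for problem in check_bam_pair("data/tumor.bam", "data/normal.bam"):
        print(problem)
```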

Provided here

  • A bash script, create_interval_and_pon.sh, which is run once to create the files (NormalDB and interval.file) needed to run the sample workflow in workflow.py
  • workflow.py, which defines the gwf pipeline that creates the desired output files for the queried sample (a gwf tutorial is available at https://gwf.app/guide/tutorial)
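As an illustration of what a step in such a pipeline might look like, the sketch below assembles a tumor-normal Mutect2 command line from a set of paths. The flags (-R, -I, -normal, -L, -O, --panel-of-normals) are standard GATK4 Mutect2 options, but the function, its parameter names, and the file names are assumptions; the actual commands in workflow.py may differ.

```python
import shlex


def mutect2_command(reference, tumor_bam, normal_bam, normal_sample,
                    intervals, output_vcf, panel_of_normals=None):
    """Build a shell command string for a tumor-normal Mutect2 call."""
    args = [
        "gatk", "Mutect2",
        "-R", reference,
        "-I", tumor_bam,
        "-I", normal_bam,
        "-normal", normal_sample,   # sample name of the normal in its BAM header
        "-L", intervals,            # restrict calling to the exome targets
        "-O", output_vcf,
    ]
    if panel_of_normals:
        args += ["--panel-of-normals", panel_of_normals]
    # Quote each argument so paths with spaces survive the shell.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(mutect2_command(
        reference="ref.fa",
        tumor_bam="tumor.bam",
        normal_bam="normal.bam",
        normal_sample="NORMAL",
        intervals="targets.interval_list",
        output_vcf="sample.mutect2.vcf.gz",
    ))
```

In a gwf workflow, a string like this would typically become the body of a target, for example `gwf.target("mutect2", inputs=[...], outputs=[...]) << command`, so that gwf can track the step's input and output files.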

Quick start

1. Create the conda environment from the provided gatk-purecn.yml

conda env create -f gatk-purecn.yml

2. In the project folder, create a sub-folder for generating the NormalDB and the interval file needed to run the pipeline

./create_interval_and_pon.sh

3. To set up the Slurm back-end, run the following command (described at https://gwf.app/guide/tutorial)

gwf config set backend slurm

4. Once the necessary files have been generated, adjust the paths in workflow.py (under config) and run the gwf workflow for the queried sample(s)

gwf run
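Because step 4 depends on the paths set under config, a quick check that every configured path exists can catch typos before jobs are submitted. The sketch below assumes the config is a plain dictionary of paths; the key names and file names are hypothetical and may not match those used in workflow.py.

```python
from pathlib import Path


def missing_paths(config):
    """Return the config entries whose paths do not exist on disk."""
    return {key: path for key, path in config.items()
            if not Path(path).exists()}


if __name__ == "__main__":
    # Hypothetical config; the real keys live in workflow.py under config.
    config = {
        "tumor_bam": "data/tumor.bam",
        "normal_bam": "data/normal.bam",
        "normal_db": "reference_files/normalDB.rds",
        "interval_file": "reference_files/interval.file",
    }
    for key, path in missing_paths(config).items():
        print(f"{key}: {path} does not exist")
```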
