Universal Snakemake pipeline for processing paired-end RNA-seq data deposited at SRA

This pipeline downloads the data, aligns the reads against any chosen genome, and reports gene counts as well as TPM for all samples.

Multiple runs (technical replicates) will be merged.
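
As a rough illustration of the merging step, the sketch below groups run accessions from a RunTable by sample. It is not taken from the Snakefile: the grouping column (`BioSample`), the helper name `group_runs_by_sample`, and the accessions in the example table are all assumptions made for illustration. Check the Snakefile for the column it actually uses.

```python
import csv
import io
from collections import defaultdict


def group_runs_by_sample(run_table_csv, sample_column="BioSample", run_column="Run"):
    """Map each sample to the list of run accessions that belong to it.

    The column names are assumptions; SRA RunTables usually contain a "Run"
    column, but the sample identifier used by the pipeline may differ.
    """
    runs_by_sample = defaultdict(list)
    for row in csv.DictReader(io.StringIO(run_table_csv)):
        runs_by_sample[row[sample_column]].append(row[run_column])
    return dict(runs_by_sample)


# Hypothetical RunTable excerpt; the accessions are made up.
example_table = """Run,BioSample,LibraryLayout
SRR0000001,SAMN0000001,PAIRED
SRR0000002,SAMN0000001,PAIRED
SRR0000003,SAMN0000002,PAIRED
"""

groups = group_runs_by_sample(example_table)
for sample, runs in groups.items():
    # Runs listed under the same sample are the technical replicates
    # that would be merged before quantification.
    print(sample, runs)
```

A sample with more than one run in this mapping corresponds to a set of technical replicates; a sample with a single run needs no merging.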

Setup

Clone this repository.
Download a RunTable from SRA that comprises the samples of interest and place it into this directory.
Edit the file config.yaml, providing the web locations of the genome FASTA and GFF files.
Create the conda environment specified in sra_star_rsem.yaml:

conda env create --name sra_star_rsem --file sra_star_rsem.yaml

Activate the conda environment:

conda activate sra_star_rsem

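Perform a dry run that prints the planned jobs and shell commands without executing them:

snakemake -np

Render the workflow graph as an SVG file (requires Graphviz dot):

snakemake --dag | dot -Tsvg > dag.svg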

Run the pipeline locally with 16 cores:

snakemake --cores 16

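Or submit the jobs to a SLURM cluster, taking the partition, task count, and memory for each job from cluster.json:

snakemake -j 999 --cluster-config cluster.json --cluster "sbatch -p {cluster.partition} -n {cluster.n} --mem {cluster.mem}"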
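
To make the `{cluster.partition}`, `{cluster.n}` and `{cluster.mem}` placeholders more concrete, here is a minimal sketch of how such a cluster configuration resolves into an sbatch call. It only mimics the behaviour: Snakemake takes the `__default__` entry and overlays any entry named after the rule being submitted. The JSON values, the rule name `star_align`, and the helper `submit_command` are hypothetical and do not reflect the actual contents of cluster.json.

```python
import json

# Hypothetical cluster.json contents; values and rule name are illustrative only.
cluster_json = """
{
    "__default__": {"partition": "general", "n": 1, "mem": "4G"},
    "star_align": {"n": 16, "mem": "40G"}
}
"""

# Same shape as the --cluster string, with the "cluster." prefix dropped
# so that plain str.format can fill it.
SBATCH_TEMPLATE = "sbatch -p {partition} -n {n} --mem {mem}"


def submit_command(cluster_config, rule_name):
    """Overlay a rule's settings on __default__ and fill the sbatch template."""
    settings = dict(cluster_config.get("__default__", {}))
    settings.update(cluster_config.get(rule_name, {}))
    return SBATCH_TEMPLATE.format(**settings)


config = json.loads(cluster_json)
print(submit_command(config, "star_align"))       # rule-specific n and mem, default partition
print(submit_command(config, "some_other_rule"))  # falls back entirely to __default__
```

Under these assumptions, a rule without its own entry is submitted with the default resources, so only the memory- or CPU-hungry rules need explicit entries.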
