Snippy_Variants: Calculate % reads aligned #616

Merged on Oct 3, 2024 (8 commits)
@@ -218,6 +218,8 @@ For all cases:
| snippy_tree_snippy_docker | String | Docker file used for Snippy in the Snippy_Tree subworkflow |
| snippy_tree_snippy_version | String | Version of Snippy_Tree subworkflow used |
| snippy_variants_outdir_tarball | Array[File] | A compressed file containing the whole directory of snippy output files. This is used when running Snippy_Tree |
| snippy_variants_percent_reads_aligned | Float | Percentage of reads aligned to the reference genome |
| snippy_variants_percent_ref_coverage | Float | Percentage of the reference genome covered by reads with a depth greater than or equal to the `min_coverage` threshold (default is 10). |
| snippy_variants_snippy_docker | Array[String] | Docker file used for Snippy in the Snippy_Variants subworkflow |
| snippy_variants_snippy_version | Array[String] | Version of Snippy used in the Snippy_Variants subworkflow |
| snippy_wg_snp_matrix | File | CSV file of whole genome pairwise SNP distances between samples, calculated from the final alignment |
@@ -77,6 +77,7 @@ The `Snippy_Variants` workflow aligns single-end or paired-end reads (in FASTQ f
| snippy_variants_num_reads_aligned | Int | Number of reads that aligned to the reference genome, as calculated by the `samtools view -c` command |
| snippy_variants_num_variants | Int | Number of variants detected between sample and reference genome |
| snippy_variants_outdir_tarball | File | A compressed file containing the whole directory of snippy output files. This is used when running Snippy_Tree |
| snippy_variants_percent_reads_aligned | Float | Percentage of reads aligned to the reference genome |
| snippy_variants_percent_ref_coverage | Float | Percentage of the reference genome covered by reads with a depth greater than or equal to the `min_coverage` threshold (default is 10). |
| snippy_variants_query | String | Query strings specified by the user when running the workflow |
| snippy_variants_query_check | String | Verification that query strings are found in the reference genome |
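
The `snippy_variants_percent_reads_aligned` value described in the table is the ratio of aligned reads to total reads, expressed as a percentage. A minimal sketch of that arithmetic, assuming a hypothetical helper named `percent_aligned` (it is not part of the workflow) that takes the two counts as arguments and guards against an empty BAM:

```shell
# Hypothetical helper, not part of the workflow: prints the percentage of
# aligned reads given the aligned-read count ($1) and the total-read count ($2).
percent_aligned() {
  if [ "$2" -eq 0 ]; then
    # With zero total reads the percentage is undefined; print a placeholder
    # and return a non-zero status so callers can detect the case.
    echo "NA"
    return 1
  fi
  # awk performs the floating-point division that POSIX shell arithmetic lacks.
  awk -v aligned="$1" -v total="$2" 'BEGIN { print (aligned / total) * 100 }'
}

percent_aligned 50 200   # prints 25
```

Passing the counts to `awk` with `-v` avoids relying on word splitting of unquoted shell variables; the `NA` placeholder is an illustrative choice, not the value the task writes.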
14 changes: 12 additions & 2 deletions tasks/gene_typing/variant_detection/task_snippy_variants.wdl
@@ -61,8 +61,8 @@ task snippy_variants {
# Compress output dir
tar -cvzf "./~{samplename}_snippy_variants_outdir.tar" "./~{samplename}"

# compute number of reads aligned to reference
samtools view -c "~{samplename}/~{samplename}.bam" > READS_ALIGNED_TO_REFERENCE
# compute number of reads aligned to reference (excluding unmapped reads)
samtools view -c -F 4 "~{samplename}/~{samplename}.bam" > READS_ALIGNED_TO_REFERENCE

# create coverage stats file
samtools coverage "~{samplename}/~{samplename}.bam" -o "~{samplename}/~{samplename}_coverage.tsv"
@@ -93,6 +93,15 @@ task snippy_variants {
echo $reference_length_passed_depth $reference_length | awk '{ print ($1/$2)*100 }' > PERCENT_REF_COVERAGE
fi

# Compute percentage of reads aligned
reads_aligned=$(cat READS_ALIGNED_TO_REFERENCE)
total_reads=$(samtools view -c "~{samplename}/~{samplename}.bam")
if [ "$total_reads" -eq 0 ]; then
echo "Could not compute percent reads aligned: total reads is 0" > PERCENT_READS_ALIGNED
else
echo $reads_aligned $total_reads | awk '{ print ($1/$2)*100 }' > PERCENT_READS_ALIGNED
fi

>>>
output {
String snippy_variants_version = read_string("VERSION")
@@ -111,6 +120,7 @@ task snippy_variants {
String snippy_variants_ref_length = read_string("REFERENCE_LENGTH")
String snippy_variants_ref_length_passed_depth = read_string("REFERENCE_LENGTH_PASSED_DEPTH")
String snippy_variants_percent_ref_coverage = read_string("PERCENT_REF_COVERAGE")
String snippy_variants_percent_reads_aligned = read_string("PERCENT_READS_ALIGNED")
}
runtime {
docker: "~{docker}"
3 changes: 3 additions & 0 deletions workflows/phylogenetics/wf_snippy_streamline.wdl
@@ -83,6 +83,9 @@ workflow snippy_streamline {
Array[File] snippy_variants_outdir_tarball = snippy_variants_wf.snippy_variants_outdir_tarball
Array[String] snippy_variants_snippy_version = snippy_variants_wf.snippy_variants_version
Array[String] snippy_variants_snippy_docker = snippy_variants_wf.snippy_variants_docker
Array[Float] snippy_variants_percent_reads_aligned = snippy_variants_wf.snippy_variants_percent_reads_aligned
Array[Float] snippy_variants_percent_ref_coverage = snippy_variants_wf.snippy_variants_percent_ref_coverage


### snippy_tree wf outputs ###
String snippy_tree_snippy_version = snippy_tree_wf.snippy_tree_snippy_version
1 change: 1 addition & 0 deletions workflows/standalone_modules/wf_snippy_variants.wdl
@@ -70,6 +70,7 @@ workflow snippy_variants_wf {
File snippy_variants_coverage_tsv = snippy_variants.snippy_variants_coverage_tsv
Int snippy_variants_num_variants = snippy_variants.snippy_variants_num_variants
Float snippy_variants_percent_ref_coverage = snippy_variants.snippy_variants_percent_ref_coverage
Float snippy_variants_percent_reads_aligned = snippy_variants.snippy_variants_percent_reads_aligned
# snippy gene query outputs
String? snippy_variants_query = snippy_gene_query.snippy_variants_query
String? snippy_variants_query_check = snippy_gene_query.snippy_variants_query_check
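
For readers unfamiliar with the `-F 4` option added to the `samtools view -c` call in `task_snippy_variants.wdl`: `-F` excludes records whose SAM FLAG has any of the given bits set, and bit 4 (0x4) marks a read as unmapped. The sketch below imitates that test on a made-up list of FLAG values; the helper name `passes_F4` and the sample flags are assumptions for illustration only, and no BAM file or samtools installation is involved.

```shell
# Hypothetical helper: succeeds when the FLAG value in $1 does NOT have the
# 0x4 (read unmapped) bit set, i.e. the record would be kept by `-F 4`.
passes_F4() {
  [ $(( $1 & 4 )) -eq 0 ]
}

# Illustrative FLAG values: 0, 16 and 99 describe mapped reads,
# while 4, 77 and 141 all carry the unmapped bit.
count=0
for flag in 0 16 4 99 77 141; do
  if passes_F4 "$flag"; then
    count=$((count + 1))
  fi
done
echo "$count"   # prints 3
```

Counting with and without this filter gives the numerator and denominator used for the percent-aligned calculation in the task.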