From 2fd9f757103eb8166e6e317c4129e0201240adfb Mon Sep 17 00:00:00 2001
From: Curtis Kapsak
Date: Fri, 8 Nov 2024 14:36:18 -0500
Subject: [PATCH 1/2] [TheiaCoV/TheiaProk/TheiaMeta/TheiaEuk/Freyja_FASTQ] `fastq-scan` updates & improvements. Adding JSON as wf output file (#662)

* lots of updates to fastq_scan_pe task: upgrade to fastq-scan 1.0.1; reduce disk to 50gb and cpu to 1; added set -euo pipefail; removed capture of date; added debug statements to cleanup STDOUT/logs; removed unnecessary cat commands with parsing output JSON; renamed 2 output files; enabled preemptible VM usage

* similar changes also made to fastq_scan_se task: updated docker image; reduced disk to 50gb and cpu to 1; added set -euo pipefail; added DEBUG statements and cleaned up STDOUT for clear log files; renamed outputs to mention json; removed collection and output of DATE; enabled preemptible VMs

* added fastq-scan JSON outputs to read_qc_trim_pe subwf and theiacov_illumina_pe wf

* update CI. also removed 'defaults' from conda channels used to install/setup CI env. hopefully that doesn't break everything

* update CI for theiaprok_illumina_pe. should pass now

* add 2 fastq-scan JSON outputs to theiacov_illumina_SE wf and read_qc_trim_se subwf

* add fastq-scan-json outputs to freyja_fastq wf

* update theiacov and theiaprok SE CI

* added 2 fastq-scan JSON outputs to theiacov_clearlabs wf; added 4 fastq-scan JSON outputs to theiameta_illumina_pe wf; updated read_qc_trim_ont subworkflow description since it was inaccurate

* update theiacov_clearlabs CI

* added fastq-scan JSON outputs to export_taxon_tables task and added as inputs to theiaprok_illumina_pe and se workflows. need to test in terra

* add fastq-scan json outputs to theiaprok_illumina PE and SE wfs

* added 4 fastq-scan JSON outputs to theiaeuk_illumina_pe wf

* export_taxon_tables task: added set -euo pipefail and removed bug causing typo

* update CI

* update docs with new JSON outputs for freyja, theiacov wfs, theiaeuk, theiameta, and theiaprok wfs

* update CI for snippy_variants task
---
 .../genomic_characterization/freyja.md        |  4 ++
 .../genomic_characterization/theiacov.md      |  4 ++
 .../genomic_characterization/theiaeuk.md      |  4 ++
 .../genomic_characterization/theiameta.md     |  4 ++
 .../genomic_characterization/theiaprok.md     |  4 ++
 .../basic_statistics/task_fastq_scan.wdl      | 65 ++++++++++++-------
 .../data_export/task_broad_terra_tools.wdl    | 13 +++-
 tests/config/environment.yml                  |  1 -
 .../theiacov/test_wf_theiacov_clearlabs.yml   | 10 ++-
 .../theiacov/test_wf_theiacov_illumina_pe.yml |  5 +-
 .../theiacov/test_wf_theiacov_illumina_se.yml |  5 +-
 .../test_wf_theiaprok_illumina_pe.yml         | 16 ++---
 .../test_wf_theiaprok_illumina_se.yml         | 18 +++--
 workflows/freyja/wf_freyja_fastq.wdl          |  4 ++
 workflows/theiacov/wf_theiacov_clearlabs.wdl  |  2 +
 .../theiacov/wf_theiacov_illumina_pe.wdl      |  4 ++
 .../theiacov/wf_theiacov_illumina_se.wdl      |  2 +
 .../theiaeuk/wf_theiaeuk_illumina_pe.wdl      |  4 ++
 .../theiameta/wf_theiameta_illumina_pe.wdl    |  4 ++
 .../theiaprok/wf_theiaprok_illumina_pe.wdl    |  8 +++
 .../theiaprok/wf_theiaprok_illumina_se.wdl    |  4 ++
 workflows/utilities/wf_read_QC_trim_ont.wdl   |  2 +-
 workflows/utilities/wf_read_QC_trim_pe.wdl    |  4 ++
 workflows/utilities/wf_read_QC_trim_se.wdl    |  2 +
 24 files changed, 134 insertions(+), 59 deletions(-)

diff --git a/docs/workflows/genomic_characterization/freyja.md b/docs/workflows/genomic_characterization/freyja.md
index b3e2f4b6f..f93428521 100644
--- a/docs/workflows/genomic_characterization/freyja.md
+++ b/docs/workflows/genomic_characterization/freyja.md
@@ -327,12 +327,16 @@ The main output file used in subsequent Freyja workflows is found under the `fre
 | bwa_version | String | Version of BWA used to map read data to the reference genome | PE, SE |
 | fastp_html_report | File | The HTML report made with fastp | PE, SE |
 | fastp_version | String | Version of fastp software used | PE, SE |
+| fastq_scan_clean1_json | File | JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length | PE, SE |
+| fastq_scan_clean2_json | File | JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length | PE |
 | fastq_scan_num_reads_clean_pairs | String | Number of clean read pairs | PE |
 | fastq_scan_num_reads_clean1 | Int | Number of clean forward reads | PE, SE |
 | fastq_scan_num_reads_clean2 | Int | Number of clean reverse reads | PE |
 | fastq_scan_num_reads_raw_pairs | String | Number of raw read pairs | PE |
 | fastq_scan_num_reads_raw1 | Int | Number of raw forward reads | PE, SE |
 | fastq_scan_num_reads_raw2 | Int | Number of raw reverse reads | PE |
+| fastq_scan_raw1_json | File | JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length | PE, SE |
+| fastq_scan_raw2_json | File | JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length | PE |
 | fastq_scan_version | String | Version of fastq_scan used for read QC analysis | PE, SE |
 | fastqc_clean1_html | File | Graphical visualization of clean forward read quality from fastqc to open in an internet browser | PE, SE |
 | fastqc_clean2_html | File | Graphical visualization of clean reverse read quality from fastqc to open in an internet browser | PE |
diff --git a/docs/workflows/genomic_characterization/theiacov.md b/docs/workflows/genomic_characterization/theiacov.md
index ba7eac3c4..ffe0993f6 100644
--- a/docs/workflows/genomic_characterization/theiacov.md
+++ b/docs/workflows/genomic_characterization/theiacov.md
@@ -1026,6 +1026,8 @@ All TheiaCoV Workflows (not TheiaCoV_FASTA_Batch)
 | est_percent_gene_coverage_tsv | File | Percent coverage for each gene in the organism being analyzed (depending on the organism input) | CL, ONT, PE, SE |
 | fastp_html_report | File | HTML report for fastp | PE, SE |
 | fastp_version | String | Fastp version used | PE, SE |
+| fastq_scan_clean1_json | File | JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length | PE, SE, CL |
+| fastq_scan_clean2_json | File | JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length | PE |
 | fastq_scan_num_reads_clean_pairs | String | Number of paired reads after filtering as determined by fastq_scan | PE |
 | fastq_scan_num_reads_clean1 | Int | Number of forward reads after filtering as determined by fastq_scan | CL, PE, SE |
 | fastq_scan_num_reads_clean2 | Int | Number of reverse reads after filtering as determined by fastq_scan | PE |
@@ -1036,6 +1038,8 @@ All TheiaCoV Workflows (not TheiaCoV_FASTA_Batch)
 | fastq_scan_r1_mean_q_raw | Float | Forward read mean quality value before quality trimming and adapter removal | |
 | fastq_scan_r1_mean_readlength_clean | Float | Forward read mean read length value after quality trimming and adapter removal | |
 | fastq_scan_r1_mean_readlength_raw | Float | Forward read mean read length value before quality trimming and adapter removal | |
+| fastq_scan_raw1_json | File | JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length | PE, SE, CL |
+| fastq_scan_raw2_json | File | JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length | PE |
 | fastq_scan_version | String | Version of fastq_scan used for read QC analysis | CL, PE, SE |
 | fastqc_clean1_html | File | Graphical visualization of clean forward read quality from fastqc to open in an internet browser | PE, SE |
 | fastqc_clean2_html | File | Graphical visualization of clean reverse read quality from fastqc to open in an internet browser | PE |
diff --git a/docs/workflows/genomic_characterization/theiaeuk.md b/docs/workflows/genomic_characterization/theiaeuk.md
index 19141cd05..bedeac0cf 100644
--- a/docs/workflows/genomic_characterization/theiaeuk.md
+++ b/docs/workflows/genomic_characterization/theiaeuk.md
@@ -484,6 +484,10 @@ The TheiaEuk workflow automatically activates taxa-specific tasks after identifi
 | cg_pipeline_report | File | TSV file of read metrics from raw reads, including average read length, number of reads, and estimated genome coverage |
 | est_coverage_clean | Float | Estimated coverage calculated from clean reads and genome length |
 | est_coverage_raw | Float | Estimated coverage calculated from raw reads and genome length |
+| fastq_scan_clean1_json | File | JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length |
+| fastq_scan_clean2_json | File | JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length |
+| fastq_scan_raw1_json | File | JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length |
+| fastq_scan_raw2_json | File | JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length |
 | r1_mean_q_clean | Float | Mean quality score of clean forward reads |
 | r1_mean_q_raw | Float | Mean quality score of raw forward reads |
 | r2_mean_q_clean | Float | Mean quality score of clean reverse reads |
diff --git a/docs/workflows/genomic_characterization/theiameta.md b/docs/workflows/genomic_characterization/theiameta.md
index 55c26d9a6..6e9147399 100644
--- a/docs/workflows/genomic_characterization/theiameta.md
+++ b/docs/workflows/genomic_characterization/theiameta.md
@@ -295,12 +295,16 @@ The TheiaMeta_Illumina_PE workflow processes Illumina paired-end (PE) reads ge
 | fastp_html_report | File | Report file for fastp in HTML format |
 | fastp_version | String | Version of fastp used |
 | fastq_scan_docker | String | Docker image of fastq_scan |
+| fastq_scan_clean1_json | File | JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length |
+| fastq_scan_clean2_json | File | JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length |
 | fastq_scan_num_reads_clean_pairs | String | Number of read pairs after cleaning as calculated by fastq_scan |
 | fastq_scan_num_reads_clean1 | Int | Number of forward reads after cleaning as calculated by fastq_scan |
 | fastq_scan_num_reads_clean2 | Int | Number of reverse reads after cleaning as calculated by fastq_scan |
 | fastq_scan_num_reads_raw_pairs | String | Number of input read pairs as calculated by fastq_scan |
 | fastq_scan_num_reads_raw1 | Int | Number of input forward reads as calculated by fastq_scan |
 | fastq_scan_num_reads_raw2 | Int | Number of input reserve reads as calculated by fastq_scan |
+| fastq_scan_raw1_json | File | JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length |
+| fastq_scan_raw2_json | File | JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length |
 | fastq_scan_version | String | fastq_scan version |
 | fastqc_clean1_html | File | Graphical visualization of clean forward read quality from fastqc to open in an internet browser |
 | fastqc_clean2_html | File | Graphical visualization of clean reverse read quality from fastqc to open in an internet browser |
diff --git a/docs/workflows/genomic_characterization/theiaprok.md b/docs/workflows/genomic_characterization/theiaprok.md
index 2b5f5308d..6664df6df 100644
--- a/docs/workflows/genomic_characterization/theiaprok.md
+++ b/docs/workflows/genomic_characterization/theiaprok.md
@@ -1731,12 +1731,16 @@ The TheiaProk workflows automatically activate taxa-specific sub-workflows after
 | est_coverage_raw | Float | Estimated coverage calculated from raw reads and genome length | ONT, PE, SE |
 | fastp_html_report | File | The HTML report made with fastp | PE, SE |
 | fastp_version | String | Version of fastp software used | PE, SE |
+| fastq_scan_clean1_json | File | JSON file output from `fastq-scan` containing summary stats about clean forward read quality and length | PE, SE |
+| fastq_scan_clean2_json | File | JSON file output from `fastq-scan` containing summary stats about clean reverse read quality and length | PE |
 | fastq_scan_num_reads_clean_pairs | String | Number of read pairs after cleaning as calculated by fastq_scan | PE |
 | fastq_scan_num_reads_clean1 | Int | Number of forward reads after cleaning as calculated by fastq_scan | PE, SE |
 | fastq_scan_num_reads_clean2 | Int | Number of reverse reads after cleaning as calculated by fastq_scan | PE |
 | fastq_scan_num_reads_raw_pairs | String | Number of input read pairs calculated by fastq_scan | PE |
 | fastq_scan_num_reads_raw1 | Int | Number of input forward reads calculated by fastq_scan | PE, SE |
 | fastq_scan_num_reads_raw2 | Int | Number of input reverse reads calculated by fastq_scan | PE |
+| fastq_scan_raw1_json | File | JSON file output from `fastq-scan` containing summary stats about raw forward read quality and length | PE, SE |
+| fastq_scan_raw2_json | File | JSON file output from `fastq-scan` containing summary stats about raw reverse read quality and length | PE |
 | fastq_scan_version | String | Version of fastq-scan software used | PE, SE |
 | fastqc_clean1_html | File | Graphical visualization of clean forward read quality from fastqc to open in an internet browser | PE, SE |
 | fastqc_clean2_html | File | Graphical visualization of clean reverse read quality from fastqc to open in an internet browser | PE |
diff --git a/tasks/quality_control/basic_statistics/task_fastq_scan.wdl b/tasks/quality_control/basic_statistics/task_fastq_scan.wdl
index 029b94917..e2f4a2d4d 100644
--- a/tasks/quality_control/basic_statistics/task_fastq_scan.wdl
+++ b/tasks/quality_control/basic_statistics/task_fastq_scan.wdl
@@ -6,14 +6,16 @@ task fastq_scan_pe {
     File read2
     String read1_name = basename(basename(basename(read1, ".gz"), ".fastq"), ".fq")
     String read2_name = basename(basename(basename(read2, ".gz"), ".fastq"), ".fq")
-    Int disk_size = 100
-    String docker = "quay.io/biocontainers/fastq-scan:0.4.4--h7d875b9_1"
+    Int disk_size = 50
+    String docker = "us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3"
     Int memory = 2
-    Int cpu = 2
+    Int cpu = 1
   }
   command <<<
-    # capture date and version
-    date | tee DATE
+    # exit task in case anything fails in one-liners or variables are unset
+    set -euo pipefail
+
+    # capture version
     fastq-scan -v | tee VERSION
 
     # set cat command based on compression
@@ -24,11 +26,21 @@ task fastq_scan_pe {
     fi
 
     # capture forward read stats
+    echo "DEBUG: running fastq-scan on $(basename ~{read1})"
     eval "${cat_reads} ~{read1}" | fastq-scan | tee ~{read1_name}_fastq-scan.json
-    cat ~{read1_name}_fastq-scan.json | jq .qc_stats.read_total | tee READ1_SEQS
+    # using simple redirect so STDOUT is not confusing
+    jq .qc_stats.read_total ~{read1_name}_fastq-scan.json > READ1_SEQS
+    echo "DEBUG: number of reads in $(basename ~{read1}): $(cat READ1_SEQS)"
     read1_seqs=$(cat READ1_SEQS)
+    echo
+
+    # capture reverse read stats
+    echo "DEBUG: running fastq-scan on $(basename ~{read2})"
     eval "${cat_reads} ~{read2}" | fastq-scan | tee ~{read2_name}_fastq-scan.json
-    cat ~{read2_name}_fastq-scan.json | jq .qc_stats.read_total | tee READ2_SEQS
+
+    # using simple redirect so STDOUT is not confusing
+    jq .qc_stats.read_total ~{read2_name}_fastq-scan.json > READ2_SEQS
+    echo "DEBUG: number of reads in $(basename ~{read2}): $(cat READ2_SEQS)"
     read2_seqs=$(cat READ2_SEQS)
 
     # capture number of read pairs
@@ -37,17 +49,18 @@ task fastq_scan_pe {
     else
       read_pairs="Uneven pairs: R1=${read1_seqs}, R2=${read2_seqs}"
     fi
-
-    echo $read_pairs | tee READ_PAIRS
+
+    # use simple redirect so STDOUT is not confusing
+    echo "$read_pairs" > READ_PAIRS
+    echo "DEBUG: number of read pairs: $(cat READ_PAIRS)"
   >>>
   output {
-    File read1_fastq_scan_report = "~{read1_name}_fastq-scan.json"
-    File read2_fastq_scan_report = "~{read2_name}_fastq-scan.json"
+    File read1_fastq_scan_json = "~{read1_name}_fastq-scan.json"
+    File read2_fastq_scan_json = "~{read2_name}_fastq-scan.json"
     Int read1_seq = read_int("READ1_SEQS")
     Int read2_seq = read_int("READ2_SEQS")
     String read_pairs = read_string("READ_PAIRS")
     String version = read_string("VERSION")
-    String pipeline_date = read_string("DATE")
     String fastq_scan_docker = docker
   }
   runtime {
@@ -55,8 +68,8 @@ task fastq_scan_pe {
     memory: memory + " GB"
     cpu: cpu
     disks: "local-disk " + disk_size + " SSD"
-    disk: disk_size + " GB" # TES
-    preemptible: 0
+    disk: disk_size + " GB"
+    preemptible: 1
     maxRetries: 3
   }
 }
@@ -65,14 +78,16 @@ task fastq_scan_se {
   input {
     File read1
     String read1_name = basename(basename(basename(read1, ".gz"), ".fastq"), ".fq")
-    Int disk_size = 100
+    Int disk_size = 50
     Int memory = 2
-    Int cpu = 2
-    String docker = "quay.io/biocontainers/fastq-scan:0.4.4--h7d875b9_1"
+    Int cpu = 1
+    String docker = "us-docker.pkg.dev/general-theiagen/biocontainers/fastq-scan:1.0.1--h4ac6f70_3"
   }
   command <<<
-    # capture date and version
-    date | tee DATE
+    # exit task in case anything fails in one-liners or variables are unset
+    set -euo pipefail
+
+    # capture version
     fastq-scan -v | tee VERSION
 
     # set cat command based on compression
@@ -83,14 +98,16 @@ task fastq_scan_se {
     fi
 
     # capture forward read stats
+    echo "DEBUG: running fastq-scan on $(basename ~{read1})"
     eval "${cat_reads} ~{read1}" | fastq-scan | tee ~{read1_name}_fastq-scan.json
-    cat ~{read1_name}_fastq-scan.json | jq .qc_stats.read_total | tee READ1_SEQS
+    # using simple redirect so STDOUT is not confusing
+    jq .qc_stats.read_total ~{read1_name}_fastq-scan.json > READ1_SEQS
+    echo "DEBUG: number of reads in $(basename ~{read1}): $(cat READ1_SEQS)"
   >>>
   output {
-    File fastq_scan_report = "~{read1_name}_fastq-scan.json"
+    File fastq_scan_json = "~{read1_name}_fastq-scan.json"
     Int read1_seq = read_int("READ1_SEQS")
     String version = read_string("VERSION")
-    String pipeline_date = read_string("DATE")
     String fastq_scan_docker = docker
   }
   runtime {
@@ -98,8 +115,8 @@ task fastq_scan_se {
     memory: memory + " GB"
     cpu: cpu
     disks: "local-disk " + disk_size + " SSD"
-    disk: disk_size + " GB" # TES
-    preemptible: 0
+    disk: disk_size + " GB"
+    preemptible: 1
     maxRetries: 3
   }
 }
diff --git a/tasks/utilities/data_export/task_broad_terra_tools.wdl b/tasks/utilities/data_export/task_broad_terra_tools.wdl
index 3a3fba0fd..8a54d6bad 100644
--- a/tasks/utilities/data_export/task_broad_terra_tools.wdl
+++ b/tasks/utilities/data_export/task_broad_terra_tools.wdl
@@ -35,6 +35,10 @@ task export_taxon_tables {
     Int? num_reads_raw2
     String? num_reads_raw_pairs
     String? fastq_scan_version
+    File? fastq_scan_raw1_json
+    File? fastq_scan_raw2_json
+    File? fastq_scan_clean1_json
+    File? fastq_scan_clean2_json
     Int? num_reads_clean1
     Int? num_reads_clean2
     String? num_reads_clean_pairs
@@ -390,7 +394,8 @@ task export_taxon_tables {
     volatile: true
   }
   command <<<
-
+    set -euo pipefail
+
     # capture taxon and corresponding table names from input taxon_tables
     taxon_array=($(cut -f1 ~{taxon_tables} | tail +2))
     echo "Taxon array: ${taxon_array[*]}"
@@ -446,6 +451,10 @@ task export_taxon_tables {
         "num_reads_raw2": "~{num_reads_raw2}",
         "num_reads_raw_pairs": "~{num_reads_raw_pairs}",
         "fastq_scan_version": "~{fastq_scan_version}",
+        "fastq_scan_raw1_json": "~{fastq_scan_raw1_json}",
+        "fastq_scan_raw2_json": "~{fastq_scan_raw2_json}",
+        "fastq_scan_clean1_json": "~{fastq_scan_clean1_json}",
+        "fastq_scan_clean2_json": "~{fastq_scan_clean2_json}",
         "num_reads_clean1": "~{num_reads_clean1}",
         "num_reads_clean2": "~{num_reads_clean2}",
         "num_reads_clean_pairs": "~{num_reads_clean_pairs}",
@@ -778,7 +787,7 @@ task export_taxon_tables {
         "agrvate_version": "~{agrvate_version}",
         "agrvate_docker": "~{agrvate_docker}",
         "srst2_vibrio_detailed_tsv": "~{srst2_vibrio_detailed_tsv}",
-        "srst2_vibrio_version": "~{srst2_vibrio_version}",~
+        "srst2_vibrio_version": "~{srst2_vibrio_version}",
         "srst2_vibrio_docker": "~{srst2_vibrio_docker}",
         "srst2_vibrio_database": "~{srst2_vibrio_database}",
         "srst2_vibrio_ctxA": "~{srst2_vibrio_ctxA}",
diff --git a/tests/config/environment.yml b/tests/config/environment.yml
index 0aed07151..c4016d3ae 100644
--- a/tests/config/environment.yml
+++ b/tests/config/environment.yml
@@ -2,7 +2,6 @@ name: pytest-env-CI
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
   - python >=3.7
   - cromwell=86
diff --git a/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml b/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
index fe20263a2..a7108f67f 100644
--- a/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
+++ b/tests/workflows/theiacov/test_wf_theiacov_clearlabs.yml
@@ -115,17 +115,16 @@
     - path: miniwdl_run/call-fastq_scan_clean_reads/inputs.json
       contains: ["read1", "clearlabs"]
     - path: miniwdl_run/call-fastq_scan_clean_reads/outputs.json
-      contains: ["fastq_scan_se", "pipeline_date", "read1_seq"]
+      contains: ["fastq_scan_se", "read1_seq"]
     - path: miniwdl_run/call-fastq_scan_clean_reads/stderr.txt
     - path: miniwdl_run/call-fastq_scan_clean_reads/stderr.txt.offset
     - path: miniwdl_run/call-fastq_scan_clean_reads/stdout.txt
     - path: miniwdl_run/call-fastq_scan_clean_reads/task.log
       contains: ["wdl", "theiacov_clearlabs", "fastq_scan_clean_reads", "done"]
-    - path: miniwdl_run/call-fastq_scan_clean_reads/work/DATE
     - path: miniwdl_run/call-fastq_scan_clean_reads/work/READ1_SEQS
       md5sum: 097e79b36919c8377c56088363e3d8b7
     - path: miniwdl_run/call-fastq_scan_clean_reads/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-fastq_scan_clean_reads/work/_miniwdl_inputs/0/clearlabs_R1_dehosted.fastq.gz
     - path: miniwdl_run/call-fastq_scan_clean_reads/work/clearlabs_R1_dehosted_fastq-scan.json
       md5sum: 869dd2e934c600bba35f30f08e2da7c9
@@ -134,17 +133,16 @@
     - path: miniwdl_run/call-fastq_scan_raw_reads/inputs.json
       contains: ["read1", "clearlabs"]
     - path: miniwdl_run/call-fastq_scan_raw_reads/outputs.json
-      contains: ["fastq_scan_se", "pipeline_date", "read1_seq"]
+      contains: ["fastq_scan_se", "read1_seq"]
     - path: miniwdl_run/call-fastq_scan_raw_reads/stderr.txt
     - path: miniwdl_run/call-fastq_scan_raw_reads/stderr.txt.offset
     - path: miniwdl_run/call-fastq_scan_raw_reads/stdout.txt
     - path: miniwdl_run/call-fastq_scan_raw_reads/task.log
       contains: ["wdl", "theiacov_clearlabs", "fastq_scan_raw_reads", "done"]
-    - path: miniwdl_run/call-fastq_scan_raw_reads/work/DATE
     - path: miniwdl_run/call-fastq_scan_raw_reads/work/READ1_SEQS
       md5sum: 097e79b36919c8377c56088363e3d8b7
     - path: miniwdl_run/call-fastq_scan_raw_reads/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-fastq_scan_raw_reads/work/_miniwdl_inputs/0/clearlabs.fastq.gz
     - path: miniwdl_run/call-fastq_scan_raw_reads/work/clearlabs_fastq-scan.json
       md5sum: 869dd2e934c600bba35f30f08e2da7c9
diff --git a/tests/workflows/theiacov/test_wf_theiacov_illumina_pe.yml b/tests/workflows/theiacov/test_wf_theiacov_illumina_pe.yml
index dfef9c994..1ff0b33b8 100644
--- a/tests/workflows/theiacov/test_wf_theiacov_illumina_pe.yml
+++ b/tests/workflows/theiacov/test_wf_theiacov_illumina_pe.yml
@@ -60,11 +60,11 @@
       md5sum: d41d8cd98f00b204e9800998ecf8427e
     # fastq scan raw
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/command
-      md5sum: 9b2cc0107f1a90972482d7b3a658d242
+      md5sum: 56bcc1ba5d2a9c94f4704fc4b8e6b7ba
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/inputs.json
       contains: ["read1", "read2", "illumina_pe"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/outputs.json
-      contains: ["fastq_scan_pe", "pipeline_date", "read1_seq", "read2_seq"]
+      contains: ["fastq_scan_pe", "read1_seq", "read2_seq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stdout.txt
@@ -74,7 +74,6 @@
       md5sum: 2a77387b247176aa5fcc9aed228699c9
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/SRR13687078_2_fastq-scan.json
       md5sum: d0eebdd4e14cf0a0b371fee1338474c9
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ1_SEQS
       md5sum: 4e4a08422dbf7001fd09ad5126e13b44
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ2_SEQS
diff --git a/tests/workflows/theiacov/test_wf_theiacov_illumina_se.yml b/tests/workflows/theiacov/test_wf_theiacov_illumina_se.yml
index 22453bdd5..99668f641 100644
--- a/tests/workflows/theiacov/test_wf_theiacov_illumina_se.yml
+++ b/tests/workflows/theiacov/test_wf_theiacov_illumina_se.yml
@@ -56,11 +56,11 @@
       md5sum: d41d8cd98f00b204e9800998ecf8427e
     # fastq scan raw
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/command
-      md5sum: 56f66a4ef82d3ae03c17db6a26f59528
+      md5sum: f96c3103490fff3560fc930a84bd459d
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/inputs.json
       contains: ["read1", "illumina_se"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/outputs.json
-      contains: ["fastq_scan_se", "pipeline_date", "read1_seq"]
+      contains: ["fastq_scan_se", "read1_seq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stdout.txt
@@ -68,7 +68,6 @@
       contains: ["wdl", "theiacov_illumina_se", "fastq_scan_raw", "done"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/ERR6319327_fastq-scan.json
       md5sum: 66b2f7c60b74de654f590d77bdd2231e
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ1_SEQS
       md5sum: 87f1a9ed69127009aa0c173cd74c9d31
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/VERSION
diff --git a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml
index 71f5bd4a2..b33428777 100644
--- a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml
+++ b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml
@@ -416,14 +416,13 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/inputs.json
       contains: ["read", "fastq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/outputs.json
-      contains: ["read", "fastq", "fastq_scan_report"]
+      contains: ["read", "fastq", "fastq_scan_json"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stdout.txt
       contains: ["fastq", "qc_stats", "read_lengths"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/task.log
       contains: ["wdl", "theiaprok_illumina_pe", "fastq_scan_clean", "done"]
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/READ1_SEQS
       md5sum: 5fcafec683df465a99878ceaffe8a294
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/READ2_SEQS
@@ -431,7 +430,7 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/READ_PAIRS
       md5sum: 5fcafec683df465a99878ceaffe8a294
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/_miniwdl_inputs/0/test_1.clean.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/_miniwdl_inputs/0/test_2.clean.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/test_1.clean_fastq-scan.json
@@ -443,14 +442,13 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/inputs.json
       contains: ["read", "fastq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/outputs.json
-      contains: ["read", "fastq", "fastq_scan_report"]
+      contains: ["read", "fastq", "fastq_scan_json"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stdout.txt
       contains: ["fastq", "qc_stats", "read_lengths"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/task.log
       contains: ["wdl", "theiaprok_illumina_pe", "fastq_scan_raw", "done"]
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ1_SEQS
       md5sum: 75fa2f47fecb5dec8d244366881e76ec
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ2_SEQS
@@ -462,7 +460,7 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/SRR2838702_R2_fastq-scan.json
       md5sum: e81f34050c11995771de79182f06d793
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/_miniwdl_inputs/0/SRR2838702_R1.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/_miniwdl_inputs/0/SRR2838702_R2.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-trimmomatic_pe/command
@@ -561,7 +559,7 @@
     - path: miniwdl_run/wdl/tasks/gene_typing/drug_resistance/task_resfinder.wdl
       md5sum: 27528633723303b462d095b642649453
     - path: miniwdl_run/wdl/tasks/gene_typing/variant_detection/task_snippy_variants.wdl
-      md5sum: 3b9e04569d7e856dcc649b7726b306b7
+      md5sum: 440a620a10ccdafe612f0b33ef05f86d
     - path: miniwdl_run/wdl/tasks/quality_control/read_filtering/task_bbduk.wdl
       md5sum: aec6ef024d6dff31723f44290f6b9cf5
     - path: miniwdl_run/wdl/tasks/quality_control/advanced_metrics/task_busco.wdl
@@ -629,9 +627,9 @@
     - path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl
       md5sum: 64caaaff5910ac0036e2659434500962
     - path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl
-      md5sum: 4d69a6539b68503af9f3f1c2787ff920
+      md5sum: 8c97c5bd65e2787239f12ef425d479ae
     - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl
-      md5sum: 3cb5c86b15e931b0c0b98ed784386438
+      md5sum: d8db687487a45536d4837a540ed2a135
     - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
       md5sum: ea5cff6eff8c2c42046cf2eae6f16b6f
     - path: miniwdl_run/wdl/workflows/utilities/wf_read_QC_trim_pe.wdl
diff --git a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml
index 88584182b..60a1a2fa0 100644
--- a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml
+++ b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml
@@ -400,18 +400,17 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/inputs.json
       contains: ["read", "fastq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/outputs.json
-      contains: ["read", "fastq", "fastq_scan_report"]
+      contains: ["read", "fastq", "fastq_scan_json"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/stdout.txt
       contains: ["fastq", "qc_stats", "read_lengths"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/task.log
       contains: ["wdl", "theiaprok_illumina_se", "fastq_scan_clean", "done"]
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/READ1_SEQS
       md5sum: 499f7af0d267a13f5523ec9a60ec46e3
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/_miniwdl_inputs/0/test_1.clean.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_clean/work/test_1.clean_fastq-scan.json
       md5sum: eb30273b3f19578fec5360da8b255e28
@@ -420,20 +419,19 @@
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/inputs.json
       contains: ["read", "fastq"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/outputs.json
-      contains: ["read", "fastq", "fastq_scan_report"]
+      contains: ["read", "fastq", "fastq_scan_json"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stderr.txt.offset
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/stdout.txt
       contains: ["fastq", "qc_stats", "read_lengths"]
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/task.log
       contains: ["wdl", "theiaprok_illumina_se", "fastq_scan_raw", "done"]
-    - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/DATE
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/READ1_SEQS
       md5sum: 75fa2f47fecb5dec8d244366881e76ec
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/SRR2838702_R1_fastq-scan.json
       md5sum: c4a64c8fd27fa357206e0d41b74866e2
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/VERSION
-      md5sum: 8e4e9cdfbacc9021a3175ccbbbde002b
+      md5sum: a59bb42644e35c09b8fa8087156fa4c2
     - path: miniwdl_run/call-read_QC_trim/call-fastq_scan_raw/work/_miniwdl_inputs/0/SRR2838702_R1.fastq.gz
     - path: miniwdl_run/call-read_QC_trim/call-trimmomatic_se/command
       md5sum: a317f1a2182fe1a3b26812b54eff088e
@@ -526,7 +524,7 @@
     - path: miniwdl_run/wdl/tasks/gene_typing/drug_resistance/task_resfinder.wdl
       md5sum: 27528633723303b462d095b642649453
     - path: miniwdl_run/wdl/tasks/gene_typing/variant_detection/task_snippy_variants.wdl
-      md5sum: 3b9e04569d7e856dcc649b7726b306b7
+      md5sum: 440a620a10ccdafe612f0b33ef05f86d
     - path: miniwdl_run/wdl/tasks/quality_control/read_filtering/task_bbduk.wdl
       md5sum: aec6ef024d6dff31723f44290f6b9cf5
     - path: miniwdl_run/wdl/tasks/quality_control/advanced_metrics/task_busco.wdl
@@ -592,12 +590,12 @@
     - path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl
       md5sum: 64caaaff5910ac0036e2659434500962
     - path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl
-      md5sum: 4d69a6539b68503af9f3f1c2787ff920
+      md5sum: 8c97c5bd65e2787239f12ef425d479ae
     - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_se.wdl
-      md5sum: fdb66b59ac886501a4ae90a25cefd633
+      md5sum: 4111a758490174325ae8ea52a95319e9
     - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
       md5sum: ea5cff6eff8c2c42046cf2eae6f16b6f
     - path: miniwdl_run/wdl/workflows/utilities/wf_read_QC_trim_se.wdl
-      md5sum: d11bfe33fdd96eab28892be5a01c1c7d
+      md5sum: a7ef5a7a38dd60ff2edf699ae6808ebb
     - path: miniwdl_run/workflow.log
       contains: ["wdl", "theiaprok_illumina_se", "NOTICE", "done"]
diff --git
a/workflows/freyja/wf_freyja_fastq.wdl b/workflows/freyja/wf_freyja_fastq.wdl index 9a4446bf8..7b46a204c 100644 --- a/workflows/freyja/wf_freyja_fastq.wdl +++ b/workflows/freyja/wf_freyja_fastq.wdl @@ -116,10 +116,14 @@ workflow freyja_fastq { String fastq_scan_num_reads_raw1 = select_first([read_QC_trim_pe.fastq_scan_raw1, read_QC_trim_se.fastq_scan_raw1, ""]) Int? fastq_scan_num_reads_raw2 = read_QC_trim_pe.fastq_scan_raw2 String? fastq_scan_num_reads_raw_pairs = read_QC_trim_pe.fastq_scan_raw_pairs + String fastq_scan_raw1_json = select_first([read_QC_trim_pe.fastq_scan_raw1_json, read_QC_trim_se.fastq_scan_raw1_json, ""]) + File? fastq_scan_raw2_json = read_QC_trim_pe.fastq_scan_raw2_json String fastq_scan_version = select_first([read_QC_trim_pe.fastq_scan_version, read_QC_trim_se.fastq_scan_version, ""]) String fastq_scan_num_reads_clean1 = select_first([read_QC_trim_pe.fastq_scan_clean1, read_QC_trim_se.fastq_scan_clean1, ""]) Int? fastq_scan_num_reads_clean2 = read_QC_trim_pe.fastq_scan_clean2 String? fastq_scan_num_reads_clean_pairs = read_QC_trim_pe.fastq_scan_clean_pairs + String fastq_scan_clean1_json = select_first([read_QC_trim_pe.fastq_scan_clean1_json, read_QC_trim_se.fastq_scan_clean1_json, ""]) + File? fastq_scan_clean2_json = read_QC_trim_pe.fastq_scan_clean2_json # Read QC - fastqc outputs - Illumina PE and SE String fastqc_num_reads_raw1 = select_first([read_QC_trim_pe.fastqc_raw1, read_QC_trim_se.fastqc_raw1, ""]) Int? 
fastqc_num_reads_raw2 = read_QC_trim_pe.fastqc_raw2 diff --git a/workflows/theiacov/wf_theiacov_clearlabs.wdl b/workflows/theiacov/wf_theiacov_clearlabs.wdl index 0368bef95..5774e02f7 100644 --- a/workflows/theiacov/wf_theiacov_clearlabs.wdl +++ b/workflows/theiacov/wf_theiacov_clearlabs.wdl @@ -171,6 +171,8 @@ workflow theiacov_clearlabs { Int fastq_scan_num_reads_raw1 = fastq_scan_raw_reads.read1_seq Int fastq_scan_num_reads_clean1 = fastq_scan_clean_reads.read1_seq String fastq_scan_version = fastq_scan_raw_reads.version + File fastq_scan_raw1_json = fastq_scan_raw_reads.fastq_scan_json + File fastq_scan_clean1_json = fastq_scan_clean_reads.fastq_scan_json # Read QC - kraken outputs String kraken_version = kraken2_raw.version Float kraken_human = kraken2_raw.percent_human diff --git a/workflows/theiacov/wf_theiacov_illumina_pe.wdl b/workflows/theiacov/wf_theiacov_illumina_pe.wdl index 5f5e5b651..29585659e 100644 --- a/workflows/theiacov/wf_theiacov_illumina_pe.wdl +++ b/workflows/theiacov/wf_theiacov_illumina_pe.wdl @@ -260,6 +260,10 @@ workflow theiacov_illumina_pe { Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 Int? fastq_scan_num_reads_clean2 = read_QC_trim.fastq_scan_clean2 String? fastq_scan_num_reads_clean_pairs = read_QC_trim.fastq_scan_clean_pairs + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_raw2_json = read_QC_trim.fastq_scan_raw2_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json + File? fastq_scan_clean2_json = read_QC_trim.fastq_scan_clean2_json # Read QC - fastqc outputs Int? fastqc_num_reads_raw1 = read_QC_trim.fastqc_raw1 Int? 
fastqc_num_reads_raw2 = read_QC_trim.fastqc_raw2 diff --git a/workflows/theiacov/wf_theiacov_illumina_se.wdl b/workflows/theiacov/wf_theiacov_illumina_se.wdl index 4ae59f5dc..0de516664 100644 --- a/workflows/theiacov/wf_theiacov_illumina_se.wdl +++ b/workflows/theiacov/wf_theiacov_illumina_se.wdl @@ -215,6 +215,8 @@ workflow theiacov_illumina_se { Int? fastq_scan_num_reads_raw1 = read_QC_trim.fastq_scan_raw1 String? fastq_scan_version = read_QC_trim.fastq_scan_version Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json # Read QC - fastqc outputs Int? fastqc_num_reads_raw1 = read_QC_trim.fastqc_raw1 Int? fastqc_num_reads_clean1 = read_QC_trim.fastqc_clean1 diff --git a/workflows/theiaeuk/wf_theiaeuk_illumina_pe.wdl b/workflows/theiaeuk/wf_theiaeuk_illumina_pe.wdl index 67c5c5464..3e792345d 100644 --- a/workflows/theiaeuk/wf_theiaeuk_illumina_pe.wdl +++ b/workflows/theiaeuk/wf_theiaeuk_illumina_pe.wdl @@ -208,6 +208,10 @@ workflow theiaeuk_illumina_pe { Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 Int? fastq_scan_num_reads_clean2 = read_QC_trim.fastq_scan_clean2 String? fastq_scan_num_reads_clean_pairs = read_QC_trim.fastq_scan_clean_pairs + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_raw2_json = read_QC_trim.fastq_scan_raw2_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json + File? fastq_scan_clean2_json = read_QC_trim.fastq_scan_clean2_json # Read QC - trimmomatic outputs String? trimmomatic_version = read_QC_trim.trimmomatic_version String? 
trimmomatic_docker = read_QC_trim.trimmomatic_docker diff --git a/workflows/theiameta/wf_theiameta_illumina_pe.wdl b/workflows/theiameta/wf_theiameta_illumina_pe.wdl index 51f1a0054..2a6a23488 100644 --- a/workflows/theiameta/wf_theiameta_illumina_pe.wdl +++ b/workflows/theiameta/wf_theiameta_illumina_pe.wdl @@ -207,6 +207,10 @@ workflow theiameta_illumina_pe { Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 Int? fastq_scan_num_reads_clean2 = read_QC_trim.fastq_scan_clean2 String? fastq_scan_num_reads_clean_pairs = read_QC_trim.fastq_scan_clean_pairs + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_raw2_json = read_QC_trim.fastq_scan_raw2_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json + File? fastq_scan_clean2_json = read_QC_trim.fastq_scan_clean2_json # Read QC - fastqc outputs Int? fastqc_num_reads_raw1 = read_QC_trim.fastqc_raw1 Int? fastqc_num_reads_raw2 = read_QC_trim.fastqc_raw2 diff --git a/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl b/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl index d71c5e324..32271224e 100644 --- a/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl +++ b/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl @@ -277,6 +277,10 @@ workflow theiaprok_illumina_pe { num_reads_clean1 = read_QC_trim.fastq_scan_clean1, num_reads_clean2 = read_QC_trim.fastq_scan_clean2, num_reads_clean_pairs = read_QC_trim.fastq_scan_clean_pairs, + fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json, + fastq_scan_raw2_json = read_QC_trim.fastq_scan_raw2_json, + fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json, + fastq_scan_clean2_json = read_QC_trim.fastq_scan_clean2_json, trimmomatic_version = read_QC_trim.trimmomatic_version, fastp_version = read_QC_trim.fastp_version, bbduk_docker = read_QC_trim.bbduk_docker, @@ -615,6 +619,10 @@ workflow theiaprok_illumina_pe { Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 Int? 
fastq_scan_num_reads_clean2 = read_QC_trim.fastq_scan_clean2 String? fastq_scan_num_reads_clean_pairs = read_QC_trim.fastq_scan_clean_pairs + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_raw2_json = read_QC_trim.fastq_scan_raw2_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json + File? fastq_scan_clean2_json = read_QC_trim.fastq_scan_clean2_json # Read QC - fastqc outputs Int? fastqc_num_reads_raw1 = read_QC_trim.fastqc_raw1 Int? fastqc_num_reads_raw2 = read_QC_trim.fastqc_raw2 diff --git a/workflows/theiaprok/wf_theiaprok_illumina_se.wdl b/workflows/theiaprok/wf_theiaprok_illumina_se.wdl index 1c3eee081..e743ecbce 100644 --- a/workflows/theiaprok/wf_theiaprok_illumina_se.wdl +++ b/workflows/theiaprok/wf_theiaprok_illumina_se.wdl @@ -254,6 +254,8 @@ workflow theiaprok_illumina_se { num_reads_raw1 = read_QC_trim.fastq_scan_raw1, fastq_scan_version = read_QC_trim.fastq_scan_version, num_reads_clean1 = read_QC_trim.fastq_scan_clean1, + fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json, + fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json, trimmomatic_version = read_QC_trim.trimmomatic_version, fastp_version = read_QC_trim.fastp_version, bbduk_docker = read_QC_trim.bbduk_docker, @@ -571,6 +573,8 @@ workflow theiaprok_illumina_se { Int? fastq_scan_num_reads_raw1 = read_QC_trim.fastq_scan_raw1 String? fastq_scan_version = read_QC_trim.fastq_scan_version Int? fastq_scan_num_reads_clean1 = read_QC_trim.fastq_scan_clean1 + File? fastq_scan_raw1_json = read_QC_trim.fastq_scan_raw1_json + File? fastq_scan_clean1_json = read_QC_trim.fastq_scan_clean1_json # Read QC - fastqc outputs Int? fastqc_num_reads_raw1 = read_QC_trim.fastqc_raw1 Int? 
fastqc_num_reads_clean1 = read_QC_trim.fastqc_clean1 diff --git a/workflows/utilities/wf_read_QC_trim_ont.wdl b/workflows/utilities/wf_read_QC_trim_ont.wdl index c03141251..5b84562aa 100644 --- a/workflows/utilities/wf_read_QC_trim_ont.wdl +++ b/workflows/utilities/wf_read_QC_trim_ont.wdl @@ -9,7 +9,7 @@ import "../../tasks/utilities/task_rasusa.wdl" as rasusa_task workflow read_QC_trim_ont { meta { - description: "Runs basic QC on Oxford Nanopore (ONT) reads with (1) fastq_scan, (2) nanoplot, (3) rasusa downsampling, (4) tiptoft plasmid detection, and (5) nanoq filtering" + description: "Runs basic QC on Oxford Nanopore (ONT) reads with nanoplot, rasusa downsampling, tiptoft plasmid detection, and nanoq filtering" } input { String samplename diff --git a/workflows/utilities/wf_read_QC_trim_pe.wdl b/workflows/utilities/wf_read_QC_trim_pe.wdl index bb79c260d..0d6090036 100644 --- a/workflows/utilities/wf_read_QC_trim_pe.wdl +++ b/workflows/utilities/wf_read_QC_trim_pe.wdl @@ -174,6 +174,10 @@ workflow read_QC_trim_pe { String? fastq_scan_clean_pairs = fastq_scan_clean.read_pairs String? fastq_scan_version = fastq_scan_raw.version String? fastq_scan_docker = fastq_scan_raw.fastq_scan_docker + File? fastq_scan_raw1_json = fastq_scan_raw.read1_fastq_scan_json + File? fastq_scan_raw2_json = fastq_scan_raw.read2_fastq_scan_json + File? fastq_scan_clean1_json = fastq_scan_clean.read1_fastq_scan_json + File? fastq_scan_clean2_json = fastq_scan_clean.read2_fastq_scan_json # fastqc Int? fastqc_raw1 = fastqc_raw.read1_seq diff --git a/workflows/utilities/wf_read_QC_trim_se.wdl b/workflows/utilities/wf_read_QC_trim_se.wdl index d652014ce..af147b512 100644 --- a/workflows/utilities/wf_read_QC_trim_se.wdl +++ b/workflows/utilities/wf_read_QC_trim_se.wdl @@ -149,6 +149,8 @@ workflow read_QC_trim_se { Int? fastq_scan_clean1 = fastq_scan_clean.read1_seq String? fastq_scan_version = fastq_scan_raw.version String? fastq_scan_docker = fastq_scan_raw.fastq_scan_docker + File? 
fastq_scan_raw1_json = fastq_scan_raw.fastq_scan_json + File? fastq_scan_clean1_json = fastq_scan_clean.fastq_scan_json # fastqc Int? fastqc_raw1 = fastqc_raw.read1_seq From 2669f994a90dc2c6d6703eef14cf3c639a7bfc1c Mon Sep 17 00:00:00 2001 From: Sage Wright Date: Fri, 8 Nov 2024 15:35:08 -0500 Subject: [PATCH 2/2] Prevent Silent Errors (#666) * tada * two more * update --- tasks/utilities/data_export/task_export_two_tsvs.wdl | 1 + tasks/utilities/data_handling/task_summarize_data.wdl | 2 ++ tasks/utilities/data_handling/task_theiacov_fasta_batch.wdl | 2 ++ tasks/utilities/data_import/task_create_terra_table.wdl | 4 ++++ tasks/utilities/file_handling/task_transfer_files.wdl | 2 ++ tasks/utilities/submission/task_submission.wdl | 2 ++ tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml | 2 +- tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml | 2 +- 8 files changed, 15 insertions(+), 2 deletions(-) diff --git a/tasks/utilities/data_export/task_export_two_tsvs.wdl b/tasks/utilities/data_export/task_export_two_tsvs.wdl index d3707441f..4410e29a8 100644 --- a/tasks/utilities/data_export/task_export_two_tsvs.wdl +++ b/tasks/utilities/data_export/task_export_two_tsvs.wdl @@ -18,6 +18,7 @@ task export_two_tsvs { volatile: true } command <<< + set -euo pipefail python3 /scripts/export_large_tsv/export_large_tsv.py --project ~{terra_project1} --workspace ~{terra_workspace1} --entity_type ~{datatable1} --tsv_filename "~{datatable1}_table1.tsv" # check if second project is provided; if not, use first diff --git a/tasks/utilities/data_handling/task_summarize_data.wdl b/tasks/utilities/data_handling/task_summarize_data.wdl index 40586fbf3..5e5f64468 100644 --- a/tasks/utilities/data_handling/task_summarize_data.wdl +++ b/tasks/utilities/data_handling/task_summarize_data.wdl @@ -23,6 +23,8 @@ task summarize_data { volatile: true } command <<< + set -euo pipefail + # when running on terra, comment out all input_table mentions python3 
/scripts/export_large_tsv/export_large_tsv.py --project "~{terra_project}" --workspace "~{terra_workspace}" --entity_type ~{terra_table} --tsv_filename ~{terra_table}-data.tsv diff --git a/tasks/utilities/data_handling/task_theiacov_fasta_batch.wdl b/tasks/utilities/data_handling/task_theiacov_fasta_batch.wdl index 5ab9247ad..4eb101b2e 100644 --- a/tasks/utilities/data_handling/task_theiacov_fasta_batch.wdl +++ b/tasks/utilities/data_handling/task_theiacov_fasta_batch.wdl @@ -28,6 +28,8 @@ task sm_theiacov_fasta_wrangling { # the sm stands for supermassive Int memory = 4 } command <<< + set -euo pipefail + # check if nextclade json file exists if [ -f ~{nextclade_json} ]; then # this line splits into individual json files diff --git a/tasks/utilities/data_import/task_create_terra_table.wdl b/tasks/utilities/data_import/task_create_terra_table.wdl index 638052ab0..22f95453a 100644 --- a/tasks/utilities/data_import/task_create_terra_table.wdl +++ b/tasks/utilities/data_import/task_create_terra_table.wdl @@ -146,6 +146,10 @@ task create_terra_table { done >> output { diff --git a/tasks/utilities/file_handling/task_transfer_files.wdl b/tasks/utilities/file_handling/task_transfer_files.wdl index 28cfbebb9..1115df119 100644 --- a/tasks/utilities/file_handling/task_transfer_files.wdl +++ b/tasks/utilities/file_handling/task_transfer_files.wdl @@ -14,6 +14,8 @@ task transfer_files { volatile: true } command <<< + set -euo pipefail + file_path_array="~{sep=' ' files_to_transfer}" gsutil -m cp -n ${file_path_array[@]} ~{target_bucket} diff --git a/tasks/utilities/submission/task_submission.wdl b/tasks/utilities/submission/task_submission.wdl index 694b4f0e8..ab384c86b 100644 --- a/tasks/utilities/submission/task_submission.wdl +++ b/tasks/utilities/submission/task_submission.wdl @@ -23,6 +23,8 @@ task prune_table { volatile: true } command <<< + set -euo pipefail + # when running on terra, comment out all input_table mentions python3 
/scripts/export_large_tsv/export_large_tsv.py --project "~{project_name}" --workspace "~{workspace_name}" --entity_type ~{table_name} --tsv_filename ~{table_name}-data.tsv diff --git a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml index b33428777..aad099a4e 100644 --- a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml +++ b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_pe.yml @@ -627,7 +627,7 @@ - path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl md5sum: 64caaaff5910ac0036e2659434500962 - path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl - md5sum: 8c97c5bd65e2787239f12ef425d479ae + md5sum: 850ad97598aca5c28eb36e6a5c13c2fc - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl md5sum: d8db687487a45536d4837a540ed2a135 - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl diff --git a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml index 60a1a2fa0..6a7e2a86a 100644 --- a/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml +++ b/tests/workflows/theiaprok/test_wf_theiaprok_illumina_se.yml @@ -590,7 +590,7 @@ - path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl md5sum: 64caaaff5910ac0036e2659434500962 - path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl - md5sum: 8c97c5bd65e2787239f12ef425d479ae + md5sum: 850ad97598aca5c28eb36e6a5c13c2fc - path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_se.wdl md5sum: 4111a758490174325ae8ea52a95319e9 - path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl