[Tiptoft] Deprecate Tiptoft #739

Merged: 6 commits, Feb 5, 2025
11 changes: 0 additions & 11 deletions docs/workflows/genomic_characterization/theiacov.md
@@ -371,17 +371,6 @@ All TheiaCoV Workflows (not TheiaCoV_FASTA_Batch)
| read_QC_trim | **rasusa_memory** | Int | Internal component, do not modify | 8 | Optional | ONT | |
| read_QC_trim | **rasusa_number_of_reads** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **rasusa_seed** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_cpu** | Int | Internal component, do not modify | 2 | Optional | ONT | |
| read_QC_trim | **tiptoft_disk_size** | Int | Internal component, do not modify | 100 | Optional | ONT | |
| read_QC_trim | **tiptoft_docker** | String | Internal component, do not modify | "us-docker.pkg.dev/general-theiagen/staphb/tiptoft:1.0.2" | Optional | ONT | |
| read_QC_trim | **tiptoft_kmer_size** | String | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_margin** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_max_gap** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_memory** | Int | Internal component, do not modify | 8 | Optional | ONT | |
| read_QC_trim | **tiptoft_min_block_size** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_min_fasta_hits** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC-trim | **tiptoft_min_kmers_for_onex_pass** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **tiptoft_min_perc_coverage** | Int | Internal component, do not modify | | Optional | ONT | |
| read_QC_trim | **read_processing** | String | The name of the tool to perform basic read processing; options: "trimmomatic" or "fastp" | trimmomatic | Optional | PE, SE | |
| read_QC_trim | **read_qc** | String | The tool used for quality control (QC) of reads. Options are fastq_scan and fastqc | fastq_scan | Optional | PE, SE | HIV, MPXV, WNV, flu, rsv_a, rsv_b, sars-cov-2 |
| read_QC_trim | **target_organism** | String | Organism to search for in Kraken | | Optional | PE, SE | HIV, MPXV, WNV, flu, rsv_a, rsv_b, sars-cov-2 |
25 changes: 4 additions & 21 deletions docs/workflows/genomic_characterization/theiaprok.md
@@ -279,9 +279,6 @@ All input reads are processed through "[core tasks](#core-tasks-performed-for-al
| export_taxon_tables | **theiaprok_illumina_se_version** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, ONT, PE |
| export_taxon_tables | **theiaprok_ont_analysis_date** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, PE, SE |
| export_taxon_tables | **theiaprok_ont_version** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, PE, SE |
| export_taxon_tables | **tiptoft_plasmid_replicon_fastq** | File | Internal component, do not modify | | Do not modify, Optional | FASTA, PE, SE |
| export_taxon_tables | **tiptoft_plasmid_replicon_genes** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, PE, SE |
| export_taxon_tables | **tiptoft_version** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, PE, SE |
| export_taxon_tables | **trimmomatic_version** | String | Internal component, do not modify | | Do not modify, Optional | FASTA, ONT |
| gambit | **cpu** | Int | Number of CPUs to allocate to the task | 8 | Optional | FASTA, ONT, PE, SE |
| gambit | **disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | FASTA, ONT, PE, SE |
@@ -571,17 +568,6 @@ All input reads are processed through "[core tasks](#core-tasks-performed-for-al
| read_QC_trim | **read_qc** | String | Allows the user to decide between fastq_scan (default) and fastqc for the evaluation of read quality. | fastq_scan | Optional | PE, SE |
| read_QC_trim | **run_prefix** | String | Internal component, do not modify | | Do not modify, Optional | ONT |
| read_QC_trim | **target_organism** | String | This string is searched for in the kraken2 outputs to extract the read percentage | | Optional | ONT, PE, SE |
| read_QC_trim | **tiptoft_cpu** | Int | Number of CPUs to allocate to the task | 2 | Optional | ONT |
| read_QC_trim | **tiptoft_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional | ONT |
| read_QC_trim | **tiptoft_docker** | String | The Docker container to use for the task | "us-docker.pkg.dev/general-theiagen/staphb/tiptoft:1.0.2" | Optional | ONT |
| read_QC_trim | **tiptoft_kmer_size** | String | The kmer size | | Optional | ONT |
| read_QC_trim | **tiptoft_margin** | Int | Flanking region around a block to use for mapping | | Optional | ONT |
| read_QC_trim | **tiptoft_max_gap** | Int | Maximum gap for blocks to be contiguous, measured in multiples of the kmer size | | Optional | ONT |
| read_QC_trim | **tiptoft_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional | ONT |
| read_QC_trim | **tiptoft_min_block_size** | Int | Minimum block size in bases | | Optional | ONT |
| read_QC_trim | **tiptoft_min_fasta_hits** | Int | Minimum number of kmers matching a read | | Optional | ONT |
| read_QC-trim | **tiptoft_min_kmers_for_onex_pass** | Int | Minimum number of kmers matching a read in 1st pass | | Optional | ONT |
| read_QC_trim | **tiptoft_min_perc_coverage** | Int | Minimum percentage coverage of typing sequence to report | | Optional | ONT |
| read_QC_trim | **trimmomatic_args** | String | Additional arguments to pass to trimmomatic. "-phred33" specifies the Phred Q score encoding which is almost always phred33 with modern sequence data. | -phred33 | Optional | PE, SE |
| resfinder_task | **acquired** | Boolean | Set to true to tell ResFinder to identify acquired resistance genes | TRUE | Optional | FASTA, ONT, PE, SE |
| resfinder_task | **call_pointfinder** | Boolean | Set to true to enable detection of point mutations. | FALSE | Optional | FASTA, ONT, PE, SE |
@@ -838,7 +824,7 @@ All input reads are processed through "[core tasks](#core-tasks-performed-for-al

**Read subsampling:** Samples are automatically randomly subsampled to 150X coverage using `RASUSA`.

**Plasmid prediction:** `tiptoft` is used to predict plasmid sequences directly from uncorrected long-read data. Plasmids are identified using replicon sequences used for typing from [PlasmidFinder](https://cge.food.dtu.dk/services/PlasmidFinder/).
**Plasmid prediction:** Plasmids are identified using replicon sequences used for typing from [PlasmidFinder](https://cge.food.dtu.dk/services/PlasmidFinder/).

**Read filtering:** Reads are filtered by length and quality using `nanoq`. By default, sequences with less than 500 basepairs and quality score lower than 10 are filtered out to improve assembly accuracy.

@@ -849,9 +835,9 @@ All input reads are processed through "[core tasks](#core-tasks-performed-for-al
| Workflow | **TheiaProk_ONT** |
| --- | --- |
| Sub-workflow | [wf_read_QC_trim_ont.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/workflows/utilities/wf_read_QC_trim_ont.wdl) |
| Tasks | [task_nanoplot.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/basic_statistics/task_nanoplot.wdl) [task_fastq_scan.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/basic_statistics/task_fastq_scan.wdl) [task_rasusa.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/utilities/task_rasusa.wdl) [task_nanoq.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/read_filtering/task_nanoq.wdl) [task_tiptoft.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/gene_typing/plasmid_detection/task_tiptoft.wdl) |
| Software Source Code | [fastq-scan](https://github.com/rpetit3/fastq-scan), [NanoPlot](https://github.com/wdecoster/NanoPlot), [RASUSA](https://github.com/mbhall88/rasusa), [tiptoft](https://github.com/andrewjpage/tiptoft), [nanoq](https://github.com/esteinig/nanoq) |
| Original Publication(s) | [NanoPlot paper](https://academic.oup.com/bioinformatics/article/39/5/btad311/7160911)<br>[RASUSA paper](https://doi.org/10.21105/joss.03941)<br>[Nanoq Paper](https://doi.org/10.21105/joss.02991)<br>[Tiptoft paper](https://doi.org/10.21105/joss.01021) |
| Tasks | [task_nanoplot.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/basic_statistics/task_nanoplot.wdl) [task_fastq_scan.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/basic_statistics/task_fastq_scan.wdl) [task_rasusa.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/utilities/task_rasusa.wdl) [task_nanoq.wdl](https://github.com/theiagen/public_health_bioinformatics/blob/main/tasks/quality_control/read_filtering/task_nanoq.wdl) |
| Software Source Code | [fastq-scan](https://github.com/rpetit3/fastq-scan), [NanoPlot](https://github.com/wdecoster/NanoPlot), [RASUSA](https://github.com/mbhall88/rasusa), [nanoq](https://github.com/esteinig/nanoq) |
| Original Publication(s) | [NanoPlot paper](https://academic.oup.com/bioinformatics/article/39/5/btad311/7160911)<br>[RASUSA paper](https://doi.org/10.21105/joss.03941)<br>[Nanoq Paper](https://doi.org/10.21105/joss.02991)<br> |

??? task "`dragonflye`: _De novo_ Assembly"
!!! techdetails "dragonflye Technical Details"
@@ -2103,9 +2089,6 @@ The TheiaProk workflows automatically activate taxa-specific sub-workflows after
| theiaprok_illumina_se_version | String | Version of TheiaProk SE workflow execution | SE |
| theiaprok_ont_analysis_date | String | Date of TheiaProk ONT workflow execution | ONT |
| theiaprok_ont_version | String | Version of TheiaProk ONT workflow execution | ONT |
| tiptoft_plasmid_replicon_fastq | File | File produced by tiptoft that contains reads containing plasmid rep/inc genes | ONT |
| tiptoft_plasmid_replicon_genes | String | Rep/inc genes found in sample | ONT |
| tiptoft_version | String | Version of tiptoft used for analysis | ONT |
| trimmomatic_docker | String | Docker image used for trimmomatic | PE, SE |
| trimmomatic_version | String | Version of trimmomatic used | PE, SE |
| ts_mlst_allelic_profile | String | Profile of MLST loci and allele numbers predicted by MLST | FASTA, ONT, PE, SE |
64 changes: 0 additions & 64 deletions tasks/gene_typing/plasmid_detection/task_tiptoft.wdl

This file was deleted.

6 changes: 0 additions & 6 deletions tasks/utilities/data_export/task_broad_terra_tools.wdl
@@ -81,9 +81,6 @@ task export_taxon_tables {
Float? nanoplot_r1_median_q_clean
Float? nanoplot_r1_est_coverage_clean
String? rasusa_version
File? tiptoft_plasmid_replicon_fastq
String? tiptoft_plasmid_replicon_genes
String? tiptoft_version
File? assembly_fasta
File? contigs_gfa
String? dragonflye_version
@@ -497,9 +494,6 @@ task export_taxon_tables {
"nanoplot_r1_median_q_clean": "~{nanoplot_r1_median_q_clean}",
"nanoplot_r1_est_coverage_clean": "~{nanoplot_r1_est_coverage_clean}",
"rasusa_version": "~{rasusa_version}",
"tiptoft_plasmid_replicon_fastq": "~{tiptoft_plasmid_replicon_fastq}",
"tiptoft_plasmid_replicon_genes": "~{tiptoft_plasmid_replicon_genes}",
"tiptoft_version": "~{tiptoft_version}",
"assembly_fasta": "~{assembly_fasta}",
"contigs_gfa": "~{contigs_gfa}",
"dragonflye_version": "~{dragonflye_version}",
@@ -512,7 +512,7 @@
- path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl
md5sum: 64caaaff5910ac0036e2659434500962
- path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl
md5sum: 8c97c5bd65e2787239f12ef425d479ae
md5sum: 59e18911ba07c16e01df38abe0e70477
- path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_pe.wdl
md5sum: 9b8e2da62c8572a369c786a9bbc3a36e
- path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
@@ -483,7 +483,7 @@
- path: miniwdl_run/wdl/tasks/taxon_id/contamination/task_midas.wdl
md5sum: 64caaaff5910ac0036e2659434500962
- path: miniwdl_run/wdl/tasks/utilities/data_export/task_broad_terra_tools.wdl
md5sum: 8c97c5bd65e2787239f12ef425d479ae
md5sum: 59e18911ba07c16e01df38abe0e70477
- path: miniwdl_run/wdl/workflows/theiaprok/wf_theiaprok_illumina_se.wdl
md5sum: 02dc0075bf28d557d7b81aa2dc61feab
- path: miniwdl_run/wdl/workflows/utilities/wf_merlin_magic.wdl
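
The two `md5sum` changes in these test manifests follow from the edit to `task_broad_terra_tools.wdl`: each manifest pins a checksum of the WDL file as copied into the miniwdl run directory, so any change to the task file requires a new value. A minimal sketch of recomputing such a checksum is shown here, assuming plain Python; `file_md5` is a hypothetical helper name, and the project's own procedure for refreshing these values is not shown in this diff.

```python
import hashlib
from pathlib import Path


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as handle:
        # iter() with a sentinel stops once read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `file_md5("tasks/utilities/data_export/task_broad_terra_tools.wdl")` from a checkout of the merged branch would be expected to reproduce the new value recorded in the manifest, provided the file is copied into the run directory byte-for-byte.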
7 changes: 0 additions & 7 deletions workflows/theiaprok/wf_theiaprok_ont.wdl
@@ -276,9 +276,6 @@ workflow theiaprok_ont {
nanoplot_r1_median_q_clean = nanoplot_clean.median_q,
nanoplot_r1_est_coverage_clean = nanoplot_clean.est_coverage,
rasusa_version = read_qc_trim.rasusa_version,
tiptoft_plasmid_replicon_fastq = read_qc_trim.tiptoft_plasmid_replicon_fastq,
tiptoft_plasmid_replicon_genes = read_qc_trim.tiptoft_plasmid_replicon_genes,
tiptoft_version = read_qc_trim.tiptoft_version,
assembly_fasta = dragonflye.assembly_fasta,
contigs_gfa = dragonflye.contigs_gfa,
dragonflye_version = dragonflye.dragonflye_version,
@@ -589,10 +586,6 @@ workflow theiaprok_ont {
String? kraken_docker = read_qc_trim.kraken_docker
# Read QC - rasusa outputs
String? rasusa_version = read_qc_trim.rasusa_version
# Read QC - tiptoft outputs
File? tiptoft_plasmid_replicon_fastq = read_qc_trim.tiptoft_plasmid_replicon_fastq
String? tiptoft_plasmid_replicon_genes = read_qc_trim.tiptoft_plasmid_replicon_genes
String? tiptoft_version = read_qc_trim.tiptoft_version
# Assembly - dragonflye outputs
File? assembly_fasta = dragonflye.assembly_fasta
File? contigs_gfa = dragonflye.contigs_gfa
39 changes: 1 addition & 38 deletions workflows/utilities/wf_read_QC_trim_ont.wdl
@@ -1,6 +1,5 @@
version 1.0

import "../../tasks/gene_typing/plasmid_detection/task_tiptoft.wdl" as tiptoft_task
import "../../tasks/quality_control/read_filtering/task_artic_guppyplex.wdl" as artic_guppyplex
import "../../tasks/quality_control/read_filtering/task_nanoq.wdl" as nanoq_task
import "../../tasks/quality_control/read_filtering/task_ncbi_scrub.wdl" as ncbi_scrub
@@ -9,7 +8,7 @@ import "../../tasks/utilities/task_rasusa.wdl" as rasusa_task

workflow read_QC_trim_ont {
meta {
description: "Runs basic QC on Oxford Nanopore (ONT) reads with nanoplot, rasusa downsampling, tiptoft plasmid detection, and nanoq filtering"
description: "Runs basic QC on Oxford Nanopore (ONT) reads with nanoplot, rasusa downsampling, and nanoq filtering"
}
input {
String samplename
@@ -59,19 +58,6 @@ workflow read_QC_trim_ont {
Float? rasusa_fraction_of_reads
Int? rasusa_number_of_reads

# tiptoft inputs
Int? tiptoft_cpu
Int? tiptoft_disk_size
String? tiptoft_docker
Int? tiptoft_memory
Int? tiptoft_kmer_size
Int? tiptoft_max_gap
Int? tiptoft_margin
Int? tiptoft_min_block_size
Int? tiptoft_min_fasta_hits
Int? tiptoft_min_kmers_for_onex_pass
Int? tiptoft_min_perc_coverage

# nanoq inputs
Int? nanoq_cpu
Int? nanoq_disk_size
@@ -179,23 +165,6 @@ workflow read_QC_trim_ont {
seed = rasusa_seed

}
# tiptoft for plasmid detection
call tiptoft_task.tiptoft {
input:
read1 = read1,
samplename = samplename,
cpu = tiptoft_cpu,
disk_size = tiptoft_disk_size,
docker = tiptoft_docker,
kmer_size = tiptoft_kmer_size,
margin = tiptoft_margin,
max_gap = tiptoft_max_gap,
memory = tiptoft_memory,
min_block_size = tiptoft_min_block_size,
min_fasta_hits = tiptoft_min_fasta_hits,
min_kmers_for_onex_pass = tiptoft_min_kmers_for_onex_pass,
min_perc_coverage = tiptoft_min_perc_coverage
}
# nanoq/filtlong (default min length 500)
call nanoq_task.nanoq {
input:
@@ -239,11 +208,5 @@ workflow read_QC_trim_ont {

# rasusa outputs
String? rasusa_version = rasusa.rasusa_version

# tiptoft outputs
File? tiptoft_plasmid_replicon_fastq = tiptoft.tiptoft_plasmid_replicon_fastq
File? tiptoft_result_tsv = tiptoft.tiptoft_tsv
String? tiptoft_plasmid_replicon_genes = tiptoft.plasmid_replicon_genes
String? tiptoft_version = tiptoft.tiptoft_version
}
}
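
Because this change removes a task together with its import, call, inputs, outputs, and documentation rows, a quick scan for leftover references can help confirm that nothing still points at the deleted `task_tiptoft.wdl`. The sketch below is only an illustration under stated assumptions: `find_leftover_references` is a hypothetical helper, the suffix list is a guess at the relevant file types, and it is not part of this PR or of the repository's tooling.

```python
from pathlib import Path


def find_leftover_references(root, needle="tiptoft", suffixes=(".wdl", ".md", ".yml")):
    """Return (relative_path, line_number, line) for each case-insensitive hit of `needle`."""
    root = Path(root)
    target = needle.lower()
    hits = []
    for path in sorted(root.rglob("*")):
        # Only inspect regular files with one of the chosen suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if target in line.lower():
                hits.append((path.relative_to(root).as_posix(), number, line.strip()))
    return hits
```

Running `find_leftover_references(".")` from the repository root after the merge should return an empty list if the removal is complete. Any hit (for example a stale `tiptoft_version` column in a test fixture) would need separate review, since some mentions, such as changelog entries, may be intentional.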