[Kraken2] Split database from Kraken2 TheiaCoV task #608

Closed
wants to merge 56 commits into from
56 commits
fd3ed46
split kraken database and tool, use standalone task with default data…
jrotieno Aug 28, 2024
428be7f
renaming kraken2 task calls to just kraken instead of suffixing with …
jrotieno Sep 6, 2024
201e298
update output name
jrotieno Sep 6, 2024
3b99abf
renaming kraken outputs to kraken2
jrotieno Sep 6, 2024
7f672a0
additional kraken outputs
jrotieno Sep 6, 2024
0ea39a8
Merge branch 'main' into jro-kraken-split-database-and-task
jrotieno Sep 6, 2024
7be05c0
clearlabs outputs fix
jrotieno Sep 6, 2024
0ca1515
Merge branch 'jro-kraken-split-database-and-task' of https://github.c…
jrotieno Sep 6, 2024
9aa1681
updating RSV Kraken2 target organism identifiers and exposing the Kra…
jrotieno Sep 9, 2024
88f2279
md5sum
jrotieno Sep 9, 2024
f9a2ef9
inputs to manage CI errors
jrotieno Sep 17, 2024
b838b15
CI error, again!
jrotieno Sep 17, 2024
e10df04
adding a test kraken database for CI
jrotieno Sep 17, 2024
c388970
fix test theiacov inputs
jrotieno Sep 17, 2024
9e3eadf
md5sum
jrotieno Sep 20, 2024
bd6fba9
optional target_organism for theiacov SE
jrotieno Sep 20, 2024
8e8bc0b
md5sum
jrotieno Sep 20, 2024
d69d01d
new test database
jrotieno Sep 30, 2024
a2a68fb
updated test kraken database
jrotieno Sep 30, 2024
644f69e
md5sum
jrotieno Sep 30, 2024
912327c
update CI for kraken2 report in theiacov clearlabs, ilmn pe, and ilmn…
kapsakcj Sep 30, 2024
64a619e
update CI
cimendes Oct 7, 2024
258e5f3
Merge branch 'main' into jro-kraken-split-database-and-task
cimendes Oct 7, 2024
0d73768
update ci
cimendes Oct 7, 2024
07c081e
fiz ouput workflow name
cimendes Oct 7, 2024
c7925c4
update docs - kraken2 standalone
cimendes Oct 7, 2024
098d982
update ci again
cimendes Oct 7, 2024
0659d74
hide call_kraken from input table
cimendes Oct 17, 2024
d7c8795
update input table for TheiaCoV
cimendes Oct 17, 2024
576efa8
update outputs for theiacov
cimendes Oct 17, 2024
07736d4
report SC2 proportion only if target organisms is SC2 - TheiaCoV clea…
cimendes Oct 18, 2024
9ab5ac0
update CI
cimendes Oct 18, 2024
7bbf779
make TheiaCoV ONT compatible
cimendes Oct 18, 2024
84292df
update docs - theiacov outputs
cimendes Oct 18, 2024
cc69a98
CI once more
cimendes Oct 18, 2024
e4022ca
forgot to change output types
cimendes Oct 21, 2024
2629708
solve parsing issue - it was a BUG!!!! :bug:
cimendes Oct 21, 2024
6fe1875
no more bugs hopefully :buh:
cimendes Oct 21, 2024
729ba4a
this CI is never happy :bug:
cimendes Oct 21, 2024
e9969fc
Merge branch 'main' into jro-kraken-split-database-and-task
cimendes Oct 25, 2024
5fe7123
rename kraken2 outputs to match other theiacov, rename kraken2_db input
cimendes Oct 25, 2024
88bc2f7
kraken2_db
cimendes Oct 25, 2024
bb90802
kraken2_db
cimendes Oct 25, 2024
fe5e93d
kraken2_db
cimendes Oct 25, 2024
a28ff96
kraken -> kraken2
cimendes Oct 25, 2024
e529d19
kraken -> kraken2
cimendes Oct 25, 2024
72dc913
more kraken -> kraken2
cimendes Oct 25, 2024
9e17028
kraken -> kraken2 continued
cimendes Oct 25, 2024
bcddf62
more kraken -> kraken2
cimendes Oct 25, 2024
4b6d267
krakren -> kraken2
cimendes Oct 25, 2024
bc72236
last kraken -> kraken2 (ignoring nullabor)
cimendes Oct 25, 2024
283d7e1
fix output declaration
cimendes Oct 25, 2024
6e7e0ef
forgot about the ncbi_scrub standalone wfs again
cimendes Oct 25, 2024
35e9d74
update CI for theiaprok and making freyja_fastq functional with the n…
cimendes Oct 25, 2024
11d56dd
change output type
cimendes Oct 25, 2024
51b1c45
add missing pe
cimendes Oct 25, 2024
@@ -5,12 +5,12 @@ assembly_length_unambiguous 0.01
 assembly_mean_coverage 0.01
 irma_subtype EXACT
 irma_type EXACT
-kraken_human EXACT
-kraken_human_dehosted EXACT
-kraken_sc2 EXACT
-kraken_sc2_dehosted EXACT
-kraken_target_org EXACT
-kraken_target_org_dehosted EXACT
+kraken2_human EXACT
+kraken2_human_dehosted EXACT
+kraken2_sc2 EXACT
+kraken2_sc2_dehosted EXACT
+kraken2_target_org EXACT
+kraken2_target_org_dehosted EXACT
 nextclade_aa_dels SET
 nextclade_aa_subs SET
 nextclade_clade EXACT
44 changes: 22 additions & 22 deletions docs/workflows/genomic_characterization/freyja.md
@@ -146,13 +146,13 @@ This workflow runs on the sample level.
 | primer_trim | **memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
 | read_QC_trim_pe | **adapters** | File | A FASTA file containing adapter sequence | None | Optional |
 | read_QC_trim_pe | **bbduk_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
-| read_QC_trim_pe | **call_kraken** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken_db input parameter for this to run successfully | FALSE | Optional |
+| read_QC_trim_pe | **call_kraken** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken2_db input parameter for this to run successfully | FALSE | Optional |
 | read_QC_trim_pe | **call_midas** | Boolean | By default this is set to true to run MIDAS; set to false to skip MIDAS | FALSE | Optional |
 | read_QC_trim_pe | **fastp_args** | String | Additional arguments to use with fastp | "--detect_adapter_for_pe -g -5 20 -3 20" | Optional |
-| read_QC_trim_pe | **kraken_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
-| read_QC_trim_pe | **kraken_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional, Sometimes required |
-| read_QC_trim_pe | **kraken_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
-| read_QC_trim_pe | **kraken_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
+| read_QC_trim_pe | **kraken2_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
+| read_QC_trim_pe | **kraken2_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional, Sometimes required |
+| read_QC_trim_pe | **kraken2_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
+| read_QC_trim_pe | **kraken2_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
 | read_QC_trim_pe | **midas_db** | File | Database to use with MIDAS. Not required as one will be auto-selected when running the MIDAS task. | None | Optional, Sometimes required |
 | read_QC_trim_pe | **phix** | File | The file containing the phix sequence to be used during bbduk task | None | Optional |
 | read_QC_trim_pe | **read_processing** | String | Options: "trimmomatic" or "fastp" to indicate which read trimming module to use | "trimmomatic" | Optional |
@@ -161,26 +161,26 @@ This workflow runs on the sample level.
 | read_QC_trim_pe | **trim_quality_trim_score** | Int | The minimum quality score to keep during trimming | 30 | Optional |
 | read_QC_trim_pe | **trim_window_size** | Int | The window size to use during trimming | 4 | Optional |
 | read_QC_trim_pe | **trimmomatic_args** | String | Additional command-line arguments to use with trimmomatic | None | Optional |
-| read_QC_trim_ont | **call_kraken** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken_db input parameter for this to run successfully | FALSE | Optional |
+| read_QC_trim_ont | **call_kraken2** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken2_db input parameter for this to run successfully | FALSE | Optional |
 | read_QC_trim_ont | **downsampling_coverage** | Float | The depth to downsample to with Rasusa. Internal component. Do not modify. | 150 | Do not modify, Optional |
 | read_QC_trim_ont | **genome_length** | Int | Internal component. Do not modify | None | Do not modify, Optional |
-| read_QC_trim_ont | **kraken_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
-| read_QC_trim_ont | **kraken_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional |
-| read_QC_trim_ont | **kraken_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
-| read_QC_trim_ont | **kraken_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
+| read_QC_trim_ont | **kraken2_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
+| read_QC_trim_ont | **kraken2_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional |
+| read_QC_trim_ont | **kraken2_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
+| read_QC_trim_ont | **kraken2_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
 | read_QC_trim_ont | **max_length** | Int | Internal component, do not modify | | Do not modify, Optional |
 | read_QC_trim_ont | **min_length** | Int | Internal component, do not modify | | Do not modify, Optional |
 | read_QC_trim_ont | **run_prefix** | String | Internal component, do not modify | | Do not modify, Optional |
 | read_QC_trim_ont | **target_organism** | String | This string is searched for in the kraken2 outputs to extract the read percentage | | Optional |
 | read_QC_trim_se | **adapters** | File | A FASTA file containing adapter sequence | None | Optional |
 | read_QC_trim_se | **bbduk_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
-| read_QC_trim_se | **call_kraken** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken_db input parameter for this to run successfully | FALSE | Optional |
+| read_QC_trim_se | **call_kraken** | Boolean | By default this is set to false to skip kraken2; set to true to run kraken2 but a database must be also provided via the kraken2_db input parameter for this to run successfully | FALSE | Optional |
 | read_QC_trim_se | **call_midas** | Boolean | By default this is set to true to run MIDAS; set to false to skip MIDAS | FALSE | Optional |
 | read_QC_trim_se | **fastp_args** | String | Additional arguments to use with fastp | "--detect_adapter_for_pe -g -5 20 -3 20" | Optional |
-| read_QC_trim_se | **kraken_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
-| read_QC_trim_se | **kraken_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional |
-| read_QC_trim_se | **kraken_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
-| read_QC_trim_se | **kraken_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
+| read_QC_trim_se | **kraken2_cpu** | Int | Number of CPUs to allocate to the task | 4 | Optional |
+| read_QC_trim_se | **kraken2_db** | File | A kraken2 database to use with the kraken2 optional task. The file must be a .tar.gz kraken2 database. | None | Optional |
+| read_QC_trim_se | **kraken2_disk_size** | Int | Amount of storage (in GB) to allocate to the task | 100 | Optional |
+| read_QC_trim_se | **kraken2_memory** | Int | Amount of memory/RAM (in GB) to allocate to the task | 8 | Optional |
 | read_QC_trim_se | **midas_db** | File | Database to use with MIDAS. Not required as one will be auto-selected when running the MIDAS task. | None | Optional, Sometimes required |
 | read_QC_trim_se | **phix** | File | The file containing the phix sequence to be used during bbduk task | None | Optional |
 | read_QC_trim_se | **read_processing** | String | Options: "trimmomatic" or "fastp" to indicate which read trimming module to use | "trimmomatic" | Optional |
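
Reviewer note: the `call_kraken` / `call_kraken2` rows in this hunk state that enabling Kraken2 also requires a database via `kraken2_db`, and the `kraken2_db` rows state that the file must be a `.tar.gz` archive. A minimal pre-submission check could look like the Python sketch below. The function name `check_kraken2_inputs` and the flat dictionary of task-level inputs are assumptions made for illustration; they are not part of the workflow or of this PR.

```python
def check_kraken2_inputs(inputs, flag="call_kraken"):
    """Return a list of problems found in a dict of task-level inputs.

    `inputs` is a hypothetical flat mapping of input name -> value.
    `flag` is "call_kraken" for the PE/SE tasks and "call_kraken2" for
    the ONT task, following the input tables in this diff.
    """
    problems = []
    # The flag defaults to false, in which case kraken2 is skipped entirely.
    if inputs.get(flag, False):
        db = inputs.get("kraken2_db")
        if not db:
            problems.append(f"{flag} is true but kraken2_db is not set")
        elif not str(db).endswith(".tar.gz"):
            problems.append("kraken2_db must be a .tar.gz kraken2 database")
    return problems


if __name__ == "__main__":
    # Example: kraken2 requested without a database.
    print(check_kraken2_inputs({"call_kraken": True}))
```

The check only inspects the file name, so it cannot confirm that the archive actually contains a valid Kraken2 database.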
@@ -363,13 +363,13 @@ The main output file used in subsequent Freyja workflows is found under the `fre
 | freyja_variants | File | The TSV file containing the variants identified by Freyja | ONT, PE, SE |
 | freyja_version | String | version of Freyja used | ONT, PE, SE |
 | ivar_version_primtrim | String | Version of iVar for running the iVar trim command | ONT, PE, SE |
-| kraken_human | Float | Percent of human read data detected using the Kraken2 software | ONT, PE, SE |
-| kraken_human_dehosted | Float | Percent of human read data detected using the Kraken2 software after host removal | ONT, PE, SE |
-| kraken_report | File | Full Kraken report | ONT, PE, SE |
-| kraken_report_dehosted | File | Full Kraken report after host removal | ONT, PE, SE |
-| kraken_sc2 | Float | Percent of SARS-CoV-2 read data detected using the Kraken2 software | ONT, PE, SE |
-| kraken_sc2_dehosted | Float | Percent of SARS-CoV-2 read data detected using the Kraken2 software after host removal | ONT, PE, SE |
-| kraken_version | String | Version of Kraken software used | ONT, PE, SE |
+| kraken2_human | Float | Percent of human read data detected using the Kraken2 software | ONT, PE, SE |
+| kraken2_human_dehosted | Float | Percent of human read data detected using the Kraken2 software after host removal | ONT, PE, SE |
+| kraken2_report | File | Full Kraken report | ONT, PE, SE |
+| kraken2_report_dehosted | File | Full Kraken report after host removal | ONT, PE, SE |
+| kraken2_sc2 | Float | Percent of SARS-CoV-2 read data detected using the Kraken2 software | ONT, PE, SE |
+| kraken2_sc2_dehosted | Float | Percent of SARS-CoV-2 read data detected using the Kraken2 software after host removal | ONT, PE, SE |
+| kraken2_version | String | Version of Kraken software used | ONT, PE, SE |
 | minimap2_docker | String | Docker image used to run minimap2 | ONT |
 | minimap2_version | String | Version of minimap2 used | ONT |
 | nanoplot_html_clean | File | Clean read file | ONT |
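
Reviewer note: because this hunk renames the seven `kraken_*` outputs to `kraken2_*`, anything downstream that reads the old column names will need updating. The Python sketch below shows one way to migrate an existing row of results. The mapping is taken directly from the rows in this hunk; the helper name `migrate_row` and the idea of a results row held as a dictionary are assumptions for illustration only.

```python
# Old output name -> new output name, as listed in the outputs table diff.
RENAMED_OUTPUTS = {
    "kraken_human": "kraken2_human",
    "kraken_human_dehosted": "kraken2_human_dehosted",
    "kraken_report": "kraken2_report",
    "kraken_report_dehosted": "kraken2_report_dehosted",
    "kraken_sc2": "kraken2_sc2",
    "kraken_sc2_dehosted": "kraken2_sc2_dehosted",
    "kraken_version": "kraken2_version",
}


def migrate_row(row):
    """Return a copy of `row` with old kraken_* keys renamed to kraken2_*.

    Keys that are not in RENAMED_OUTPUTS are left unchanged.
    """
    return {RENAMED_OUTPUTS.get(key, key): value for key, value in row.items()}


if __name__ == "__main__":
    # Hypothetical row from a results table produced before the rename.
    old_row = {"kraken_sc2": 99.1, "freyja_version": "x"}
    print(migrate_row(old_row))
```

An explicit mapping is used rather than a blanket string replacement so that unrelated columns that happen to contain "kraken" are not touched.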