-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TheiaProk Workflows] Add Kraken2 as optional module #286
Conversation
…tion from cromwell JSON
@@ -107,6 +109,17 @@ workflow read_QC_trim_pe { | |||
midas_db = midas_db | |||
} | |||
} | |||
if ("~{workflow_series}" == "theiaprok") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I like how this is done by taking advantage of our workflow series tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a suggestion by @michellescribner ! :D
@@ -52,6 +52,7 @@ task kraken2_theiacov { | |||
Float percent_sc2 = read_float("PERCENT_SC2") | |||
String percent_target_org = read_string("PERCENT_TARGET_ORG") | |||
String? kraken_target_org = target_org | |||
String docker = "us-docker.pkg.dev/general-theiagen/staphb/kraken2:2.0.8-beta_hv" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, great catch. Didn't realize we weren't capturing this already.
input: | ||
samplename = samplename, | ||
read1 = read1_raw, | ||
kraken2_db = select_first([kraken_db]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These things always throw me off. Just to make sure I have this right: since the kraken2_db
workflow variable is optional, but the task variable is obligate, we have to use this select_first() function, otherwise the WDL syntax check will prevent it from importing into Terra--is that correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes! it even fails our internal CI with miniwdl check!
Launched TheiaProk tests: Launched TheiaCoV function tests: Also gave TheiaProk_ONT_PHB a function test: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code changes look great and I appreciate the functional tests on Terra from @michellescribner. All my comments are really just comments and questions rather than any need for modifications. @cimendes this PR is ready for merge!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, wait a minute. Just realizing now that @michellescribner's tests with TheiaProk properly produced the optional Kraken output, but that the TheiaCoV workflows are failling.
Dismissing initial approval based on theiacov failures
Seemingly an issue with a required input and not changes made in this PR. Relaunching some of @michellescribner's tests with inputs defined to verify: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good after re-running those TheiaCoV samples with all required inputs: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job Ines!
Last minute I also launched a quick function test for TheiaProk without implementing the kraken module https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/2f7adb95-9df7-4ca6-9ffe-ee0efbd04711 |
The documentation has been updated to reflect this merge! |
Closes #279
🛠️ Changes Being Made
This PR adds Kraken2 as an optional module in the TheiaProk suite of workflows (Illumina PE and SE). Once PR #240 gets merged, Kraken2 will be possible to be added as an optional module for TheiaProk_ONT
Impacted Workflows/Tasks
🧠 Context and Rationale
Requested to facilitate contamination detection. A database must be provided to the module.
📋 Workflow/Task Steps
Kraken2 was integrated on to the Trim SE and Trim PE sub-workflows
Inputs
New optional inputs:
Outputs
New optional outputs for TheiaProk:
Impacted Outputs
🧪 Testing
UnderwayLocally
Not tested locally
Terra
Kraken2 Database used:
gs://theiagen-public-files-rp/terra/theiaprok-files/k2_standard_08gb_20230605.tar.gz
-TheiaProk_Illumina_SE: 23 Kleb pneumoniae: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Mendes_Sandbox/job_history/5bb4c1e4-5977-45e9-973a-c6ae3264b3dd (6 expected failures due to missing read data)
Scenarios for Reviewer to Test
Check for new functionality:
Check that no conflicts were created:
🔬 Quality checks
Pull Request (PR) checklist: