Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[TheiaProk Workflows] Add Kraken2 as optional module #286

Merged
merged 17 commits into from
Jan 8, 2024

Conversation

cimendes
Copy link
Member

@cimendes cimendes commented Dec 22, 2023

Closes #279

🛠️ Changes Being Made

This PR adds Kraken2 as an optional module in the TheiaProk suite of workflows (Illumina PE and SE). Once PR #240 gets merged, Kraken2 will be possible to be added as an optional module for TheiaProk_ONT

Impacted Workflows/Tasks

  • TheiaProk PE
  • TheiaProk SE

🧠 Context and Rationale

Requested to facilitate contamination detection. A database must be provided to the module.

📋 Workflow/Task Steps

Kraken2 was integrated on to the Trim SE and Trim PE sub-workflows

Inputs

New optional inputs:

Boolean call_kraken = false
File? kraken_db

Outputs

New optional outputs for TheiaProk:

String? kraken2_version = read_QC_trim.kraken_version
File? kraken2_report = read_QC_trim.kraken_report
String? kraken2_docker = read_QC_trim.kraken_docker

Impacted Outputs

🧪 Testing

Underway

Locally

Not tested locally

Terra

Kraken2 Database used: gs://theiagen-public-files-rp/terra/theiaprok-files/k2_standard_08gb_20230605.tar.gz

Scenarios for Reviewer to Test

Check for new functionality:

  • TheiaProk SE
  • TheiaProk PE

Check that no conflicts were created:

  • TheiaCoV PE
  • TheiaCoV SE

🔬 Quality checks

Pull Request (PR) checklist:

  • Include a description of what is in this pull request in this message.
  • The workflow/task has been tested locally and on Terra
  • The CI/CD has been adjusted and tests are passing
  • Everything follows the style guide

@cimendes cimendes requested a review from kevinlibuit January 3, 2024 16:43
@cimendes cimendes marked this pull request as ready for review January 3, 2024 17:50
@@ -107,6 +109,17 @@ workflow read_QC_trim_pe {
midas_db = midas_db
}
}
if ("~{workflow_series}" == "theiaprok") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I like how this is done by taking advantage of our workflow series tag

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a suggestion by @michellescribner ! :D

@@ -52,6 +52,7 @@ task kraken2_theiacov {
Float percent_sc2 = read_float("PERCENT_SC2")
String percent_target_org = read_string("PERCENT_TARGET_ORG")
String? kraken_target_org = target_org
String docker = "us-docker.pkg.dev/general-theiagen/staphb/kraken2:2.0.8-beta_hv"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, great catch. Didn't realize we weren't capturing this already.

input:
samplename = samplename,
read1 = read1_raw,
kraken2_db = select_first([kraken_db])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These things always throw me off. Just to make sure I have this right: since the kraken2_db workflow variable is optional, but the task variable is obligate, we have to use this select_first() function, otherwise the WDL syntax check will prevent it from importing into Terra--is that correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! it even fails our internal CI with miniwdl check!

kevinlibuit
kevinlibuit previously approved these changes Jan 7, 2024
Copy link
Contributor

@kevinlibuit kevinlibuit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code changes look great and I appreciate the functional tests on Terra from @michellescribner. All my comments are really just comments and questions rather than any need for modifications. @cimendes this PR is ready for merge!

Copy link
Contributor

@kevinlibuit kevinlibuit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, wait a minute. Just realizing now that @michellescribner's tests with TheiaProk properly produced the optional Kraken output, but that the TheiaCoV workflows are failling.

@kevinlibuit kevinlibuit dismissed their stale review January 7, 2024 18:30

Dismissing initial approval based on theiacov failures

@kevinlibuit
Copy link
Contributor

Seemingly an issue with a required input and not changes made in this PR. Relaunching some of @michellescribner's tests with inputs defined to verify:

Copy link
Contributor

@kevinlibuit kevinlibuit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good after re-running those TheiaCoV samples with all required inputs: https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history

Copy link
Contributor

@michellescribner michellescribner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job Ines!

@michellescribner
Copy link
Contributor

Last minute I also launched a quick function test for TheiaProk without implementing the kraken module https://app.terra.bio/#workspaces/theiagen-validations/Theiagen_Scribner_Sandbox/job_history/2f7adb95-9df7-4ca6-9ffe-ee0efbd04711

@cimendes cimendes merged commit b3503f0 into main Jan 8, 2024
17 checks passed
@cimendes cimendes deleted the im-kraken2-to-theiaprok branch January 8, 2024 18:47
@cimendes
Copy link
Member Author

cimendes commented Jan 9, 2024

The documentation has been updated to reflect this merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[TheiaProk] Add Kraken2 as optional module
3 participants