[TheiaCov] wfs add percentage_mapped_reads #641
…d stats_n_coverage task (0dc9640 to 33ab026)
Also, I was wrong: you do need to provide a default value for percentage_mapped_reads in case read_screen fails (see here). Could you coerce all of these non-optional outputs into Strings and provide
Thanks!
Just as a side note, do you have any examples where it's not 100%?
No, I'd have to try and find data, unless you have any data that might produce this kind of result.
Those updates should be passing tests now. Let me know if you need me to find more testing data after your tests have completed.
tasks/quality_control/basic_statistics/task_assembly_metrics.wdl
FYI, I'm sharing this with 1 lab & user, so please keep the dev branch available after merging. I'll pass along any feedback I hear from them.
🌟
This PR closes #507
🗑️ This dev branch should be deleted after merging to main.
🧠 Summary
This PR adds a percentage_mapped_reads output to the TheiaCov workflows listed below, covering both flu and non-flu organisms so that the value is reported consistently under a single output name.
⚡ Impacted Workflows/Tasks
theiacov_ont.wdl
theiacov_illumina_pe.wdl
theiacov_illumina_se.wdl
theiacov_clearlabs.wdl
This PR may lead to different results in pre-existing outputs: No
This PR uses an element that could cause duplicate runs to have different results: No
🛠️ Changes
Added unified output for percentage_mapped_reads across theiacov_ont, theiacov_illumina_pe, theiacov_illumina_se, and theiacov_clearlabs workflows.
Consolidated flu and non-flu percentage mapped reads using select_first to ensure a single output variable for mapped reads.
Refined logic for flu and non-flu workflows to ensure correct handling of percentage_mapped_reads.
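The consolidation described above relies on the semantics of WDL's select_first, which returns the first defined value in an array and fails if none is defined. The workflows themselves are written in WDL; the following is only a Python model of that behavior, not the PR's code. The function `unified_percentage_mapped_reads`, the order in which the flu and non-flu values are tried, and the empty-string default are all assumptions made for illustration.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


def select_first(values: Sequence[Optional[T]]) -> T:
    """Return the first value that is not None, modelling WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    # WDL also fails when the array is empty or contains only undefined values.
    raise ValueError("select_first: no defined value in the array")


def unified_percentage_mapped_reads(
    flu_value: Optional[str],
    non_flu_value: Optional[str],
    default: str = "",  # hypothetical placeholder, not the PR's actual default
) -> str:
    """Collapse the flu and non-flu results into one output value.

    Only one branch normally runs, so at most one input is defined. The
    trailing default keeps the output defined when neither branch ran,
    for example when an upstream screen fails.
    """
    return select_first([flu_value, non_flu_value, default])
```

Placing a default last in the array is what lets a non-optional String output stay defined when both branches are skipped.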
⚙️ Algorithm
No major algorithmic changes were introduced, but the logic for flu and non-flu organisms in calculating percentage_mapped_reads was updated to call different tasks.
For iVar-based workflows (theiacov_illumina_pe, theiacov_illumina_se), the percentage is parsed from the samtools flagstat file.
For non-iVar workflows, the assembled_reads_percent task is used to pass in BAM files and calculate mapped reads.
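As a rough illustration of the flagstat parsing mentioned above, the sketch below pulls the mapped percentage out of samtools flagstat text. It is not the code used in the PR, which is not shown here. The function name, the regular expression, and the "0.0" fallback are assumptions; the sample text imitates the usual flagstat layout, where the mapped line reads like `950 + 0 mapped (95.00% : N/A)`.

```python
import re

# Matches the overall "mapped" line only; lines such as "primary mapped" or
# "with itself and mate mapped" have extra words before "mapped" and are skipped.
_MAPPED_LINE = re.compile(r"^\d+ \+ \d+ mapped \(([\d.]+)%", re.MULTILINE)


def percentage_mapped_reads(flagstat_text: str, default: str = "0.0") -> str:
    """Return the mapped-read percentage as a string, or `default` if absent.

    The default (a hypothetical value) covers inputs with no reads, where
    flagstat reports "N/A" instead of a percentage.
    """
    match = _MAPPED_LINE.search(flagstat_text)
    return match.group(1) if match else default


example = """1000 + 0 in total (QC-passed reads + QC-failed reads)
1000 + 0 primary
0 + 0 secondary
0 + 0 supplementary
950 + 0 mapped (95.00% : N/A)
950 + 0 primary mapped (95.00% : N/A)
"""

print(percentage_mapped_reads(example))
```

Returning a string rather than a float matches the review request to coerce these outputs into Strings, and the fallback gives downstream outputs a defined value when no percentage can be parsed.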
➡️ Inputs
No new inputs were added.
⬅️ Outputs
The following outputs were updated or added:
percentage_mapped_reads:
For non-flu organisms, taken from ivar_consensus.percentage_mapped_reads in iVar-based workflows, or from stats_n_coverage.percentage_mapped_reads in ONT workflows.
🧪 Testing
Tested both flu and non-flu cases across workflows, ensuring the correct mapping of percentage_mapped_reads.
TheiaCov_ONT (non flu)
https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Combe_Sandbox/job_history/a158d7fc-aac7-4c9e-8782-e8d96afe059a
TheiaCov_ONT (flu)
https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Combe_Sandbox/job_history/fd988d1f-5a75-4b7f-b1e5-ce293a345a13
TheiaCov_illumina_pe (flu)
https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Combe_Sandbox/job_history/3ca0d049-31b4-4179-b2ce-20753946f40c/ec42f3a3-7b30-4756-aa72-39640adb92a9
TheiaCov_illumina_pe (non-flu)
https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Combe_Sandbox/job_history/cd90a924-8fe9-48c3-814e-d4fce8453632
TheiaCov_illumina_se
https://app.terra.bio/#workspaces/theiagen-training-workspaces/Theiagen_Combe_Sandbox/job_history/dcfb1141-a483-4eb3-b280-7377a1df4a4e
Suggested Scenarios for Reviewer to Test
Run workflows with flu and non-flu samples to ensure the correct assignment of percentage_mapped_reads.
Validate the unified logic correctly handles percentage_mapped_reads for both flu and non-flu workflows, particularly for iVar-based and non-iVar-based workflows.
🔬 Final Developer Checklist
🎯 Reviewer Checklist