[Taxon_Tables] Synchronize task_broad_terra_tools.wdl with current workflow outputs #741

Closed
wants to merge 9 commits into from

@xonq xonq commented Jan 31, 2025

This PR closes #444

🗑️ This dev branch should be deleted after merging to main.

🧠 Summary

  • Synchronizes task_broad_terra_tools.wdl inputs, downstream dependencies' calls, and table output to correspond with current downstream workflow outputs.
  • Implements a partially automated solution for rectifying future desynchronization via PR 19 in the Theiagen utilities repo

⚡ Impacted Workflows/Tasks

  • task_broad_terra_tools.wdl
  • wf_theiaprok_fasta.wdl
  • wf_theiaprok_illumina_se.wdl
  • wf_theiaprok_illumina_pe.wdl
  • wf_theiaprok_ont.wdl

This PR may lead to different results in pre-existing outputs: Yes (table outputs are modified) !!! THIS WILL MAKE PREVIOUS TAXON TABLES GENERATED BY EARLIER VERSIONS INCOMPATIBLE WITH CONCATENATION DUE TO DISCREPANT COLUMN NAMING !!!
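One way to check whether two taxon tables can still be concatenated is to compare their header rows first. The sketch below assumes the tables are tab-separated with a single header line; the function name `header_diff` and the column names in the example are hypothetical and are not taken from this PR.

```python
import csv
import io


def header_diff(old_tsv: str, new_tsv: str) -> tuple[list[str], list[str]]:
    """Return (columns only in the old table, columns only in the new table)."""
    # Read only the first row of each table, which is assumed to be the header.
    old_cols = next(csv.reader(io.StringIO(old_tsv), delimiter="\t"))
    new_cols = next(csv.reader(io.StringIO(new_tsv), delimiter="\t"))
    only_old = [c for c in old_cols if c not in new_cols]
    only_new = [c for c in new_cols if c not in old_cols]
    return only_old, only_new


# Hypothetical tables: one column was renamed between versions.
old_table = "samplename\told_column_name\nsample1\tA\n"
new_table = "samplename\tnew_column_name\nsample2\tB\n"

only_old, only_new = header_diff(old_table, new_table)
print("Only in old table:", only_old)
print("Only in new table:", only_new)
```

If either list is non-empty, the tables would need their columns reconciled before concatenation.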

This PR uses an element that could cause duplicate runs to have different results: No

🛠️ Changes

  • Updated task_broad_terra_tools.wdl to synchronize with downstream Theia-level workflows
  • Developed a Python script to parse downstream dependencies of a WDL script, scrape the workflows' outputs, and highlight discrepancies for removal and population in task_broad_terra_tools.wdl, dependencies' calls, and documentation
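The comparison described above can be sketched roughly as follows. This is not the script from the utilities repo; it is a minimal illustration that assumes simple, one-declaration-per-line WDL, uses a regular expression instead of a real WDL parser, and only reads the first `input`/`output` block in each file. The helper names and `example_new_output` are hypothetical.

```python
import re

# Simplified declaration pattern: a type with no spaces, then a name,
# optionally followed by "=". Types such as "Map[String, String]" are skipped.
DECL = re.compile(r"^\s*[A-Z]\S*\s+([A-Za-z_]\w*)\s*(?:=|$)")


def declared_names(wdl_text: str, section: str) -> set[str]:
    """Collect names declared in the first `section { ... }` block."""
    names: set[str] = set()
    inside = False
    depth = 0
    for line in wdl_text.splitlines():
        code = line.split("#", 1)[0]  # naive comment stripping
        if not inside:
            if re.match(rf"^\s*{section}\s*\{{", code):
                inside, depth = True, 1
            continue
        depth += code.count("{") - code.count("}")
        if depth <= 0:
            break  # reached the closing brace of the block
        match = DECL.match(code)
        if match:
            names.add(match.group(1))
    return names


def find_discrepancies(task_wdl: str, workflow_wdl: str) -> dict[str, list[str]]:
    """Flag task inputs with no workflow counterpart and outputs the task lacks."""
    task_inputs = declared_names(task_wdl, "input")
    wf_inputs = declared_names(workflow_wdl, "input")
    wf_outputs = declared_names(workflow_wdl, "output")
    return {
        "remove_from_task": sorted(task_inputs - (wf_inputs | wf_outputs)),
        "add_to_task": sorted(wf_outputs - task_inputs),
    }


TASK = """
task export_taxon_tables {
  input {
    String samplename
    String? genotyphi_genotype_confidence
  }
}
"""

WORKFLOW = """
workflow example {
  input {
    String samplename
  }
  output {
    String? example_new_output = some_task.example_new_output
  }
}
"""

report = find_discrepancies(TASK, WORKFLOW)
print(report)
```

A regex-based approach like this is fragile with multi-line declarations or nested types, so a real implementation would likely rely on a proper WDL parser.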

NOTE: This update ensures task_broad_terra_tools.wdl ONLY accepts inputs that exactly match downstream dependencies' inputs or outputs. Any value that a Theia workflow previously passed to task_broad_terra_tools.wdl without also exposing it as a workflow output has been removed from task_broad_terra_tools.wdl and from the corresponding call. For example, genotyphi_genotype_confidence = merlin_magic.genotyphi_genotype_confidence was removed from wf_theiaprok_fasta.wdl's call.

Some inputs to the terra_tools.export_taxon_tables call have only been renamed to match the workflow's variable names.

⚙️ Algorithm

Modified the outputs in the table generated by task_broad_terra_tools.wdl and adjusted its I/O to accommodate downstream workflows' outputs.

➡️ Inputs

Please refer to Files Changed. All I/O changes are to task_broad_terra_tools.wdl and to the export_taxon_tables calls in downstream workflows.

⬅️ Outputs

Please refer to Files Changed. All I/O changes are to task_broad_terra_tools.wdl and to the export_taxon_tables calls in downstream workflows.

🧪 Testing

Suggested Scenarios for Reviewer to Test

n/a. A line-by-line review of removed and newly exposed I/O is recommended to ensure the desired function is implemented.

🔬 Final Developer Checklist

  • The workflow/task has been tested and results, including file contents, are as anticipated
  • The CI/CD has been adjusted and tests are passing (Theiagen developers)
  • Code changes follow the style guide
  • Documentation and/or workflow diagrams have been updated if applicable
    • You have updated the "Last Known Changes" field for any affected workflows in the respective workflow documentation page and for every entry in the three workflows_overview tables to be the tag for the next upcoming release. If you do not know the tag, please put "vX.X.X"

🎯 Reviewer Checklist

  • All changed results have been confirmed
  • You have tested the PR appropriately (see the testing guide for more information)
  • All code adheres to the style guide
  • MD5 sums have been updated
  • The PR author has addressed all comments
  • The documentation has been updated

@xonq xonq marked this pull request as ready for review February 7, 2025 22:51
@xonq xonq requested a review from a team as a code owner February 7, 2025 22:51
@sage-wright

Closing in favor of improved method 👍

@xonq xonq deleted the kzm-taxon_tables-dev branch February 19, 2025 22:21
Successfully merging this pull request may close these issues.

[taxon table] update the taxon tables to match workflow outputs