Raredisease add GC and AT dropout quality check #3838

Open
wants to merge 3 commits into
base: master
from

Conversation

peterpru
Member

@peterpru peterpru commented Oct 11, 2024

Description

In raredisease we want to add GC dropout and AT dropout QC checks so that we do not deliver low-quality data. Details on the issue: https://github.com/Clinical-Genomics/MTP-RAREDISEASE/issues/25

Requested cutoffs by production management:
WES: 10 for GC and AT dropout
WGS: 5 for GC and AT dropout

The suggested solution is to set the cutoff to 10 by default for all sample types and to override it to 5 when the analysis type of a sample is WGS.
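As a rough illustration of the default-then-override rule described above, here is a minimal sketch. Only the cutoff values (10 and 5) come from the request; the function names, the lowercase "wgs"/"wes" strings, and the assumption that a sample passes when its dropout values are below the cutoff are hypothetical and not taken from the cg code base.

```python
# Hypothetical constants; only the values 10 and 5 come from the request above.
DEFAULT_DROPOUT_CUTOFF = 10  # WES and any other sample type
WGS_DROPOUT_CUTOFF = 5       # stricter cutoff for WGS


def get_dropout_cutoff(analysis_type: str) -> int:
    """Return the AT/GC dropout cutoff for a sample's analysis type."""
    if analysis_type == "wgs":
        return WGS_DROPOUT_CUTOFF
    return DEFAULT_DROPOUT_CUTOFF


def passes_dropout_qc(analysis_type: str, at_dropout: float, gc_dropout: float) -> bool:
    """Assumption: a sample passes when both dropout values are below the cutoff."""
    cutoff = get_dropout_cutoff(analysis_type)
    return at_dropout < cutoff and gc_dropout < cutoff
```

Under these assumptions, a sample with an AT dropout of 7.5 would pass as WES but fail as WGS.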

Added

  • GC dropout QC check to raredisease
  • AT dropout QC check to raredisease

Changed

Fixed

How to prepare for test

  • SSH to the relevant server (depending on the type of change)
  • Use stage: us
  • Reserve (paxa) the environment: paxa
  • Install on stage (example for Hasta):
    bash /home/proj/production/servers/resources/hasta.scilifelab.se/update-tool-stage.sh -e S_cg -t cg -b raredisease-add-qc-check-at-gc-dropout

How to test

  • Run cg workflow raredisease metrics-deliver amusingmarmoset; it should complete successfully.

Expected test outcome

  • Check that it completes successfully, and that the thresholds are visible in the output.
  • Take a screenshot and attach or copy/paste the output.

Review

  • Tests executed by
  • "Merge and deploy" approved by
    Thanks for filling in who performed the code review and the test!

This version is a

  • MAJOR - when you make incompatible API changes
  • MINOR - when you add functionality in a backwards compatible manner
  • PATCH - when you make backwards compatible bug fixes or documentation/instructions

Implementation Plan

  • Document in ...
  • Deploy this branch on ...
  • Inform to ...

@peterpru peterpru self-assigned this Oct 11, 2024

@peterpru peterpru marked this pull request as ready for review January 16, 2025 13:38
@peterpru peterpru requested a review from a team as a code owner January 16, 2025 13:38
@peterpru peterpru added the raredisease Issues related to nextflow raredisease pipeline label Jan 16, 2025
Comment on lines +235 to +242
@staticmethod
def set_dropout_cutoff_by_analysis_type(sample: Sample, metric_conditions: dict) -> None:
    if (
        sample.application_version.application.analysis_type
        == SeqLibraryPrepCategory.WHOLE_GENOME_SEQUENCING
    ):
        metric_conditions["AT_DROPOUT"]["threshold"] = 5
        metric_conditions["GC_DROPOUT"]["threshold"] = 5
Contributor

@ChrOertlin ChrOertlin Jan 16, 2025

This logic sets the threshold for the AT and GC dropout to 5 in the case of WGS. I think it would be better to predefine a full set of raredisease WGS metrics and fetch the whole set when required.

I fear this logic will be lost later on, especially if it is undocumented, whereas having two collections of QC thresholds would be clearer. Plus, if other metrics later require different thresholds, we can modify the collection rather than writing more functions with if statements.

My suggestion:

  1. Create RAREDISEASE_WGS_METRICS_CONDITIONS
  2. Create a function `fetch_raredisease_metrics_conditions(prep_category)`
  3. Use the set of fetched metrics
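The three steps above could be sketched roughly as follows. The names RAREDISEASE_WGS_METRICS_CONDITIONS and fetch_raredisease_metrics_conditions come from the suggestion; everything else is an assumption made for illustration, including the default collection name, the "norm" key, and the plain "wgs" string standing in for SeqLibraryPrepCategory.WHOLE_GENOME_SEQUENCING. The real collections would also hold the other raredisease metrics.

```python
from copy import deepcopy

# Assumed default collection (WES and other prep categories).
# The name and the "norm" key are illustrative only.
RAREDISEASE_METRICS_CONDITIONS = {
    "AT_DROPOUT": {"norm": "lt", "threshold": 10},
    "GC_DROPOUT": {"norm": "lt", "threshold": 10},
}

# Step 1: a separate, fully spelled-out collection for WGS.
RAREDISEASE_WGS_METRICS_CONDITIONS = {
    "AT_DROPOUT": {"norm": "lt", "threshold": 5},
    "GC_DROPOUT": {"norm": "lt", "threshold": 5},
}


# Step 2: one function that picks the collection for a prep category.
def fetch_raredisease_metrics_conditions(prep_category: str) -> dict:
    """Return a fresh copy of the metric conditions for the prep category."""
    if prep_category == "wgs":  # stand-in for the WGS enum member
        return deepcopy(RAREDISEASE_WGS_METRICS_CONDITIONS)
    return deepcopy(RAREDISEASE_METRICS_CONDITIONS)


# Step 3: use the fetched set.
wgs_conditions = fetch_raredisease_metrics_conditions("wgs")
wes_conditions = fetch_raredisease_metrics_conditions("wes")
```

Returning a deep copy is a design choice in this sketch: callers can adjust the returned dictionary without changing the module-level collections, so one sample's thresholds cannot leak into the next.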
