Re-implementation of the CADA phenotype-based prioritization algorithm

varfish-org/cada-prio

CADA: The Next Generation

This is a re-implementation of the CADA method for phenotype-similarity prioritization.

Running Hyperparameter Tuning

Install with the tune feature enabled:

pip install "cada-prio[tune]"

Run tuning, e.g., on the "classic" model. Because optuna keeps the study state in the database, you can run several processes in parallel as long as they all point to the same database. In the example below, each run uses 4 CPUs and performs 1 trial.

cada-prio tune run-optuna \
    sqlite:///local_data/cada-tune.sqlite \
    --path-hgnc-json data/classic/hgnc_complete_set.json \
    --path-hpo-genes-to-phenotype data/classic/genes_to_phenotype.all_source_all_freqs_etc.txt \
    --path-hpo-obo data/classic/hp.obo \
    --path-clinvar-phenotype-links data/classic/cases_train.jsonl \
    --path-validation-links data/classic/cases_validate.jsonl \
    --n-trials 1 \
    --cpus=4
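
The parallel setup described above could be scripted roughly as follows. This is a minimal sketch, not part of cada-prio: the function name run_parallel_tuning and its positional arguments are hypothetical, and the cada-prio flags are copied unchanged from the command above. It starts several workers in the background, all pointing at the same database, and waits for them to finish.

```shell
#!/usr/bin/env bash
# Sketch: start several tuning workers that share one optuna database.
# run_parallel_tuning and its arguments are hypothetical helper names;
# the cada-prio flags are the ones used in the single-run example.

run_parallel_tuning() {
    local n_workers="${1:-2}"    # number of concurrent worker processes
    local cpus="${2:-4}"         # CPUs used by each worker
    local data_dir="${3:-data/classic}"
    local storage="${4:-sqlite:///local_data/cada-tune.sqlite}"
    local i

    for i in $(seq 1 "$n_workers"); do
        # Each worker performs one trial and records it in the shared database.
        cada-prio tune run-optuna \
            "$storage" \
            --path-hgnc-json "$data_dir/hgnc_complete_set.json" \
            --path-hpo-genes-to-phenotype "$data_dir/genes_to_phenotype.all_source_all_freqs_etc.txt" \
            --path-hpo-obo "$data_dir/hp.obo" \
            --path-clinvar-phenotype-links "$data_dir/cases_train.jsonl" \
            --path-validation-links "$data_dir/cases_validate.jsonl" \
            --n-trials 1 \
            --cpus="$cpus" &
    done

    # Block until every background worker has exited.
    wait
}
```

For example, after sourcing the snippet, `run_parallel_tuning 3 4` would start three workers with 4 CPUs each, so plan for 12 CPUs in total. SQLite serializes writes, so a large number of workers may contend for the database file.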

Managing GitHub Project with Terraform

# export GITHUB_OWNER=bihealth
# export GITHUB_TOKEN=ghp_<thetoken>

# cd utils/terraform

# terraform init
# terraform import github_repository.cada-prio cada-prio
# terraform validate
# terraform fmt
# terraform plan
# terraform apply