
This is a repository for work within the Helbig Lab research team and outside collaborators.


EmadHassanin/hpo_sim_gene


HPO_Sim_Gene

Given a cohort of patients annotated with Human Phenotype Ontology (HPO) terms and their VCF files, these scripts find phenotype-genotype correlations. This can aid gene discovery, inform treatment, and improve understanding of the phenotypic variability associated with a gene. First, a similarity score based on HPO terms is computed for every pair of patients. Next, genes with potentially causative variants in multiple patients are extracted from the VCF, and the median similarity score among the patients carrying variants in each gene is calculated. Permutation analysis of these median similarity scores then assigns a p-value to each gene. A lower p-value indicates a potentially causative gene.
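The permutation step can be illustrated with a minimal sketch in base R. It is not the repository's implementation: the toy similarity matrix, the patient IDs, and the helper names `median_pair_sim` and `perm_p_value` are assumptions made for illustration, and the actual scripts may compute similarity and the null distribution differently.

```r
# Toy symmetric patient-by-patient similarity matrix (values are made up).
set.seed(1)
n_patients <- 20
patients <- paste0("P", seq_len(n_patients))
sim <- matrix(runif(n_patients^2), n_patients, n_patients,
              dimnames = list(patients, patients))
sim <- (sim + t(sim)) / 2

# Hypothetical carriers of variants in one gene, made artificially similar.
carriers <- c("P1", "P2", "P3")
sim[carriers, carriers] <- 0.99
diag(sim) <- NA  # self-similarity is never used

# Median similarity over all pairs within a group of patients.
median_pair_sim <- function(sim, ids) {
  sub <- sim[ids, ids, drop = FALSE]
  median(sub[upper.tri(sub)])
}

# Permutation p-value: how often does a random group of the same size
# reach a median similarity at least as high as the observed one?
perm_p_value <- function(sim, ids, n_perm = 1000) {
  observed <- median_pair_sim(sim, ids)
  null <- replicate(
    n_perm,
    median_pair_sim(sim, sample(rownames(sim), length(ids)))
  )
  (sum(null >= observed) + 1) / (n_perm + 1)
}

p <- perm_p_value(sim, carriers)
```

Adding 1 to both the numerator and the denominator keeps the estimated p-value above zero, which is a common convention for permutation tests and is a choice of this sketch.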

Requirements:

  • R with packages tidyverse and memoise.

Steps to Run:

  • Clone the repository and modify the config file.

  • In the config file, set the field output_dir. This is the directory where the output files will be written.

  • Source the R file, making sure the config file is in the same working directory.
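Before sourcing, it can help to confirm that everything the steps above rely on is in place. The following is a small sketch of such a check, not part of the repository: the helper `check_setup` is hypothetical, and the script and config file names in the usage comment are placeholders for the actual file names in your clone.

```r
# Hypothetical helper: list what is missing before sourcing the script.
check_setup <- function(script_path, config_path,
                        packages = c("tidyverse", "memoise")) {
  problems <- character(0)
  if (!file.exists(script_path)) {
    problems <- c(problems, paste("script not found:", script_path))
  }
  if (!file.exists(config_path)) {
    problems <- c(problems, paste("config file not found:", config_path))
  }
  # requireNamespace() returns FALSE when a package is not installed.
  installed <- vapply(packages, requireNamespace, logical(1), quietly = TRUE)
  if (any(!installed)) {
    problems <- c(problems,
                  paste("package not installed:", packages[!installed]))
  }
  problems
}

# Usage (both file names are placeholders):
# problems <- check_setup("path/to/script.R", "path/to/config_file")
# if (length(problems) == 0) source("path/to/script.R") else print(problems)
```

An empty result means the script, the config file, and both required packages were found.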

Running the tests

Test files are available in Files. Make sure the config file points to them as follows:

patient_phenome : Files/example_phenome.csv  

variant_file : Files/variant.csv  

These files provide the variant data and the cohort of patients annotated with HPO terms needed for a test run.

