Commit 832bb6f

Update README.md
1 parent 814388e commit 832bb6f

1 file changed: +8 -3 lines changed

README.md

+8 -3
@@ -68,7 +68,7 @@ data = {
}
```

71-
Based on this data, one may subset the diseases in order to get a list of diseases of interest, **highly recommended at the beginning of the development of a phenotypic analysis algorithm:**
71+
Based on this data, one may subset the diseases in order to get a list of diseases of interest, **highly recommended at the beginning of the development of a phenotype-based prediction algorithm:**
```python
# These lines come from the previous code
ann = dann.data
@@ -136,6 +136,8 @@ This module allows the creation of realistic patient profiles based on the disea
1. Sample symptoms using the symptom frequency.
2. From the selected symptoms, sample imprecision as a Poisson process with a certain probability of getting a less specific term using the HPO ontology.
3. Add random noise by sampling random HPO terms. The amount of random noise is also a Poisson process, while the selection of the HPO terms to include is uniform across the phenotypic abnormality subontology (excluding terms that are too uninformative).
139+
4. Sample patient age by assuming that it is close to the disease onset plus a delay of ~1 month.
140+
5. Sample patient sex taking into account the inheritance pattern of the disease.
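
As a rough illustration of the five steps above, the sketch below simulates one patient using only the Python standard library. It is not the module's implementation: the names `poisson` and `simulate_patient`, the toy term labels, the default rates, the exponentially distributed delay and the 90/10 sex weighting for X-linked recessive inheritance are all assumptions made for this example. Step 2 is read here as "replace a Poisson-distributed number of terms with their parent", which is only one possible interpretation.

```python
import math
import random

# Hypothetical toy data: placeholder labels, not real HPO identifiers.
SYMPTOM_FREQ = {"term_a": 0.9, "term_b": 0.6, "term_c": 0.3}
PARENT = {"term_a": "parent_a", "term_b": "parent_b", "term_c": "parent_c"}
NOISE_VOCABULARY = ["noise_1", "noise_2", "noise_3", "noise_4"]


def poisson(rng, lam):
    """Draw a Poisson-distributed integer (Knuth's multiplication method)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_patient(rng, freqs, parents, vocabulary, onset_years, inheritance,
                     imprecision_rate=1.0, noise_rate=1.0, delay_years=1 / 12):
    # 1. Keep each symptom with a probability equal to its frequency.
    terms = [term for term, freq in freqs.items() if rng.random() < freq]

    # 2. Imprecision (assumed reading): swap some terms for their parent term.
    n_imprecise = min(poisson(rng, imprecision_rate), len(terms))
    for i in rng.sample(range(len(terms)), n_imprecise):
        terms[i] = parents.get(terms[i], terms[i])

    # 3. Noise: add a Poisson-distributed number of uniformly chosen terms.
    terms += [rng.choice(vocabulary) for _ in range(poisson(rng, noise_rate))]

    # 4. Age: disease onset plus a delay averaging about one month (assumed exponential).
    age = onset_years + rng.expovariate(1.0 / delay_years)

    # 5. Sex: skewed towards males for X-linked recessive diseases (assumed 90/10 weights).
    if inheritance == "X-linked recessive":
        sex = rng.choices(["male", "female"], weights=[0.9, 0.1])[0]
    else:
        sex = rng.choice(["male", "female"])

    return {"terms": terms, "age": age, "sex": sex}


rng = random.Random(42)  # fixed seed so the example is reproducible
patient = simulate_patient(rng, SYMPTOM_FREQ, PARENT, NOISE_VOCABULARY,
                           onset_years=2.0, inheritance="X-linked recessive")
print(patient)
```

Passing the random generator explicitly keeps the sketch reproducible and makes each sampling step easy to test in isolation.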

In order to sample 5 patients from a disease, run the following lines:
```python
@@ -269,7 +271,10 @@ data = db.generate_list_of_dicts()

# Interesting publications

272-
## Disease prediction from phenotypes only
274+
## Relevant publications for disease prediction based on phenotypes
275+
Many publications explore predicting a particular rare disease from a patient's phenotype. The phenotype analysis component, which may or may not be the central aspect of a publication, largely falls into two categories: ontology-based and representation-based algorithms. Ontology-based algorithms define a logic by which distances between terms are calculated from their position within the ontology and from how common each term is among the rare diseases (via the information content: *IC = -log(p)*). Representation-based algorithms compute term representations from embeddings calculated over a specific dataset. Ideally, the dataset should consist of individual (anonymous) patients in order to capture the most granular information. In the absence of such data, it is recommended to simulate such a dataset.
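
To make the information content formula concrete, here is a minimal sketch on a toy ontology. The term names, the disease annotations and the helper functions (`ancestors`, `information_content`, `resnik`) are hypothetical and exist only for this example. It assumes that *p* is the fraction of diseases annotated with a term or any of its descendants, and it uses Resnik similarity (the IC of the most informative common ancestor) as one well-known ontology-based measure, chosen here purely for illustration.

```python
import math

# Hypothetical toy ontology: each term maps to the set of its direct parents.
PARENTS = {
    "root": set(),
    "neuro": {"root"},
    "seizure": {"neuro"},
    "ataxia": {"neuro"},
    "cardiac": {"root"},
}

# Hypothetical disease annotations.
DISEASES = {
    "d1": {"seizure"},
    "d2": {"seizure", "cardiac"},
    "d3": {"ataxia"},
    "d4": {"cardiac"},
}


def ancestors(term, parents):
    """Return the term itself plus all of its ancestors."""
    seen, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def information_content(parents, diseases):
    """IC = -log(p), with p the fraction of diseases annotated with the term
    or one of its descendants (annotations are propagated to ancestors)."""
    counts = {term: 0 for term in parents}
    for terms in diseases.values():
        implied = set().union(*(ancestors(t, parents) for t in terms))
        for term in implied:
            counts[term] += 1
    n = len(diseases)
    # log(n / c) equals -log(c / n); terms never observed are left out.
    return {term: math.log(n / c) for term, c in counts.items() if c > 0}


def resnik(a, b, parents, ic):
    """Resnik similarity: IC of the most informative common ancestor."""
    common = ancestors(a, parents) & ancestors(b, parents)
    return max(ic.get(term, 0.0) for term in common)


ic = information_content(PARENTS, DISEASES)
print(ic["seizure"], ic["ataxia"])
print(resnik("seizure", "ataxia", PARENTS, ic))
print(resnik("seizure", "cardiac", PARENTS, ic))
```

In this toy setup, rarer terms receive a higher IC, and two terms that only share the root have a similarity of zero because the root annotates every disease.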
276+
277+
### Disease prediction from phenotypes only
- Disease Prediction via Graph Neural Networks, **2021**, Sun et al. https://pubmed.ncbi.nlm.nih.gov/32749976/
- Graph Neural Network-Based Diagnosis Prediction, **2020**, Li et al. https://pubmed.ncbi.nlm.nih.gov/32783631/
- Phrank measures phenotype sets similarity to greatly improve Mendelian diagnostic disease prioritization, **2019**, Jagadeesh et al. https://www.nature.com/articles/s41436-018-0072-y
@@ -281,7 +286,7 @@ data = db.generate_list_of_dicts()
- Bayesian ontology querying for accurate and noise-tolerant semantic searches, **2012**, Bauer et al. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3463114/
- Clinical Diagnostics in Human Genetics with Semantic Similarity Searches in Ontologies, **2009**, Köhler et al. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2756558/

284-
## Disease prediction from phenotypes and genetic data
289+
### Disease prediction from phenotypes and genetic data
- OligoPVP: Phenotype-driven analysis of individual genomic information to prioritize oligogenic disease variants, **2018**, Boudellioua et al. https://pubmed.ncbi.nlm.nih.gov/30279426/
- Phenotype-driven strategies for exome prioritization of human Mendelian disease genes, **2015**, Smedley et al. https://pubmed.ncbi.nlm.nih.gov/26229552/
- Effective diagnosis of genetic disease by computational phenotype analysis of the disease-associated genome, **2014**, Zemojtel et al. https://pubmed.ncbi.nlm.nih.gov/25186178/
