diff --git a/docs/404.html b/docs/404.html
index ea6d270..ec6e6e4 100644
--- a/docs/404.html
+++ b/docs/404.html
@@ -23,7 +23,7 @@
-
+
@@ -148,6 +148,14 @@
  • 2.5 Bar plot of HPO in rare disease databases results
  • 2.6 Bar plot of PubTator results
  • +
  • 3 Curation of high evidence genes +
  • +
  • 4 Additional analyses +
  • References
  •
diff --git a/docs/KidneyGenetics_documentation.docx b/docs/KidneyGenetics_documentation.docx
index 04d6096..ccd1b9c 100644
Binary files a/docs/KidneyGenetics_documentation.docx and b/docs/KidneyGenetics_documentation.docx differ
diff --git a/docs/KidneyGenetics_documentation.epub b/docs/KidneyGenetics_documentation.epub
index 17cd9ad..ef60523 100644
Binary files a/docs/KidneyGenetics_documentation.epub and b/docs/KidneyGenetics_documentation.epub differ
diff --git a/docs/KidneyGenetics_documentation.pdf b/docs/KidneyGenetics_documentation.pdf
index febc62e..0b6381d 100644
Binary files a/docs/KidneyGenetics_documentation.pdf and b/docs/KidneyGenetics_documentation.pdf differ
diff --git a/docs/additional-analyses.html b/docs/additional-analyses.html
new file mode 100644
index 0000000..477f7fb
--- /dev/null
+++ b/docs/additional-analyses.html
@@ -0,0 +1,253 @@
+ Chapter | 4 Additional analyses | The Kidney-Genetics Documentation
    + +
    + +
    + +
    +
    + + +
    +
    + +
    +
    +

    Chapter | 4 Additional analyses

    +
    +

    4.1 Diagnostic panels content overlap

    +
    +

    Below you can see a bar plot of the diagnostic panels content overlap.

    +
    +

    We used ten common diagnostic panels that can be ordered for kidney disease analysis and extracted the screened genes from them. +Here we show the overlap of the genes in the different panels.
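
The overlap described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's actual pipeline: the panel names and gene lists are hypothetical toy data, and the helper names `genes_per_panel_count` and `pairwise_overlap` are introduced only for this example.

```python
from collections import Counter
from itertools import combinations


def genes_per_panel_count(panels):
    """Return, for each gene, the number of panels that list it."""
    counts = Counter()
    for genes in panels.values():
        # set() guards against a gene being listed twice in one panel
        counts.update(set(genes))
    return counts


def pairwise_overlap(panels):
    """Return the number of shared genes for every pair of panels."""
    return {
        (a, b): len(set(panels[a]) & set(panels[b]))
        for a, b in combinations(sorted(panels), 2)
    }


# Hypothetical toy data; real panels would hold the extracted gene lists.
panels = {
    "panel_a": ["PKD1", "PKD2", "NPHS1"],
    "panel_b": ["PKD1", "NPHS1", "COL4A5"],
    "panel_c": ["PKD1", "UMOD"],
}

counts = genes_per_panel_count(panels)
# Histogram: how many genes occur in exactly 1, 2, 3, ... panels.
histogram = Counter(counts.values())
overlaps = pairwise_overlap(panels)
```

In this sketch, `histogram` corresponds to the bar heights of a plot with the number of panels on the x axis, and `overlaps` gives the shared gene count for each pair of panels.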

    +
    + + +
    +
    +
    + +
    +
    +
    + + +
    +
diff --git a/docs/analyses-plots.html b/docs/analyses-plots.html
index ee138af..1847d75 100644
--- a/docs/analyses-plots.html
+++ b/docs/analyses-plots.html
@@ -23,7 +23,7 @@
-
+
@@ -31,7 +31,7 @@
-
+
@@ -148,6 +148,14 @@
  • 2.5 Bar plot of HPO in rare disease databases results
  • 2.6 Bar plot of PubTator results
  • +
  • 3 Curation of high evidence genes +
  • +
  • 4 Additional analyses +
  • References
  • @@ -197,8 +205,8 @@

    2.2 Bar plot of PanelApp results<
  • For example 38 Genes occurred in just one panel and 2 Genes were present in all thirty different panels.
  • -
    - +
    +

    2.3 Bar plot of Literature results

    @@ -215,8 +223,8 @@

    2.3 Bar plot of Literature result
  • For example 331 Genes occurred in just one of the publications and 1 Gene was present in all 13 different publications.
  • -
    - +
    +

    2.4 Bar plot of Diagnostic panels results

    @@ -231,8 +239,8 @@

    2.4 Bar plot of Diagnostic panels
  • For example 371 Genes occurred in just one panel and 56 Genes were present in all ten different panels.
  • -
    - +
    +

    2.5 Bar plot of HPO in rare disease databases results

    @@ -249,8 +257,8 @@

    2.5 Bar plot of HPO in rare disea
  • For example 652 Genes occurred in just one database and 1 Gene was present in all eight different databases.
  • -
    - +
    +

    2.6 Bar plot of PubTator results

    @@ -265,8 +273,8 @@

    2.6 Bar plot of PubTator results<
  • For example 914 Genes occurred in just one publication and 1 Gene was present in 1221 different publications.
  • -
    - +
    +

    @@ -276,7 +284,7 @@

    2.6 Bar plot of PubTator results<
-
+
diff --git a/docs/analyses-tables.html b/docs/analyses-tables.html
index 00f64e9..960b4b8 100644
--- a/docs/analyses-tables.html
+++ b/docs/analyses-tables.html
@@ -23,7 +23,7 @@
-
+
@@ -148,6 +148,14 @@
  • 2.5 Bar plot of HPO in rare disease databases results
  • 2.6 Bar plot of PubTator results
  • +
  • 3 Curation of high evidence genes +
  • +
  • 4 Additional analyses +
  • References
  • @@ -172,38 +180,38 @@

    Chapter | 1 Analyses result table

    1.1 Main table: Merged analyses sources

    This table shows the merged results of all analyses files as a wide table with summarized information.

    -
    - +
    +

    1.2 Result table: PanelApp

    This table shows results of the first analysis searching kidney disease associated genes from the PanelApp project in the UK and Australia.

    -
    - +
    +

    1.3 Result table: Literature

    This table shows results of the second analysis searching kidney disease associated genes from various publications.

    -
    - +
    +

    1.4 Result table: Diagnostic panels

    This table shows results of the third analysis searching kidney disease associated genes from clinical diagnostic panels for kidney disease.

    -
    - +
    +

    1.5 Result table: HPO in rare disease databases

    This table shows results of the fourth analysis searching kidney disease associated genes from a Human Phenotype Ontology (HPO)-based search in rare disease databases (OMIM, Orphanet).

    -
    - +
    +

    1.6 Result table: PubTator

    This table shows results of the fifth analysis searching kidney disease associated genes from a PubTator API-based automated literature extraction from PubMed.

    -
    - +
    +
diff --git a/docs/index.html b/docs/index.html
index 52b2c5c..3086239 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -23,7 +23,7 @@
-
+
@@ -148,6 +148,14 @@
  • 2.5 Bar plot of HPO in rare disease databases results
  • 2.6 Bar plot of PubTator results
  • +
  • 3 Curation of high evidence genes +
  • +
  • 4 Additional analyses +
  • References
  • @@ -169,7 +177,7 @@

    Preface

diff --git a/docs/manual-curation.html b/docs/manual-curation.html
new file mode 100644
index 0000000..7138825
--- /dev/null
+++ b/docs/manual-curation.html
@@ -0,0 +1,249 @@
+ Chapter | 3 Curation of high evidence genes | The Kidney-Genetics Documentation
    + + + +
    +
    + + +
    +
    + +
    +
    +

    Chapter | 3 Curation of high evidence genes

    +
    +

    3.1 Table of high evidence genes

    +

    This table shows the annotated high evidence genes.

    +
    + + +
    +
    +
    + +
    +
    +
    + + +
    +
diff --git a/docs/reference-keys.txt b/docs/reference-keys.txt
index baf3ec0..3708e67 100644
--- a/docs/reference-keys.txt
+++ b/docs/reference-keys.txt
@@ -12,3 +12,7 @@ bar-plot-of-literature-results
 bar-plot-of-diagnostic-panels-results
 bar-plot-of-hpo-in-rare-disease-databases-results
 bar-plot-of-pubtator-results
+manual-curation
+table-of-high-evidence-genes
+additional-analyses
+diagnostic-panels-content-overlap
diff --git a/docs/references.html b/docs/references.html
index 252b8db..331902d 100644
--- a/docs/references.html
+++ b/docs/references.html
@@ -23,14 +23,14 @@
-
+
-
+
@@ -148,6 +148,14 @@
  • 2.5 Bar plot of HPO in rare disease databases results
  • 2.6 Bar plot of PubTator results
  • +
  • 3 Curation of high evidence genes +
  • +
  • 4 Additional analyses +
  • References
  • @@ -189,7 +197,7 @@

    References

    - + diff --git a/docs/search_index.json b/docs/search_index.json index 6c1e0e7..70cf25c 100644 --- a/docs/search_index.json +++ b/docs/search_index.json @@ -1 +1 @@ -[["index.html", "The Kidney-Genetics Documentation Preface Objective Methods Results Conclusion Outlook", " The Kidney-Genetics Documentation Bernt Popp, Nina Rank, Constantin Wolff, Jan Halbritter 2023-10-05 Preface This documentation is intended to describe the Kidney-Genetics project. Objective How can we address the lack of a unified and standardized database of kidney disease-associated genes, which hampers diagnosis, treatment, and research comparability in the field of kidney diseases? Genetic insights are becoming increasingly influential in the understanding and treatment of various kidney diseases (KD). Hundreds of genes associated with monogenic kidney disease have been identified, providing valuable insights into their diagnosis, management, and monitoring. However, the lack of a unified and standardized database of genes assigned to kidney diseases has led to diagnostic blind spots and comparability issues among current studies of kidney genetics. To address this gap, we created the “Kidney-Genetics” a regularly updated, automated and publicly accessible database which aims to provide a comprehensive list of all relevant genes associated with kidney disease. Key issues: Create a unified and standardized database of kidney disease-associated genes and provide a valuable resource for the diagnosis, treatment, and monitoring of those diseases Allow clinicians and researchers to gain a deeper understanding of the genetic factors underlying different KDs Compile, organize and curate important information on the genes to the identify novel candidate genes and genetic variants associated with KDs Group and sort the genes into different categories, for example into phenotypic groups, the onset, syndromic, etc. 
Establish genotype-phenotype correlations that can be used to assign multiple clinical entities to a single gene in order to improve understanding and treatment choices The information can be used to develop personalized treatment strategies and interventions, leading to more effective and targeted therapies for individuals with KD Researchers can freely access “Kidney-Genetics” ensuring consistency and comparability across different research projects, which can accelerate scientific progress, foster collaborations, and facilitate the development of new insights and approaches The scientific literature highlights the need for such a database and emphasizes the importance of genetic research in kidney disease (e.g. (Boulogne et al., 2023)). In summary, our research question and its approach have the potential to provide a deeper scientific understanding of KD genetics, improve diagnostic accuracy, guide treatment selection, advance precision medicine, and facilitate research collaboration. The establishment of the “Kidney-Genetics” database addresses an important gap in the field and provides a valuable resource for researchers, clinicians, and patients involved in the discovery and treatment of KD. Methods To create a thorough and standardized database of kidney-related genes, we employed the following methods and compiled kidney disease-associated gene information from various sources: Utilized data from Genomics England and Australia PanelApp (Martin et al., 2019) Conducted a comprehensive literature review of published gene lists Collected information from clinical diagnostic panels for kidney disease Performed a Human Phenotype Ontology (HPO)-based (Köhler et al., 2021)) search in rare disease databases (OMIM) Employed a PubTator (Wei et al., 2013) API-based automated literature extraction from PubMed We also developed an evidence-scoring system to differentiate highly confirmed disease genes from candidate genes. 
We defined the presence of a certain gene in 3 or more of the 5 resources as highly evident genes. These genes were then manually curated according to predetermined criteria or, in the case of existing ClinGen curation, their data and scores were used. Genes with a score of 2 or less were accordingly more likely to be classified as candidate genes. Furthermore, we grouped all genes into different categories to later match them in a genotype-phenotype correlation. To get a more transparent and thus more comprehensive understanding of our several evidence source “pillars”, we listed our different steps below and attached a flowchart for better visualization. We retrieved all kidney disease related panels from both PanelApp UK and PanelApp Australia, meaning all panels that include “renal” or “kidney” in its name. That included xxx different lists. The access date was the xxx. We identified Genes associated with kidney disease in a systematic Literature search using the following search query: (1) “Kidney”[Mesh] OR “Kidney Diseases”[Mesh] OR kidney OR renal AND (2) “Genetic Structures”[Mesh] OR “Genes”[Mesh] OR genetic test OR gene panel OR gene panels OR multigene panel OR targeted panel* we then screened for published lists and got xxx lists from date to date xxx. We used ten common diagnostic panels that can be purchased for genome analysis and extracted the screened genes from them. Those included following panels: Centogene nephrology Cegat kidney diseases Preventiongenetics etc. We used common databases (e.g. OMIM) for rare diseases and screened them for kidney disease associated Genes from a Human Phenotype Ontology (HPO) based search query. The most comprehensive HPO term used was “Abnormality of the upper urinary tract” (HP:0010935) and included all subgroup terms. We deliberately chose these to be somewhat broader in order to fully include all relevant kidney diseases such as CAKUT, among others. 
We retrieved all kidney disease associated genes from a PubTator API-based automated literature extraction of publications available on PubMed. Kidney-Genetics Flowchart (#fig:curation_flow_diagram)Curation process flow diagram Results The “Kidney-Genetics” database currently contains detailed information on 3001 kidney-associated genes with detailed annotations on gene function, kidney phenotype, incidence, possible syndromic disease expression and genetic variation. To automatically group the genes, we will present the results of phenotypic and functional clustering. The number of genes extracted from the five analyzed sources of information is as follows: (1) 550, (2) 822, (3) 936, (4) 791, and (5) 2133 Notably, 437 genes (14.6%) of the total 3001 genes are present in three or more of the analyzed information sources, thus meeting our evidence criteria, indicating high confidence and their potential for diagnostic use. Of these high evidence genes, 423 (96.8%) are present in at least one, and 56 (12.8%) are present in all 10 comprehensive diagnostic laboratory panels. To ensure currency, Kidney-Genetics will be updated regularly and automatically at XXX week intervals. We will also provide phenotypic and functional clustering results to facilitate gene grouping. Conclusion Kidney-Genetics is a comprehensive, free and publicly accessible database that can be used by researchers to analyze genomic data related to KDs. The database will be routinely updated using an automated system and standardized pipeline to ensure that it is always up-to-date with the latest kidney research and diagnostics. By utilizing Kidney-Genetics, clinicians, geneticists, and researchers can examine genomic data and improve their understanding of the genetic components of diverse KDs. The code and results are completely available on GitHub. A standardized pipeline and automated system keep our database on the cutting edge of kidney research and diagnostics. 
Screening efforts toward manual curation (such as through the ClinGen initiative) and assignment of diagnostic genes to nephrologic disease groups (e.g., syndromic vs. isolated; adult vs. pediatric; cystic, nephrotic, etc.) are currently in the development process and our goals for the near future. Outlook Future goals include the further manual curation of the high evident genes to acquire a more accurate individual assessment of each gene. For this purpose, we have developed a standardized curation process based on the ClinGen criteria, as previously discussed in the methods section. Furthermore, diagnostic genes will be assigned to certain defined nephrological disease groups, in order to obtain a phenotype-genotype correlation and gain a better clinical understanding. References Boulogne, F., Claus, L. R., Wiersma, H., Oelen, R., Schukking, F., Klein, N. de, Li, S., Westra, H.-J., Zwaag, B. van der, Reekum, F. van, Genomics England Research Consortium, Sierks, D., Schönauer, R., Li, Z., Bijlsma, E. K., Bos, W. J. W., Halbritter, J., Knoers, N. V. A. M., Besse, W., … Eerde, A. M. van. (2023). KidneyNetwork: Using kidney-derived gene expression data to predict and prioritize novel genes involved in kidney disease. European Journal of Human Genetics: EJHG. https://doi.org/10.1038/s41431-023-01296-x Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., Callahan, T. J., Chute, C. G., Est, J. L., Galer, P. D., Ganesan, S., Griese, M., Haimel, M., Pazmandi, J., Hanauer, M., … Robinson, P. N. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043 Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. 
H., Baple, E., Tucci, A., Brittain, H., De Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D., … McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nature Genetics, 51(11), 1560–1565. https://doi.org/10.1038/s41588-019-0528-2 Wei, C.-H., Kao, H.-Y., & Lu, Z. (2013). PubTator: A web-based text mining tool for assisting biocuration. Nucleic Acids Research, 41(W1), W518–W522. https://doi.org/10.1093/nar/gkt441 "],["analyses-tables.html", "Chapter | 1 Analyses result tables 1.1 Main table: Merged analyses sources 1.2 Result table: PanelApp 1.3 Result table: Literature 1.4 Result table: Diagnostic panels 1.5 Result table: HPO in rare disease databases 1.6 Result table: PubTator", " Chapter | 1 Analyses result tables 1.1 Main table: Merged analyses sources This table shows the merged results of all analyses files as a wide table with summarized information. 1.2 Result table: PanelApp This table shows results of the first analysis searching kidney disease associated genes from the PanelApp project in the UK and Australia. 1.3 Result table: Literature This table shows results of the second analysis searching kidney disease associated genes from various publications. 1.4 Result table: Diagnostic panels This table shows results of the third analysis searching kidney disease associated genes from clinical diagnostic panels for kidney disease. 1.5 Result table: HPO in rare disease databases This table shows results of the fourth analysis searching kidney disease associated genes from a Human Phenotype Ontology (HPO)-based search in rare disease databases (OMIM, Orphanet). 1.6 Result table: PubTator This table shows results of the fifth analysis searching kidney disease associated genes from a PubTator API-based automated literature extraction from PubMed. 
"],["analyses-plots.html", "Chapter | 2 Analyses plots 2.1 UpSet plot of merged analyses sources 2.2 Bar plot of PanelApp results 2.3 Bar plot of Literature results 2.4 Bar plot of Diagnostic panels results 2.5 Bar plot of HPO in rare disease databases results 2.6 Bar plot of PubTator results", " Chapter | 2 Analyses plots 2.1 UpSet plot of merged analyses sources Below you can see a UpSet plot of the merged analyses. In the lower left corner you can see the number of Genes originating from each of the different resources, after that resources are sorted on the right side. UpSet plots generally represent the intersections of a data set in the form of a matrix, as can be seen in the graph below. Each column corresponds to a set, and the bar graphs at the top show the size of the set. Each row corresponds to a possible intersection: the dark filled circles show which set is part of an intersection. For example, the first column shows that most of the genes found in only one of the five sources are derived from the PubTator query, and in the third column you can see that 177 Genes are found in all five sources. 2.2 Bar plot of PanelApp results Below you can see a Bar plot of the PanelApp analysis. We retrieved all kidney disease related panels from both PanelApp UK and PanelApp Australia, meaning all panels that include “renal” or “kidney” in its name. The y axis shows the number of Genes in the different panels, which is also visualized by the height of the bars. The x axis displays the number of panels (source_count), i.e. in how many different panels a single Gene occurred. For example 38 Genes occurred in just one panel and 2 Genes were present in all thirty different panels. 2.3 Bar plot of Literature results Below you can see a Bar plot of the Literature analysis. 
We identified Genes associated with kidney disease in a systematic Literature search using the following search query: (1) “Kidney”[Mesh] OR “Kidney Diseases”[Mesh] OR kidney OR renal AND (2) “Genetic Structures”[Mesh] OR “Genes”[Mesh] OR genetic test OR gene panel OR gene panels OR multigene panel OR targeted panel* The y axis shows the number of Genes in different publications, which is also visualized by the height of the bars. The x axis displays the number of publications (source_count), i.e. in how many different publications a single Gene occurred. For example 331 Genes occurred in just one of the publications and 1 Gene was present in all 13 different publications. 2.4 Bar plot of Diagnostic panels results Below you can see a Bar plot of the Diagnostic panels analysis. We used ten common diagnostic panels that can be purchased for genome analysis and extracted the screened Genes from them. The y axis shows the number of Genes in the different diagnostic panels, which is also visualized by the height of the bars. The x axis displays the number of panels (source_count), i.e. in how many different panels a single Gene occurred. For example 371 Genes occurred in just one panel and 56 Genes were present in all ten different panels. 2.5 Bar plot of HPO in rare disease databases results Below you can see a Bar plot of the HPO-term based query in rare disease databases (OMIM, Orphanet). We used eight common databases for rare diseases and screened them for kidney disease associated Genes from a Human Phenotype Ontology (HPO) based search query. The most comprehensive HPO term used was “Abnormality of the upper urinary tract” (HP:0010935) and included all sub group terms. We deliberately chose these to be somewhat broader in order to fully include all relevant kidney diseases such as CAKUT, among others. The y axis shows the number of Genes in the different rare disease databases, which is also visualized by the height of the bars. 
The x axis displays the number of databases (source_count), i.e. in how many different databases a single Gene occurred. For example 652 Genes occurred in just one database and 1 Gene was present in all eight different databases. 2.6 Bar plot of PubTator results Below you can see a Bar plot of the PubTator analysis. We retrieved all kidney disease associated Genes from a PubTator API-based automated literature extraction of publications available on PubMed. The y axis shows the number of Genes in the different publications, which is also visualized by the height of the bars. The x axis displays the number of publications (source_count), i.e. in how many different publications a single Gene occurred. For example 914 Genes occurred in just one publication and 1 Gene was present in 1221 different publications. "],["references.html", "References", " References Boulogne, F., Claus, L. R., Wiersma, H., Oelen, R., Schukking, F., Klein, N. de, Li, S., Westra, H.-J., Zwaag, B. van der, Reekum, F. van, Genomics England Research Consortium, Sierks, D., Schönauer, R., Li, Z., Bijlsma, E. K., Bos, W. J. W., Halbritter, J., Knoers, N. V. A. M., Besse, W., … Eerde, A. M. van. (2023). KidneyNetwork: Using kidney-derived gene expression data to predict and prioritize novel genes involved in kidney disease. European Journal of Human Genetics: EJHG. https://doi.org/10.1038/s41431-023-01296-x Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., Callahan, T. J., Chute, C. G., Est, J. L., Galer, P. D., Ganesan, S., Griese, M., Haimel, M., Pazmandi, J., Hanauer, M., … Robinson, P. N. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043 Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. 
H., Baple, E., Tucci, A., Brittain, H., De Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D., … McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nature Genetics, 51(11), 1560–1565. https://doi.org/10.1038/s41588-019-0528-2 Wei, C.-H., Kao, H.-Y., & Lu, Z. (2013). PubTator: A web-based text mining tool for assisting biocuration. Nucleic Acids Research, 41(W1), W518–W522. https://doi.org/10.1093/nar/gkt441 "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] +[["index.html", "The Kidney-Genetics Documentation Preface Objective Methods Results Conclusion Outlook", " The Kidney-Genetics Documentation Bernt Popp, Nina Rank, Constantin Wolff, Jan Halbritter 2023-10-11 Preface This documentation is intended to describe the Kidney-Genetics project. Objective How can we address the lack of a unified and standardized database of kidney disease-associated genes, which hampers diagnosis, treatment, and research comparability in the field of kidney diseases? Genetic insights are becoming increasingly influential in the understanding and treatment of various kidney diseases (KD). Hundreds of genes associated with monogenic kidney disease have been identified, providing valuable insights into their diagnosis, management, and monitoring. However, the lack of a unified and standardized database of genes assigned to kidney diseases has led to diagnostic blind spots and comparability issues among current studies of kidney genetics. To address this gap, we created the “Kidney-Genetics” a regularly updated, automated and publicly accessible database which aims to provide a comprehensive list of all relevant genes associated with kidney disease. 
Key issues: Create a unified and standardized database of kidney disease-associated genes and provide a valuable resource for the diagnosis, treatment, and monitoring of those diseases Allow clinicians and researchers to gain a deeper understanding of the genetic factors underlying different KDs Compile, organize and curate important information on the genes to the identify novel candidate genes and genetic variants associated with KDs Group and sort the genes into different categories, for example into phenotypic groups, the onset, syndromic, etc. Establish genotype-phenotype correlations that can be used to assign multiple clinical entities to a single gene in order to improve understanding and treatment choices The information can be used to develop personalized treatment strategies and interventions, leading to more effective and targeted therapies for individuals with KD Researchers can freely access “Kidney-Genetics” ensuring consistency and comparability across different research projects, which can accelerate scientific progress, foster collaborations, and facilitate the development of new insights and approaches The scientific literature highlights the need for such a database and emphasizes the importance of genetic research in kidney disease (e.g. (Boulogne et al., 2023)). In summary, our research question and its approach have the potential to provide a deeper scientific understanding of KD genetics, improve diagnostic accuracy, guide treatment selection, advance precision medicine, and facilitate research collaboration. The establishment of the “Kidney-Genetics” database addresses an important gap in the field and provides a valuable resource for researchers, clinicians, and patients involved in the discovery and treatment of KD. 
Methods To create a thorough and standardized database of kidney-related genes, we employed the following methods and compiled kidney disease-associated gene information from various sources: Utilized data from Genomics England and Australia PanelApp (Martin et al., 2019) Conducted a comprehensive literature review of published gene lists Collected information from clinical diagnostic panels for kidney disease Performed a Human Phenotype Ontology (HPO)-based (Köhler et al., 2021)) search in rare disease databases (OMIM) Employed a PubTator (Wei et al., 2013) API-based automated literature extraction from PubMed We also developed an evidence-scoring system to differentiate highly confirmed disease genes from candidate genes. We defined the presence of a certain gene in 3 or more of the 5 resources as highly evident genes. These genes were then manually curated according to predetermined criteria or, in the case of existing ClinGen curation, their data and scores were used. Genes with a score of 2 or less were accordingly more likely to be classified as candidate genes. Furthermore, we grouped all genes into different categories to later match them in a genotype-phenotype correlation. To get a more transparent and thus more comprehensive understanding of our several evidence source “pillars”, we listed our different steps below and attached a flowchart for better visualization. We retrieved all kidney disease related panels from both PanelApp UK and PanelApp Australia, meaning all panels that include “renal” or “kidney” in its name. That included xxx different lists. The access date was the xxx. 
We identified Genes associated with kidney disease in a systematic Literature search using the following search query: (1) “Kidney”[Mesh] OR “Kidney Diseases”[Mesh] OR kidney OR renal AND (2) “Genetic Structures”[Mesh] OR “Genes”[Mesh] OR genetic test OR gene panel OR gene panels OR multigene panel OR targeted panel* we then screened for published lists and got xxx lists from date to date xxx. We used ten common diagnostic panels that can be purchased for genome analysis and extracted the screened genes from them. Those included following panels: Centogene nephrology Cegat kidney diseases Preventiongenetics etc. We used common databases (e.g. OMIM) for rare diseases and screened them for kidney disease associated Genes from a Human Phenotype Ontology (HPO) based search query. The most comprehensive HPO term used was “Abnormality of the upper urinary tract” (HP:0010935) and included all subgroup terms. We deliberately chose these to be somewhat broader in order to fully include all relevant kidney diseases such as CAKUT, among others. We retrieved all kidney disease associated genes from a PubTator API-based automated literature extraction of publications available on PubMed. Kidney-Genetics Flowchart (#fig:curation_flow_diagram)Curation process flow diagram Results The “Kidney-Genetics” database currently contains detailed information on 3001 kidney-associated genes with detailed annotations on gene function, kidney phenotype, incidence, possible syndromic disease expression and genetic variation. To automatically group the genes, we will present the results of phenotypic and functional clustering. The number of genes extracted from the five analyzed sources of information is as follows: (1) 550, (2) 822, (3) 936, (4) 791, and (5) 2133 Notably, 437 genes (14.6%) of the total 3001 genes are present in three or more of the analyzed information sources, thus meeting our evidence criteria, indicating high confidence and their potential for diagnostic use. 
Of these high evidence genes, 423 (96.8%) are present in at least one, and 56 (12.8%) are present in all 10 comprehensive diagnostic laboratory panels. To ensure currency, Kidney-Genetics will be updated regularly and automatically at XXX week intervals. We will also provide phenotypic and functional clustering results to facilitate gene grouping. Conclusion Kidney-Genetics is a comprehensive, free and publicly accessible database that can be used by researchers to analyze genomic data related to KDs. The database will be routinely updated using an automated system and standardized pipeline to ensure that it is always up-to-date with the latest kidney research and diagnostics. By utilizing Kidney-Genetics, clinicians, geneticists, and researchers can examine genomic data and improve their understanding of the genetic components of diverse KDs. The code and results are completely available on GitHub. A standardized pipeline and automated system keep our database on the cutting edge of kidney research and diagnostics. Screening efforts toward manual curation (such as through the ClinGen initiative) and assignment of diagnostic genes to nephrologic disease groups (e.g., syndromic vs. isolated; adult vs. pediatric; cystic, nephrotic, etc.) are currently in the development process and our goals for the near future. Outlook Future goals include the further manual curation of the high evident genes to acquire a more accurate individual assessment of each gene. For this purpose, we have developed a standardized curation process based on the ClinGen criteria, as previously discussed in the methods section. Furthermore, diagnostic genes will be assigned to certain defined nephrological disease groups, in order to obtain a phenotype-genotype correlation and gain a better clinical understanding. References Boulogne, F., Claus, L. R., Wiersma, H., Oelen, R., Schukking, F., Klein, N. de, Li, S., Westra, H.-J., Zwaag, B. van der, Reekum, F. 
van, Genomics England Research Consortium, Sierks, D., Schönauer, R., Li, Z., Bijlsma, E. K., Bos, W. J. W., Halbritter, J., Knoers, N. V. A. M., Besse, W., … Eerde, A. M. van. (2023). KidneyNetwork: Using kidney-derived gene expression data to predict and prioritize novel genes involved in kidney disease. European Journal of Human Genetics: EJHG. https://doi.org/10.1038/s41431-023-01296-x Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., Callahan, T. J., Chute, C. G., Est, J. L., Galer, P. D., Ganesan, S., Griese, M., Haimel, M., Pazmandi, J., Hanauer, M., … Robinson, P. N. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043 Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. H., Baple, E., Tucci, A., Brittain, H., De Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D., … McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nature Genetics, 51(11), 1560–1565. https://doi.org/10.1038/s41588-019-0528-2 Wei, C.-H., Kao, H.-Y., & Lu, Z. (2013). PubTator: A web-based text mining tool for assisting biocuration. Nucleic Acids Research, 41(W1), W518–W522. https://doi.org/10.1093/nar/gkt441 "],["analyses-tables.html", "Chapter | 1 Analyses result tables 1.1 Main table: Merged analyses sources 1.2 Result table: PanelApp 1.3 Result table: Literature 1.4 Result table: Diagnostic panels 1.5 Result table: HPO in rare disease databases 1.6 Result table: PubTator", " Chapter | 1 Analyses result tables 1.1 Main table: Merged analyses sources This table shows the merged results of all analyses files as a wide table with summarized information. 
1.2 Result table: PanelApp This table shows the results of the first analysis, which searched for kidney disease associated genes in the PanelApp project in the UK and Australia. 1.3 Result table: Literature This table shows the results of the second analysis, which searched for kidney disease associated genes in various publications. 1.4 Result table: Diagnostic panels This table shows the results of the third analysis, which searched for kidney disease associated genes in clinical diagnostic panels for kidney disease. 1.5 Result table: HPO in rare disease databases This table shows the results of the fourth analysis, which searched for kidney disease associated genes using a Human Phenotype Ontology (HPO)-based search in rare disease databases (OMIM, Orphanet). 1.6 Result table: PubTator This table shows the results of the fifth analysis, which searched for kidney disease associated genes using a PubTator API-based automated literature extraction from PubMed. "],["analyses-plots.html", "Chapter | 2 Analyses plots 2.1 UpSet plot of merged analyses sources 2.2 Bar plot of PanelApp results 2.3 Bar plot of Literature results 2.4 Bar plot of Diagnostic panels results 2.5 Bar plot of HPO in rare disease databases results 2.6 Bar plot of PubTator results", " Chapter | 2 Analyses plots 2.1 UpSet plot of merged analyses sources Below you can see an UpSet plot of the merged analyses. In the lower left corner you can see the number of genes originating from each of the different resources; the resources on the right side are sorted by this number. UpSet plots generally represent the intersections of a data set in the form of a matrix, as can be seen in the graph below. Each row corresponds to a set (here, one of the sources), and the bars in the lower left show the size of each set. Each column corresponds to a possible intersection: the dark filled circles show which sets are part of that intersection, and the bar at the top of the column shows its size. 
For example, the first column shows that most of the genes found in only one of the five sources are derived from the PubTator query, and in the third column you can see that 177 genes are found in all five sources. 2.2 Bar plot of PanelApp results Below you can see a bar plot of the PanelApp analysis. We retrieved all kidney disease related panels from both PanelApp UK and PanelApp Australia, meaning all panels that include “renal” or “kidney” in their names. The y axis shows the number of genes, which is also visualized by the height of the bars. The x axis displays the number of panels (source_count), i.e. in how many different panels a single gene occurred. For example, 38 genes occurred in just one panel and 2 genes were present in all thirty different panels. 2.3 Bar plot of Literature results Below you can see a bar plot of the literature analysis. We identified genes associated with kidney disease in a systematic literature search using the following search query: (1) “Kidney”[Mesh] OR “Kidney Diseases”[Mesh] OR kidney OR renal AND (2) “Genetic Structures”[Mesh] OR “Genes”[Mesh] OR genetic test OR gene panel OR gene panels OR multigene panel OR targeted panel*. The y axis shows the number of genes, which is also visualized by the height of the bars. The x axis displays the number of publications (source_count), i.e. in how many different publications a single gene occurred. For example, 331 genes occurred in just one of the publications and 1 gene was present in all 13 different publications. 2.4 Bar plot of Diagnostic panels results Below you can see a bar plot of the diagnostic panels analysis. We used ten common diagnostic panels that can be purchased for genome analysis and extracted the screened genes from them. The y axis shows the number of genes, which is also visualized by the height of the bars. The x axis displays the number of panels (source_count), i.e. 
in how many different panels a single gene occurred. For example, 371 genes occurred in just one panel and 56 genes were present in all ten different panels. 2.5 Bar plot of HPO in rare disease databases results Below you can see a bar plot of the HPO-term based query in rare disease databases (OMIM, Orphanet). We used eight common databases for rare diseases and screened them for kidney disease associated genes using a Human Phenotype Ontology (HPO) based search query. The most comprehensive HPO term used was “Abnormality of the upper urinary tract” (HP:0010935), and all of its subgroup terms were included. We deliberately chose broad terms in order to fully include all relevant kidney diseases such as CAKUT, among others. The y axis shows the number of genes, which is also visualized by the height of the bars. The x axis displays the number of databases (source_count), i.e. in how many different databases a single gene occurred. For example, 652 genes occurred in just one database and 1 gene was present in all eight different databases. 2.6 Bar plot of PubTator results Below you can see a bar plot of the PubTator analysis. We retrieved all kidney disease associated genes from a PubTator API-based automated literature extraction of publications available on PubMed. The y axis shows the number of genes, which is also visualized by the height of the bars. The x axis displays the number of publications (source_count), i.e. in how many different publications a single gene occurred. For example, 914 genes occurred in just one publication and 1 gene was present in 1221 different publications. "],["manual-curation.html", "Chapter | 3 Curation of high evidence genes 3.1 Table of high evidence genes", " Chapter | 3 Curation of high evidence genes 3.1 Table of high evidence genes This table shows the annotated high evidence genes. 
"],["additional-analyses.html", "Chapter | 4 Additional analyses 4.1 Diagnostic panels content overlap", " Chapter | 4 Additional analyses 4.1 Diagnostic panels content overlap Below you can see a bar plot of the diagnostic panels content overlap. We used ten common diagnostic panels that can be ordered for kidney disease analysis and extracted the screened genes from them. Here we show the overlap of the genes in the different panels. "],["references.html", "References", " References Boulogne, F., Claus, L. R., Wiersma, H., Oelen, R., Schukking, F., Klein, N. de, Li, S., Westra, H.-J., Zwaag, B. van der, Reekum, F. van, Genomics England Research Consortium, Sierks, D., Schönauer, R., Li, Z., Bijlsma, E. K., Bos, W. J. W., Halbritter, J., Knoers, N. V. A. M., Besse, W., … Eerde, A. M. van. (2023). KidneyNetwork: Using kidney-derived gene expression data to predict and prioritize novel genes involved in kidney disease. European Journal of Human Genetics: EJHG. https://doi.org/10.1038/s41431-023-01296-x Köhler, S., Gargano, M., Matentzoglu, N., Carmody, L. C., Lewis-Smith, D., Vasilevsky, N. A., Danis, D., Balagura, G., Baynam, G., Brower, A. M., Callahan, T. J., Chute, C. G., Est, J. L., Galer, P. D., Ganesan, S., Griese, M., Haimel, M., Pazmandi, J., Hanauer, M., … Robinson, P. N. (2021). The Human Phenotype Ontology in 2021. Nucleic Acids Research, 49(D1), D1207–D1217. https://doi.org/10.1093/nar/gkaa1043 Martin, A. R., Williams, E., Foulger, R. E., Leigh, S., Daugherty, L. C., Niblock, O., Leong, I. U. S., Smith, K. R., Gerasimenko, O., Haraldsdottir, E., Thomas, E., Scott, R. H., Baple, E., Tucci, A., Brittain, H., De Burca, A., Ibañez, K., Kasperaviciute, D., Smedley, D., … McDonagh, E. M. (2019). PanelApp crowdsources expert knowledge to establish consensus diagnostic gene panels. Nature Genetics, 51(11), 1560–1565. https://doi.org/10.1038/s41588-019-0528-2 Wei, C.-H., Kao, H.-Y., & Lu, Z. (2013). 
PubTator: A web-based text mining tool for assisting biocuration. Nucleic Acids Research, 41(W1), W518–W522. https://doi.org/10.1093/nar/gkt441 "],["404.html", "Page not found", " Page not found The page you requested cannot be found (perhaps it was moved or renamed). You may want to try searching to find the page's new location, or use the table of contents to find the page you are looking for. "]] diff --git a/edit_docs/KidneyGenetics_documentation.log b/edit_docs/KidneyGenetics_documentation.log index 679b710..fddbdd2 100644 --- a/edit_docs/KidneyGenetics_documentation.log +++ b/edit_docs/KidneyGenetics_documentation.log @@ -1,4 +1,4 @@ -This is XeTeX, Version 3.141592653-2.6-0.999995 (MiKTeX 23.5) (preloaded format=xelatex 2023.6.25) 5 OCT 2023 16:33 +This is XeTeX, Version 3.141592653-2.6-0.999995 (MiKTeX 23.5) (preloaded format=xelatex 2023.6.25) 11 OCT 2023 15:13 entering extended mode restricted \write18 enabled. %&-line parsing enabled. @@ -999,21 +999,22 @@ phic file (type pdf) File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-6-1.pdf Gra phic file (type pdf) +[6] File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-7-1.pdf Gra phic file (type pdf) -[6] File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-8-1.pdf Gra phic file (type pdf) +[7] File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-9-1.pdf Gra phic file (type pdf) -[7] +[8] File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-10-1.pdf Gr aphic file (type pdf) -[8] +[9] Underfull \hbox (badness 10000) in paragraph at lines 315--317 [] @@ -1026,7 +1027,7 @@ Underfull \hbox (badness 10000) in paragraph at lines 317--319 File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-11-1.pdf Gr aphic file (type pdf) - +[10] Underfull \hbox (badness 10000) in paragraph at lines 336--338 [] @@ -1039,7 +1040,7 @@ Underfull \hbox (badness 10000) in paragraph at lines 338--340 File: 
KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-12-1.pdf Gr aphic file (type pdf) -[9] +[11] Underfull \hbox (badness 10000) in paragraph at lines 359--361 [] @@ -1052,7 +1053,7 @@ Underfull \hbox (badness 10000) in paragraph at lines 361--363 File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-13-1.pdf Gr aphic file (type pdf) -[10] +[12] Underfull \hbox (badness 10000) in paragraph at lines 380--382 [] @@ -1065,7 +1066,7 @@ Underfull \hbox (badness 10000) in paragraph at lines 382--384 File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-14-1.pdf Gr aphic file (type pdf) -[11] +[13] Underfull \hbox (badness 10000) in paragraph at lines 403--405 [] @@ -1078,7 +1079,7 @@ Underfull \hbox (badness 10000) in paragraph at lines 405--407 File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-15-1.pdf Gr aphic file (type pdf) -[12] +[14] Underfull \hbox (badness 10000) in paragraph at lines 424--426 [] @@ -1091,7 +1092,14 @@ Underfull \hbox (badness 10000) in paragraph at lines 426--428 File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-16-1.pdf Gr aphic file (type pdf) -(KidneyGenetics_documentation.bbl [13] [14]) [15] (KidneyGenetics_documentation +[15] +File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-17-1.pdf Gr +aphic file (type pdf) + +File: KidneyGenetics_documentation_files/figure-latex/unnamed-chunk-18-1.pdf Gr +aphic file (type pdf) + +(KidneyGenetics_documentation.bbl [16] [17]) [18] (KidneyGenetics_documentation .aux) *********** LaTeX2e <2023-06-01> patch level 1 @@ -1099,12 +1107,12 @@ L3 programming layer <2023-06-16> *********** ) Here is how much of TeX's memory you used: - 19869 strings out of 411164 - 362299 string characters out of 5809999 - 1861394 words of memory out of 5000000 - 40148 multiletter control sequences out of 15000+600000 + 19899 strings out of 411164 + 364006 string characters out of 5809999 + 1863394 words of memory out of 5000000 + 40176 
multiletter control sequences out of 15000+600000 519822 words of font info for 85 fonts, out of 8000000 for 9000 1348 hyphenation exceptions out of 8191 90i,8n,120p,1027b,547s stack positions out of 10000i,1000n,20000p,200000b,200000s -Output written on KidneyGenetics_documentation.pdf (15 pages). +Output written on KidneyGenetics_documentation.pdf (18 pages).