From 430416fe01350562f748be3ea9919a74b32554fc Mon Sep 17 00:00:00 2001
From: slobentanzer
Date: Thu, 8 Feb 2024 19:00:11 +0100
Subject: [PATCH] small changes

---
 content/20.results.md              | 4 ++--
 content/30.discussion.md           | 2 +-
 content/70.supplement_vignettes.md | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/content/20.results.md b/content/20.results.md
index 72fcb5c..5363c60 100644
--- a/content/20.results.md
+++ b/content/20.results.md
@@ -24,7 +24,7 @@ Functionalities include:
 
 - **benchmarking** of LLMs, prompts, and other components
 
-- **knowledge graph querying** with automatic integration of any KG created in the BioCypher framework [@biocypher]
+- **knowledge graph (KG) querying** with automatic integration of any KG created in the BioCypher framework [@biocypher]
 
 - **retrieval-augmented generation** (RAG) using vector database embeddings of user-provided literature
 
@@ -95,7 +95,7 @@ The general instructions for both variants are the same, otherwise.
 
 ### Knowledge Graphs
 
-Knowledge graphs (KGs) are a powerful tool to represent and query knowledge in a structured manner.
+KGs are a powerful tool to represent and query knowledge in a structured manner.
 With BioCypher [@biocypher], we have developed a framework to create KGs from biomedical data in a user-friendly manner while also semantically grounding the data in ontologies.
 BioChatter is an extension of the BioCypher ecosystem, elevating its user-friendliness further by allowing natural language interactions with the data; any BioCypher KG is automatically compatible with BioChatter.
 We use information generated in the build process of BioCypher KGs to tune BioChatter's understanding of the data structures and contents, thereby increasing the efficiency of LLM-based KG querying (see Methods).
diff --git a/content/30.discussion.md b/content/30.discussion.md
index fe1a84b..6eb8b58 100644
--- a/content/30.discussion.md
+++ b/content/30.discussion.md
@@ -15,7 +15,7 @@ We prevent data leakage from the benchmark datasets into the training data of ne
 The living benchmark will be updated with new questions and tasks as they arise in the community.
 
 We facilitate access to LLMs by allowing the use of both proprietary and open-source models, and we provide a flexible deployment framework for the latter.
-Proprietary models are currently the most economic solution for accessing state-of-the-art models, and as such primarily suited for users just starting out or lacking the resources to deploy their own models.
+Proprietary models are currently the most economical solution for accessing state-of-the-art models, and as such primarily suited for users just starting out or lacking the resources to deploy their own models.
 In contrast, open-source models are quickly catching up in terms of performance [@biollmbench], and they are essential for the sustainability of the field [@doi:10.1038/d41586-024-00029-4].
 We allow self-hosting of open-source models on any scale, from dedicated hardware with GPUs, to local deployment on end-user laptops, to browser-based deployment using web technology.

diff --git a/content/70.supplement_vignettes.md b/content/70.supplement_vignettes.md
index 71ea0b3..ac2091a 100644
--- a/content/70.supplement_vignettes.md
+++ b/content/70.supplement_vignettes.md
@@ -29,7 +29,7 @@ KG functionality and select how many results we want to retrieve.
 
 Returning to the conversation and enabling the KG functionality for the current
 chat (directly above the send button), we can then ask the model about the KG.
-The languange model we use is `gpt-3.5-turbo`. The full conversation is pasted
+The language model we use is `gpt-3.5-turbo`. The full conversation is pasted
 below, including the queries generated by BioChatter.
 
 ![KG Conversation](images/kg-demo.png)