TTYG-53: Update documentation #4

Merged 1 commit on Nov 1, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -1,2 +1,3 @@
*.iml
.idea/
.ipynb_checkpoints/
4 changes: 2 additions & 2 deletions Dockerfile
@@ -1,7 +1,7 @@
FROM ontotext/graphdb:10.5.1
FROM ontotext/graphdb:10.7.6
RUN mkdir -p /opt/graphdb/dist/data/repositories/langchain
COPY config.ttl /opt/graphdb/dist/data/repositories/langchain/
COPY starwars-data.trig /
COPY rdfs.ttl /
COPY graphdb_create.sh /run.sh
ENTRYPOINT bash /run.sh
ENTRYPOINT ["bash", "/run.sh"]
2 changes: 1 addition & 1 deletion README.md
@@ -1,3 +1,3 @@
# LangChain GraphDB QA Chain Demo Files

The repository contains the files used to demonstrate the [GraphDB QA Chain](https://python.langchain.com/docs/use_cases/graph/graph_ontotext_graphdb_qa) implemented in LangChain.
The repository contains the files used to demonstrate the [GraphDB QA Chain](https://python.langchain.com/docs/integrations/graphs/ontotext/) implemented in LangChain.
2 changes: 0 additions & 2 deletions docker-compose.yaml
@@ -1,5 +1,3 @@
version: '3.7'

services:

graphdb:
90 changes: 59 additions & 31 deletions graph_ontotext_graphdb_qa.ipynb
@@ -2,18 +2,34 @@
"cells": [
{
"cell_type": "markdown",
"id": "922a7a98-7d73-4a1a-8860-76a33451d1be",
"id": "1271ba5c-1700-4872-b193-f7c162944521",
"metadata": {
"id": "922a7a98-7d73-4a1a-8860-76a33451d1be"
"execution": {
"iopub.execute_input": "2024-03-27T18:44:53.493675Z",
"iopub.status.busy": "2024-03-27T18:44:53.493473Z",
"iopub.status.idle": "2024-03-27T18:44:53.499541Z",
"shell.execute_reply": "2024-03-27T18:44:53.498940Z",
"shell.execute_reply.started": "2024-03-27T18:44:53.493660Z"
}
},
"source": [
"# Ontotext GraphDB QA Chain\n",
"# Ontotext GraphDB\n",
"\n",
"This notebook shows how to use LLMs to provide natural language querying (NLQ to SPARQL, also called text2sparql) for [Ontotext GraphDB](https://graphdb.ontotext.com/). Ontotext GraphDB is a graph database and knowledge discovery tool compliant with [RDF](https://www.w3.org/RDF/) and [SPARQL](https://www.w3.org/TR/sparql11-query/).\n",
">[Ontotext GraphDB](https://graphdb.ontotext.com/) is a graph database and knowledge discovery tool compliant with [RDF](https://www.w3.org/RDF/) and [SPARQL](https://www.w3.org/TR/sparql11-query/).\n",
"\n",
">This notebook shows how to use LLMs to provide natural language querying (NLQ to SPARQL, also called `text2sparql`) for `Ontotext GraphDB`. "
]
},
{
"cell_type": "markdown",
"id": "922a7a98-7d73-4a1a-8860-76a33451d1be",
"metadata": {
"id": "922a7a98-7d73-4a1a-8860-76a33451d1be"
},
"source": [
"## GraphDB LLM Functionalities\n",
"\n",
"GraphDB supports some LLM integration functionalities as described in [https://github.com/w3c/sparql-dev/issues/193](https://github.com/w3c/sparql-dev/issues/193):\n",
"`GraphDB` supports some LLM integration functionalities as described [here](https://github.com/w3c/sparql-dev/issues/193):\n",
"\n",
"[gpt-queries](https://graphdb.ontotext.com/documentation/10.5/gpt-queries.html)\n",
"\n",
@@ -43,33 +59,45 @@
"\n",
"* A simple chatbot using a defined KG entity index\n",
"\n",
"## Querying the GraphDB Database\n",
"\n",
"For this tutorial, we won't use the GraphDB LLM integration, but SPARQL generation from NLQ. We'll use the Star Wars API (SWAPI) ontology and dataset that you can examine [here](https://github.com/Ontotext-AD/langchain-graphdb-qa-chain-demo/blob/main/starwars-data.trig).\n",
"\n",
"You will need to have a running GraphDB instance. This tutorial shows how to run the database locally using the [GraphDB Docker image](https://hub.docker.com/r/ontotext/graphdb). It provides a docker compose set-up, which populates GraphDB with the Star Wars dataset. All nessessary files including this notebook can be downloaded from [the GitHub repository langchain-graphdb-qa-chain-demo](https://github.com/Ontotext-AD/langchain-graphdb-qa-chain-demo).\n",
"For this tutorial, we won't use the GraphDB LLM integration, but `SPARQL` generation from NLQ. We'll use the `Star Wars API` (`SWAPI`) ontology and dataset that you can examine [here](https://github.com/Ontotext-AD/langchain-graphdb-qa-chain-demo/blob/main/starwars-data.trig).\n"
]
},
{
"cell_type": "markdown",
"id": "45b464ff-8556-403f-a3d6-14ffcd703313",
"metadata": {},
"source": [
"## Setting up\n",
"\n",
"### Set-up\n",
"You need a running GraphDB instance. This tutorial shows how to run the database locally using the [GraphDB Docker image](https://hub.docker.com/r/ontotext/graphdb). It provides a docker compose set-up, which populates GraphDB with the Star Wars dataset. All necessary files including this notebook can be downloaded from [the GitHub repository langchain-graphdb-qa-chain-demo](https://github.com/Ontotext-AD/langchain-graphdb-qa-chain-demo).\n",
"\n",
"* Install [Docker](https://docs.docker.com/get-docker/). This tutorial is created using Docker version `24.0.7` which bundles [Docker Compose](https://docs.docker.com/compose/). For earlier Docker versions you may need to install Docker Compose separately.\n",
"* Clone [the GitHub repository langchain-graphdb-qa-chain-demo](https://github.com/Ontotext-AD/langchain-graphdb-qa-chain-demo) in a local folder on your machine.\n",
"* Start GraphDB with the following script executed from the same folder\n",
" ```\n",
" docker build --tag graphdb .\n",
" docker compose up -d graphdb\n",
" ```\n",
" \n",
"```\n",
"docker build --tag graphdb .\n",
"docker compose up -d graphdb\n",
"```\n",
"\n",
" You need to wait a couple of seconds for the database to start on `http://localhost:7200/`. The Star Wars dataset `starwars-data.trig` is automatically loaded into the `langchain` repository. The local SPARQL endpoint `http://localhost:7200/repositories/langchain` can be used to run queries against. You can also open the GraphDB Workbench from your favourite web browser `http://localhost:7200/sparql` where you can make queries interactively.\n",
"* Working environment\n",
"* Set up working environment\n",
"\n",
"If you use `conda`, create and activate a new conda environment, e.g.:\n",
"\n",
"```\n",
"conda create -n graph_ontotext_graphdb_qa python=3.12\n",
"conda activate graph_ontotext_graphdb_qa\n",
"```\n",
"\n",
"If you use `conda`, create and activate a new conda env (e.g. `conda create -n graph_ontotext_graphdb_qa python=3.9.18`).\n",
"Install the following libraries:\n",
"\n",
"```\n",
"pip install jupyter==1.0.0\n",
"pip install openai==1.6.1\n",
"pip install rdflib==7.0.0\n",
"pip install langchain-openai==0.0.2\n",
"pip install langchain>=0.1.5\n",
"pip install jupyter==1.1.1\n",
"pip install rdflib==7.1.1\n",
"pip install langchain-community==0.3.4\n",
"pip install langchain-openai==0.2.4\n",
"```\n",
"\n",
"Run Jupyter with\n",
@@ -85,7 +113,7 @@
"id": "e51b397c-2fdc-4b99-9fed-1ab2b6ef7547"
},
"source": [
"### Specifying the Ontology\n",
"## Specifying the ontology\n",
"\n",
"In order for the LLM to be able to generate SPARQL, it needs to know the knowledge graph schema (the ontology). It can be provided using one of two parameters on the `OntotextGraphDBGraph` class:\n",
"\n",
@@ -196,7 +224,7 @@
"id": "446d8a00-c98f-43b8-9e84-77b244f7bb24"
},
"source": [
"### Question Answering against the StarWars Dataset\n",
"## Question Answering against the StarWars dataset\n",
"\n",
"We can now use the `OntotextGraphDBQAChain` to ask some questions."
]
@@ -231,6 +259,7 @@
" ChatOpenAI(temperature=0, model_name=\"gpt-4-1106-preview\"),\n",
" graph=graph,\n",
" verbose=True,\n",
" allow_dangerous_requests=True,\n",
")"
]
},
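The top of this code cell is collapsed in the diff. For context, a self-contained sketch of how such a chain is typically built and queried; the import path and the `invoke`/`input_key` usage are assumptions about the current langchain-community API rather than lines from this notebook:

```python
from langchain_community.chains.graph_qa.ontotext_graphdb import OntotextGraphDBQAChain
from langchain_openai import ChatOpenAI

chain = OntotextGraphDBQAChain.from_llm(
    ChatOpenAI(temperature=0, model_name="gpt-4-1106-preview"),
    graph=graph,  # the OntotextGraphDBGraph constructed earlier
    verbose=True,
    # Opt-in flag: the chain executes LLM-generated SPARQL against your database.
    allow_dangerous_requests=True,
)

# The chain generates SPARQL from the question, runs it, and phrases the answer.
result = chain.invoke({chain.input_key: "What is the climate on Tatooine?"})
print(result[chain.output_key])
```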
@@ -308,6 +337,7 @@
"\u001b[32;1m\u001b[1;3mPREFIX : <https://swapi.co/vocabulary/>\n",
"PREFIX owl: <http://www.w3.org/2002/07/owl#>\n",
"PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n",
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
"\n",
"SELECT ?climate\n",
"WHERE {\n",
@@ -359,11 +389,9 @@
"\u001b[1m> Entering new OntotextGraphDBQAChain chain...\u001b[0m\n",
"Generated SPARQL:\n",
"\u001b[32;1m\u001b[1;3mPREFIX : <https://swapi.co/vocabulary/>\n",
"PREFIX owl: <http://www.w3.org/2002/07/owl#>\n",
"PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>\n",
"PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n",
"\n",
"SELECT (AVG(?boxOffice) AS ?averageBoxOffice)\n",
"SELECT (AVG(?boxOffice) AS ?averageBoxOfficeRevenue)\n",
"WHERE {\n",
" ?film a :Film .\n",
" ?film :boxOffice ?boxOfficeValue .\n",
@@ -400,12 +428,12 @@
"id": "11511345-8436-4634-92c6-36f2c0dd44db"
},
"source": [
"### Chain Modifiers\n",
"## Chain modifiers\n",
"\n",
"The Ontotext GraphDB QA chain allows prompt refinement for further improvement of your QA chain and enhancing the overall user experience of your app.\n",
"\n",
"\n",
"#### \"SPARQL Generation\" Prompt\n",
"### \"SPARQL Generation\" prompt\n",
"\n",
"The prompt is used for the SPARQL query generation based on the user question and the KG schema.\n",
"\n",
@@ -436,7 +464,7 @@
" )\n",
" ````\n",
"\n",
"#### \"SPARQL Fix\" Prompt\n",
"### \"SPARQL Fix\" prompt\n",
"\n",
"Sometimes, the LLM may generate a SPARQL query with syntactic errors or missing prefixes, etc. The chain will try to amend this by prompting the LLM to correct it a certain number of times.\n",
"\n",
@@ -475,7 +503,7 @@
" \n",
" Default value: `5`\n",
"\n",
"#### \"Answering\" Prompt\n",
"### \"Answering\" prompt\n",
"\n",
"The prompt is used for answering the question based on the results returned from the database and the initial user question. By default, the LLM is instructed to only use the information from the returned result(s). If the result set is empty, the LLM should inform that it can't answer the question.\n",
"\n",
@@ -535,7 +563,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.1"
"version": "3.12.7"
}
},
"nbformat": 4,