diff --git a/docs/core_docs/docs/use_cases/graph/quickstart.ipynb b/docs/core_docs/docs/use_cases/graph/quickstart.ipynb index 87d6932d7da7..6cb4aa12f45c 100644 --- a/docs/core_docs/docs/use_cases/graph/quickstart.ipynb +++ b/docs/core_docs/docs/use_cases/graph/quickstart.ipynb @@ -1,284 +1,258 @@ { "cells": [ - { - "cell_type": "raw", - "metadata": {}, - "source": [ - "---\n", - "sidebar_position: 0\n", - "---" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Quickstart\n", - "\n", - "In this guide we'll go over the basic ways to create a Q&A chain over a graph database. These systems will allow us to ask a question about the data in a graph database and get back a natural language answer.\n", - "\n", - "## ⚠️ Security note ⚠️\n", - "\n", - "Building Q&A systems of graph databases requires executing model-generated graph queries. There are inherent risks in doing this. Make sure that your database connection permissions are always scoped as narrowly as possible for your chain/agent's needs. This will mitigate though not eliminate the risks of building a model-driven system. For more on general security best practices, [see here](/docs/security).\n", - "\n", - "## Architecture\n", - "\n", - "At a high-level, the steps of most graph chains are:\n", - "\n", - "1. **Convert question to a graph database query**: Model converts user input to a graph database query (e.g. Cypher).\n", - "2. **Execute graph database query**: Execute the graph database query.\n", - "3. **Answer the question**: Model responds to user input using the query results.\n", - "\n", - "\n", - "![SQL Use Case Diagram](../../../static/img/graph_usecase.png)\n", - "\n", - "## Setup\n", - "\n", - "First, get required packages and set environment variables.\n", - "In this example, we will be using Neo4j graph database." 
- ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "## Setup\n", - "#### Install dependencies\n", - "\n", - "```{=mdx}\n", - "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n", - "import Npm2Yarn from \"@theme/Npm2Yarn\";\n", - "\n", - "\n", - "\n", - "\n", - " langchain @langchain/community @langchain/openai neo4j-driver\n", - "\n", - "```\n", - "\n", - "#### Set environment variables\n", - "\n", - "We'll use OpenAI in this example:\n", - "\n", - "```env\n", - "OPENAI_API_KEY=your-api-key\n", - "\n", - "# Optional, use LangSmith for best-in-class observability\n", - "LANGSMITH_API_KEY=your-api-key\n", - "LANGCHAIN_TRACING_V2=true\n", - "```\n", - "\n", - "Next, we need to define Neo4j credentials.\n", - "Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database.\n", - "\n", - "```env\n", - "NEO4J_URI=\"bolt://localhost:7687\"\n", - "NEO4J_USERNAME=\"neo4j\"\n", - "NEO4J_PASSWORD=\"password\"\n", - "```" + { + "cell_type": "raw", + "metadata": {}, + "source": [ + "---\n", + "sidebar_position: 0\n", + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Quickstart\n", + "\n", + "In this guide we'll go over the basic ways to create a Q&A chain over a graph database. These systems will allow us to ask a question about the data in a graph database and get back a natural language answer.\n", + "\n", + "## ⚠️ Security note ⚠️\n", + "\n", + "Building Q&A systems of graph databases requires executing model-generated graph queries. There are inherent risks in doing this. Make sure that your database connection permissions are always scoped as narrowly as possible for your chain/agent's needs. This will mitigate though not eliminate the risks of building a model-driven system. 
For more on general security best practices, [see here](/docs/security).\n", + "\n", + "## Architecture\n", + "\n", + "At a high level, the steps of most graph chains are:\n", + "\n", + "1. **Convert question to a graph database query**: Model converts user input to a graph database query (e.g. Cypher).\n", + "2. **Execute graph database query**: Execute the graph database query.\n", + "3. **Answer the question**: Model responds to user input using the query results.\n", + "\n", + "\n", + "![Graph Use Case Diagram](../../../static/img/graph_usecase.png)\n", + "\n", + "## Setup\n", + "\n", + "First, get required packages and set environment variables.\n", + "In this example, we will use a Neo4j graph database." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup\n", + "#### Install dependencies\n", + "\n", + "```{=mdx}\n", + "import IntegrationInstallTooltip from \"@mdx_components/integration_install_tooltip.mdx\";\n", + "import Npm2Yarn from \"@theme/Npm2Yarn\";\n", + "\n", + "\n", + "\n", + "\n", + " langchain @langchain/community @langchain/openai neo4j-driver\n", + "\n", + "```\n", + "\n", + "#### Set environment variables\n", + "\n", + "We'll use OpenAI in this example:\n", + "\n", + "```env\n", + "OPENAI_API_KEY=your-api-key\n", + "\n", + "# Optional, use LangSmith for best-in-class observability\n", + "LANGSMITH_API_KEY=your-api-key\n", + "LANGCHAIN_TRACING_V2=true\n", + "```\n", + "\n", + "Next, we need to define Neo4j credentials.\n", + "Follow [these installation steps](https://neo4j.com/docs/operations-manual/current/installation/) to set up a Neo4j database.\n", + "\n", + "```env\n", + "NEO4J_URI=\"bolt://localhost:7687\"\n", + "NEO4J_USERNAME=\"neo4j\"\n", + "NEO4J_PASSWORD=\"password\"\n", + "```" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "The example below connects to a Neo4j database and populates it with example data about movies and their actors." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Schema refreshed successfully.\n" ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "The below example will create a connection with a Neo4j database and will populate it with example data about movies and their actors." - ] - }, - { - "cell_type": "code", + }, + { + "data": { + "text/plain": [ + "[]" + ] + }, "execution_count": 3, "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Schema refreshed successfully.\n" - ] - }, - { - "data": { - "text/plain": [ - "[]" - ] - }, - "execution_count": 3, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import \"neo4j-driver\";\n", - "import { Neo4jGraph } from \"@langchain/community/graphs/neo4j_graph\";\n", - "\n", - "const url = Deno.env.get(\"NEO4J_URI\");\n", - "const username = Deno.env.get(\"NEO4J_USER\");\n", - "const password = Deno.env.get(\"NEO4J_PASSWORD\");\n", - "const graph = await Neo4jGraph.initialize({ url, username, password });\n", - "\n", - "// Import movie information\n", - "const moviesQuery = `LOAD CSV WITH HEADERS FROM \n", - "'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n", - "AS row\n", - "MERGE (m:Movie {id:row.movieId})\n", - "SET m.released = date(row.released),\n", - " m.title = row.title,\n", - " m.imdbRating = toFloat(row.imdbRating)\n", - "FOREACH (director in split(row.director, '|') | \n", - " MERGE (p:Person {name:trim(director)})\n", - " MERGE (p)-[:DIRECTED]->(m))\n", - "FOREACH (actor in split(row.actors, '|') | \n", - " MERGE (p:Person {name:trim(actor)})\n", - " MERGE (p)-[:ACTED_IN]->(m))\n", - "FOREACH (genre in split(row.genres, '|') | \n", - " MERGE (g:Genre {name:trim(genre)})\n", - " MERGE (m)-[:IN_GENRE]->(g))`\n", - "\n", - "await graph.query(moviesQuery);" - ] - }, - { - 
"cell_type": "markdown", - "metadata": {}, - "source": [ - "## Graph schema\n", - "\n", - "In order for an LLM to be able to generate a Cypher statement, it needs information about the graph schema. When you instantiate a graph object, it retrieves the information about the graph schema. If you later make any changes to the graph, you can run the `refreshSchema` method to refresh the schema information." + "output_type": "execute_result" + } + ], + "source": [ + "import \"neo4j-driver\";\n", + "import { Neo4jGraph } from \"@langchain/community/graphs/neo4j_graph\";\n", + "\n", + "const url = Deno.env.get(\"NEO4J_URI\");\n", + "const username = Deno.env.get(\"NEO4J_USERNAME\");\n", + "const password = Deno.env.get(\"NEO4J_PASSWORD\");\n", + "const graph = await Neo4jGraph.initialize({ url, username, password });\n", + "\n", + "// Import movie information\n", + "const moviesQuery = `LOAD CSV WITH HEADERS FROM \n", + "'https://raw.githubusercontent.com/tomasonjo/blog-datasets/main/movies/movies_small.csv'\n", + "AS row\n", + "MERGE (m:Movie {id:row.movieId})\n", + "SET m.released = date(row.released),\n", + " m.title = row.title,\n", + " m.imdbRating = toFloat(row.imdbRating)\n", + "FOREACH (director in split(row.director, '|') | \n", + " MERGE (p:Person {name:trim(director)})\n", + " MERGE (p)-[:DIRECTED]->(m))\n", + "FOREACH (actor in split(row.actors, '|') | \n", + " MERGE (p:Person {name:trim(actor)})\n", + " MERGE (p)-[:ACTED_IN]->(m))\n", + "FOREACH (genre in split(row.genres, '|') | \n", + " MERGE (g:Genre {name:trim(genre)})\n", + " MERGE (m)-[:IN_GENRE]->(g))`\n", + "\n", + "await graph.query(moviesQuery);" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Graph schema\n", + "\n", + "In order for an LLM to be able to generate a Cypher statement, it needs information about the graph schema. When you instantiate a graph object, it retrieves the information about the graph schema. 
If you later make any changes to the graph, you can run the `refreshSchema` method to refresh the schema information." + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Node properties are the following:\n", + "Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING}, Person {name: STRING}, Genre {name: STRING}\n", + "Relationship properties are the following:\n", + "\n", + "The relationships are the following:\n", + "(:Movie)-[:IN_GENRE]->(:Genre), (:Person)-[:DIRECTED]->(:Movie), (:Person)-[:ACTED_IN]->(:Movie)\n" ] - }, - { - "cell_type": "code", - "execution_count": 4, - "metadata": {}, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "Node properties are the following:\n", - "Movie {imdbRating: FLOAT, id: STRING, released: DATE, title: STRING}, Person {name: STRING}, Genre {name: STRING}\n", - "Relationship properties are the following:\n", - "\n", - "The relationships are the following:\n", - "(:Movie)-[:IN_GENRE]->(:Genre), (:Person)-[:DIRECTED]->(:Movie), (:Person)-[:ACTED_IN]->(:Movie)\n" - ] - } - ], - "source": [ - "await graph.refreshSchema()\n", - "console.log(graph.schema)" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "Great! We've got a graph database that we can query. 
Now let's try hooking it up to an LLM.\n", - "\n", - "## Chain\n", - "\n", - "Let's use a simple chain that takes a question, turns it into a Cypher query, executes the query, and uses the result to answer the original question.\n", - "\n", - "![graph_chain.webp](../../../static/img/graph_chain.webp)\n", - "\n", - "\n", - "LangChain comes with a built-in chain for this workflow that is designed to work with Neo4j: [GraphCypherQAChain](https://python.langchain.com/docs/use_cases/graph/graph_cypher_qa)" - ] - }, - { - "cell_type": "code", + } + ], + "source": [ + "await graph.refreshSchema()\n", + "console.log(graph.schema)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Great! We've got a graph database that we can query. Now let's try hooking it up to an LLM.\n", + "\n", + "## Chain\n", + "\n", + "Let's use a simple chain that takes a question, turns it into a Cypher query, executes the query, and uses the result to answer the original question.\n", + "\n", + "![graph_chain.webp](../../../static/img/graph_chain.webp)\n", + "\n", + "\n", + "LangChain comes with a built-in chain for this workflow that is designed to work with Neo4j: [GraphCypherQAChain](https://python.langchain.com/docs/use_cases/graph/graph_cypher_qa)" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "data": { + "text/plain": [ + "{ result: \u001b[32m\"James Woods, Joe Pesci, Robert De Niro, Sharon Stone\"\u001b[39m }" + ] + }, "execution_count": 5, "metadata": {}, - "outputs": [ - { - "data": { - "text/plain": [ - "{ result: \u001b[32m\"James Woods, Joe Pesci, Robert De Niro, Sharon Stone\"\u001b[39m }" - ] - }, - "execution_count": 5, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "import { GraphCypherQAChain } from \"langchain/chains/graph_qa/cypher\";\n", - "import { ChatOpenAI } from \"@langchain/openai\";\n", - "\n", - "const llm = new ChatOpenAI({ model: \"gpt-3.5-turbo\", temperature: 
0 })\n", - "const chain = GraphCypherQAChain.fromLLM({\n", - " llm,\n", - " graph,\n", - "});\n", - "const response = await chain.invoke({ query: \"What was the cast of the Casino?\" })\n", - "response" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Next steps\n", - "\n", - "For more complex query-generation, we may want to create few-shot prompts or add query-checking steps. For advanced techniques like this and more check out:\n", - "\n", - "* [Prompting strategies](/docs/use_cases/graph/prompting): Advanced prompt engineering techniques.\n", - "* [Mapping values](/docs/use_cases/graph/mapping): Techniques for mapping values from questions to database.\n", - "* [Semantic layer](/docs/use_cases/graph/semantic): Techniques for working implementing semantic layers." - ] - } - ], - "source": [ - "import { GraphCypherQAChain } from \"langchain/chains/graph_qa/cypher\";\n", - "import { ChatOpenAI } from \"@langchain/openai\";\n", - "\n", - "const llm = new ChatOpenAI({ modelName: \"gpt-3.5-turbo\", temperature: 0 })\n", - "const chain = GraphCypherQAChain.fromLLM({\n", - " llm,\n", - " graph,\n", - "});\n", - "const response = await chain.invoke({ query: \"What was the cast of the Casino?\" })\n", - "response" - ] - }, - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "### Next steps\n", - "\n", - "For more complex query-generation, we may want to create few-shot prompts or add query-checking steps. 
For advanced techniques like this and more check out:\n", - "\n", - "* [Prompting strategies](/docs/use_cases/graph/prompting): Advanced prompt engineering techniques.\n", - "* [Mapping values](/docs/use_cases/graph/mapping): Techniques for mapping values from questions to database.\n", - "* [Semantic layer](/docs/use_cases/graph/semantic): Techniques for working implementing semantic layers.\n", - "* [Constructing graphs](/docs/use_cases/graph/construction): Techniques for constructing knowledge graphs.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": {}, - "outputs": [], - "source": [] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Deno", - "language": "typescript", - "name": "deno" + "output_type": "execute_result" + } + ], + "source": [ + "import { GraphCypherQAChain } from \"langchain/chains/graph_qa/cypher\";\n", + "import { ChatOpenAI } from \"@langchain/openai\";\n", + "\n", + "const llm = new ChatOpenAI({ model: \"gpt-3.5-turbo\", temperature: 0 })\n", + "const chain = GraphCypherQAChain.fromLLM({\n", + " llm,\n", + " graph,\n", + "});\n", + "const response = await chain.invoke({ query: \"What was the cast of the Casino?\" })\n", + "response" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Next steps\n", + "\n", + "For more complex query-generation, we may want to create few-shot prompts or add query-checking steps. 
For advanced techniques like these and more, check out:\n", + "\n", + "* [Prompting strategies](/docs/use_cases/graph/prompting): Advanced prompt engineering techniques.\n", + "* [Mapping values](/docs/use_cases/graph/mapping): Techniques for mapping values from questions to the database.\n", + "* [Semantic layer](/docs/use_cases/graph/semantic): Techniques for implementing semantic layers.\n", + "* [Constructing graphs](/docs/use_cases/graph/construction): Techniques for constructing knowledge graphs.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Deno", + "language": "typescript", + "name": "deno" + }, + "language_info": { + "file_extension": ".ts", + "mimetype": "text/x.typescript", + "name": "typescript", + "nb_converter": "script", + "pygments_lexer": "typescript", + "version": "5.4.3" + } }, - "language_info": { - "file_extension": ".ts", - "mimetype": "text/x.typescript", - "name": "typescript", - "nb_converter": "script", - "pygments_lexer": "typescript", - "version": "5.4.3" - } - }, - "nbformat": 4, - "nbformat_minor": 4 -} + "nbformat": 4, + "nbformat_minor": 4 + } + \ No newline at end of file