deepsense-ai · ds-sebastianchwilczynski · May 7, 2024 · May 8, 2024 · May 9, 2024 · May 9, 2024
diff --git a/docs/tutorials/langgraphXdbally2.ipynb b/docs/tutorials/langgraphXdbally2.ipynb
@@ -0,0 +1,386 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Talk with your Database\n",
+    "\n",
+    "\n",
+    "Databases are an inevitable part of every company's infrastructure. Chatbots capable of interacting with databases can free up teams' time by handling novel user queries.\n",
+    "\n",
+    "In this tutorial, we will build an agent with access to a database tool, so that it can ground its answers in the data stored there. Along the way we will create:\n",
+    "1. A custom LangChain tool.\n",
+    "2. An assistant agent with access to the database tool.\n",
+    "3. A tool agent, specialized in executing the calls returned by the assistant.\n",
+    "4. A graph of connected agents.\n",
+    "5. A persistent storage component.\n",
+    "\n",
+    "By the end, you'll be able to combine this simple strategy with other, more powerful LangGraph concepts.\n",
+    "\n",
+    "\n",
+    "## Prerequisites\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "First, set up your environment. We'll install this tutorial's prerequisites, download the test DB, and define the tools we will reuse in each section."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "!pip install -U langgraph langchain langchain_openai langchain_experimental dbally[openai,langsmith] nest_asyncio"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import getpass\n",
+    "import os\n",
+    "\n",
+    "\n",
+    "def _set_env(var: str):\n",
+    "    if not os.environ.get(var):\n",
+    "        os.environ[var] = getpass.getpass(f\"{var}: \")\n",
+    "\n",
+    "\n",
+    "_set_env(\"OPENAI_API_KEY\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Populate the database\n",
+    "\n",
+    "Here, we download a dummy SQLite database containing some fictional HR information and preview a few rows of it."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from urllib import request\n",
+    "from sqlalchemy import create_engine, text\n",
+    "\n",
+    "print(\"Downloading the HR database\")\n",
+    "request.urlretrieve(\n",
+    "    \"https://drive.google.com/uc?export=download&id=1zo3j8x7qH8opTKyQ9qFgRpS3yqU6uTRs\", \"recruitment.db\"\n",
+    ")\n",
+    "print(\"Database downloaded\")\n",
+    "print(\"Creating the database\")\n",
+    "\n",
+    "db_engine = create_engine(\"sqlite:///recruitment.db\")\n",
+    "\n",
+    "print(\"Displaying the first 5 rows of the candidate table\")\n",
+    "\n",
+    "with db_engine.connect() as conn:\n",
+    "    rows = conn.execute(text(\"SELECT * from candidate LIMIT 5\")).fetchall()\n",
+    "\n",
+    "print(rows)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Database Tool\n",
+    "\n",
+    "Next, define the [database tool](https://python.langchain.com/v0.1/docs/modules/tools/) that helps our assistant answer questions concerning HR. Under the hood, it uses the [db-ally](https://github.com/deepsense-ai/db-ally) database framework."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain.pydantic_v1 import BaseModel, Field\n",
+    "from typing import Optional, Type\n",
+    "from langchain.callbacks.manager import CallbackManagerForToolRun\n",
+    "from langchain.tools import BaseTool\n",
+    "\n",
+    "from dbally import Collection\n",
+    "from dbally.utils.errors import UnsupportedQueryError\n",
+    "\n",
+    "import asyncio\n",
+    "import nest_asyncio\n",
+    "\n",
+    "nest_asyncio.apply()\n",
+    "\n",
+    "\n",
+    "class DatabaseQuery(BaseModel):\n",
+    "    query: str = Field(description=\"should be a query to the database in the natural language.\")\n",
+    "\n",
+    "\n",
+    "class DballyTool(BaseTool):\n",
+    "    name = \"dbally\"\n",
+    "    description: str\n",
+    "    collection: Collection\n",
+    "    args_schema: Type[BaseModel] = DatabaseQuery\n",
+    "\n",
+    "    def _run(self, query: str, run_manager: Optional[CallbackManagerForToolRun] = None) -> str:\n",
+    "        \"\"\"Use the tool synchronously.\"\"\"\n",
+    "        try:\n",
+    "            result = asyncio.run(self.collection.ask(query))\n",
+    "\n",
+    "            if result.textual_response is not None:\n",
+    "                return result.textual_response\n",
+    "            else:\n",
+    "                return result.results\n",
+    "        except UnsupportedQueryError:\n",
+    "            return \"database master can't answer this question\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now, let's test our tool. If everything works correctly, you should see `[{'COUNT(*)': 10}]`. If you don't, first make sure that the provided `OPENAI_API_KEY` is correct."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from dbally.views.freeform.text2sql import configure_text2sql_auto_discovery, Text2SQLFreeformView\n",
+    "from dbally.llm_client.openai_client import OpenAIClient\n",
+    "import dbally\n",
+    "\n",
+    "view_config = await configure_text2sql_auto_discovery(db_engine).discover()\n",
+    "recruitment_db = dbally.create_collection(\"recruitment\", llm_client=OpenAIClient())\n",
+    "recruitment_db.add(Text2SQLFreeformView, lambda: Text2SQLFreeformView(db_engine, view_config))\n",
+    "\n",
+    "DATABASE_TOOL = DballyTool(collection=recruitment_db, description=\"useful for when you need to gather some HR data\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "DATABASE_TOOL._run(\"How many job offers from Apple do we have?\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### State\n",
+    "\n",
+    "Next, we define our agentic system's state as a typed dictionary containing an append-only list of messages. These messages form the chat history, which is all the state our simple assistant needs."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from typing import Annotated\n",
+    "\n",
+    "from typing_extensions import TypedDict\n",
+    "\n",
+    "from langgraph.graph.message import AnyMessage, add_messages\n",
+    "\n",
+    "\n",
+    "class State(TypedDict):\n",
+    "    messages: Annotated[list[AnyMessage], add_messages]"
+   ]
+  },
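+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To see what \"append-only\" means here, the minimal sketch below calls the `add_messages` reducer directly, outside of any graph. The two example messages are made up for illustration; LangGraph applies the same reducer whenever a node returns a `messages` update."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.messages import AIMessage, HumanMessage\n",
+    "from langgraph.graph.message import add_messages\n",
+    "\n",
+    "# Hypothetical chat history and node update, used only for illustration\n",
+    "history = [HumanMessage(content=\"Do we have any software engineers?\")]\n",
+    "update = [AIMessage(content=\"Let me check the database.\")]\n",
+    "\n",
+    "# The reducer returns a new list: the existing messages followed by the update\n",
+    "merged = add_messages(history, update)\n",
+    "for message in merged:\n",
+    "    print(type(message).__name__, \"-\", message.content)"
+   ]
+  },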
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Assistant Agent\n",
+    "\n",
+    "Next, define the assistant agent. It takes the graph state and calls an LLM to predict the best response. Most importantly, we give the assistant access to the database tool."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langchain_core.prompts import ChatPromptTemplate\n",
+    "from langchain_core.runnables import Runnable\n",
+    "from langchain_openai import ChatOpenAI\n",
+    "\n",
+    "\n",
+    "class Assistant:\n",
+    "    def __init__(self, runnable: Runnable):\n",
+    "        self.runnable = runnable\n",
+    "\n",
+    "    def __call__(self, state: State):\n",
+    "        result = self.runnable.invoke(state)\n",
+    "        return {\"messages\": result}\n",
+    "\n",
+    "\n",
+    "primary_assistant_prompt = ChatPromptTemplate.from_messages(\n",
+    "    [\n",
+    "        (\n",
+    "            \"system\",\n",
+    "            \"You are a helpful talent acquisition assistant. \"\n",
+    "            \"Use the provided tools to search for candidates, job offers, and other information to assist with the user's queries.\",\n",
+    "        ),\n",
+    "        (\"placeholder\", \"{messages}\"),\n",
+    "    ]\n",
+    ")\n",
+    "\n",
+    "assistant_llm = ChatOpenAI(model=\"gpt-3.5-turbo\")\n",
+    "tools = [DATABASE_TOOL]\n",
+    "assistant_runnable = primary_assistant_prompt | assistant_llm.bind_tools(tools)\n",
+    "assistant_agent = Assistant(assistant_runnable)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Let's see how this agent works in isolation. You should see a tool call message generated."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "response = assistant_agent({\"messages\": [\"Do we have any software engineers?\"]})\n",
+    "response[\"messages\"].pretty_print()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "However, the assistant doesn't know how to execute the tools. This is why we need to finish our system by connecting all the building blocks into a graph."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "### Define Graph\n",
+    "\n",
+    "Here we connect our previously defined agent using [StateGraph](https://langchain-ai.github.io/langgraph/reference/graphs/#langgraph.graph.StateGraph), [ToolNode](https://langchain-ai.github.io/langgraph/reference/prebuilt/#toolnode), and [persistent memory](https://langchain-ai.github.io/langgraph/how-tos/persistence/) to build our final application."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from langgraph.graph import END, StateGraph\n",
+    "from langgraph.prebuilt import ToolNode, tools_condition\n",
+    "from langgraph.checkpoint.sqlite import SqliteSaver\n",
+    "\n",
+    "\n",
+    "tool_node = ToolNode(tools)\n",
+    "\n",
+    "builder = StateGraph(State)\n",
+    "builder.add_node(\"assistant\", assistant_agent)\n",
+    "builder.add_node(\"action\", tool_node)\n",
+    "builder.set_entry_point(\"assistant\")\n",
+    "\n",
+    "builder.add_edge(\"action\", \"assistant\")\n",
+    "builder.add_conditional_edges(\n",
+    "    \"assistant\",\n",
+    "    tools_condition,\n",
+    "    # \"action\" calls one of our tools. END causes the graph to terminate (and respond to the user)\n",
+    "    {\"action\": \"action\", END: END},\n",
+    ")\n",
+    "\n",
+    "memory = SqliteSaver.from_conn_string(\":memory:\")\n",
+    "graph = builder.compile(checkpointer=memory)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Example conversation\n",
+    "\n",
+    "Now it's time to try out our mighty chatbot!"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from uuid import uuid4\n",
+    "\n",
+    "unique_id = uuid4().hex[0:8]\n",
+    "\n",
+    "tutorial_questions = [\n",
+    "    \"Hi, do we have any software engineers?\",\n",
+    "    \"Describe the first candidate to me, please.\",\n",
+    "]\n",
+    "\n",
+    "graph_config = {\n",
+    "    \"configurable\": {\n",
+    "        \"thread_id\": unique_id,\n",
+    "    }\n",
+    "}\n",
+    "\n",
+    "for question in tutorial_questions:\n",
+    "    events = graph.stream({\"messages\": (\"user\", question)}, graph_config, stream_mode=\"values\")\n",
+    "    for event in events:\n",
+    "        event[\"messages\"][-1].pretty_print()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Congratulations! Together, we built an agentic system capable of querying the database. Good job!\n",
+    "\n",
+    "The full code, ready to copy and paste, is available in `langgraph_tools.py`."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "dbally",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.14"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}