Let's build a simple graph with 3 nodes and one conditional edge.
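A minimal sketch of such a graph (the state key, node names, and the random mood logic are illustrative, following the usual LangGraph quick-start pattern):

```python
import random
from typing import Literal, TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    graph_state: str

def node_1(state: State):
    return {"graph_state": state["graph_state"] + " I am"}

def node_2(state: State):
    return {"graph_state": state["graph_state"] + " happy!"}

def node_3(state: State):
    return {"graph_state": state["graph_state"] + " sad!"}

def decide_mood(state: State) -> Literal["node_2", "node_3"]:
    # conditional edge: route to node_2 or node_3 at random
    return "node_2" if random.random() < 0.5 else "node_3"

builder = StateGraph(State)
builder.add_node("node_1", node_1)
builder.add_node("node_2", node_2)
builder.add_node("node_3", node_3)
builder.add_edge(START, "node_1")
builder.add_conditional_edges("node_1", decide_mood)
builder.add_edge("node_2", END)
builder.add_edge("node_3", END)
graph = builder.compile()

graph.invoke({"graph_state": "Hi, this is Lance."})
```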
Let's build up to a simple chain that combines 4 concepts:
- Using chat messages as our graph state
- Using chat models in graph nodes
- Binding tools to our chat model
- Executing tool calls in graph nodes
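A rough sketch combining these four concepts (the model name and the `multiply` tool are placeholder assumptions):

```python
from langchain_openai import ChatOpenAI
from langgraph.graph import StateGraph, START, END, MessagesState

def multiply(a: int, b: int) -> int:
    """Multiply a and b."""
    return a * b

llm = ChatOpenAI(model="gpt-4o")             # chat model
llm_with_tools = llm.bind_tools([multiply])  # bind tools to the model

# MessagesState keeps a `messages` list as the graph state
def tool_calling_llm(state: MessagesState):
    return {"messages": [llm_with_tools.invoke(state["messages"])]}

builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_edge(START, "tool_calling_llm")
builder.add_edge("tool_calling_llm", END)
graph = builder.compile()
```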
Let's build a router, where the chat model routes between a direct response or a tool call based upon the user input. This is a simple example of an agent, where the LLM is directing the control flow either by calling a tool or just responding directly.
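A sketch of the router, reusing `tool_calling_llm` and `multiply` from the chain above together with the prebuilt `ToolNode` and `tools_condition`:

```python
from langgraph.prebuilt import ToolNode, tools_condition

builder = StateGraph(MessagesState)
builder.add_node("tool_calling_llm", tool_calling_llm)
builder.add_node("tools", ToolNode([multiply]))  # executes any tool calls
builder.add_edge(START, "tool_calling_llm")
# tools_condition routes to "tools" if the model made a tool call, else to END
builder.add_conditional_edges("tool_calling_llm", tools_condition)
builder.add_edge("tools", END)
graph = builder.compile()
```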
Let's build an agent using a general ReAct architecture
- Act - let the model call specific tools
- Observe - pass the tool output back to the model
- Reason - let the model reason about the tool output to decide what to do next (e.g., call another tool or just respond directly)
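A sketch, again reusing the pieces above; the only structural change from the router is the edge that loops tool output back to the model:

```python
builder = StateGraph(MessagesState)
builder.add_node("assistant", tool_calling_llm)
builder.add_node("tools", ToolNode([multiply]))
builder.add_edge(START, "assistant")
# if the model called a tool, go to "tools"; otherwise finish
builder.add_conditional_edges("assistant", tools_condition)
builder.add_edge("tools", "assistant")  # observe: tool output goes back to the model
graph = builder.compile()
```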
Let's build an agent using a general ReAct architecture with Memory
- Act - let the model call specific tools
- Observe - pass the tool output back to the model
- Reason - let the model reason about the tool output to decide what to do next (e.g., call another tool or just respond directly)
We will extend our agent by introducing memory.
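A minimal sketch using the in-memory checkpointer; the `thread_id` value is arbitrary:

```python
from langgraph.checkpoint.memory import MemorySaver

# the checkpointer saves graph state after each step, keyed by thread_id
graph = builder.compile(checkpointer=MemorySaver())

config = {"configurable": {"thread_id": "1"}}
graph.invoke({"messages": [("user", "Hi, my name is Lance.")]}, config)
# a second call on the same thread_id continues the conversation
graph.invoke({"messages": [("user", "What's my name?")]}, config)
```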
The state schema represents the structure and types of data that our graph will use. All nodes are expected to communicate with that schema.
We will use:
- the `TypedDict` class from Python's `typing` module
- `Dataclasses` from Python
- `Pydantic` - a data validation and settings management library using Python type annotations
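A sketch of the same schema expressed three ways (the field names are illustrative); only the Pydantic version actually validates values at runtime:

```python
from dataclasses import dataclass
from typing import Literal, TypedDict
from pydantic import BaseModel, field_validator

class TypedDictState(TypedDict):
    name: str
    mood: Literal["happy", "sad"]  # type hints only; not enforced at runtime

@dataclass
class DataclassState:
    name: str
    mood: Literal["happy", "sad"]

class PydanticState(BaseModel):
    name: str
    mood: str

    @field_validator("mood")
    @classmethod
    def validate_mood(cls, value: str) -> str:
        # Pydantic enforces this check whenever the state is constructed
        if value not in ("happy", "sad"):
            raise ValueError("mood must be 'happy' or 'sad'")
        return value
```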
Reducers specify how state updates are performed on specific keys / channels in the state schema.
We will use:
- the `Annotated` type with a built-in reducer function like `operator.add`
- the `Annotated` type with a custom reducer function like `reduce_list`
- `MessagesState`
- re-writing and removal of messages
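A sketch of the two reducer styles (the key names and `reduce_list` are illustrative):

```python
import operator
from typing import Annotated, TypedDict

def reduce_list(left: list | None, right: list | None) -> list:
    # custom reducer: treat missing values as empty lists, then concatenate
    return (left or []) + (right or [])

class State(TypedDict):
    # without a reducer, a node's return value overwrites the key;
    # with a reducer, updates from different nodes are combined
    foo: Annotated[list[int], operator.add]
    bar: Annotated[list[int], reduce_list]

# MessagesState uses the add_messages reducer: it appends by default,
# re-writes a message when an update carries the same message id, and
# deletes a message when a RemoveMessage(id=...) is returned.
```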
Multiple schemas are needed when:
- Internal nodes may pass information that is not required in the graph's input / output.
- We may also want to use different input / output schemas for the graph.
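A sketch with distinct input / output schemas (the node logic is a placeholder):

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

class InputState(TypedDict):
    question: str

class OutputState(TypedDict):
    answer: str

class OverallState(TypedDict):
    question: str
    answer: str
    notes: str  # internal only: never exposed in input or output

def thinking_node(state: InputState):
    return {"answer": "bye", "notes": "... his name is Lance"}

def answer_node(state: OverallState) -> OutputState:
    return {"answer": "bye Lance"}

builder = StateGraph(OverallState, input=InputState, output=OutputState)
builder.add_node("thinking_node", thinking_node)
builder.add_node("answer_node", answer_node)
builder.add_edge(START, "thinking_node")
builder.add_edge("thinking_node", "answer_node")
builder.add_edge("answer_node", END)
# invoking the compiled graph returns only the OutputState keys
builder.compile().invoke({"question": "hi"})  # -> {'answer': 'bye Lance'}
```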
Filtering and trimming messages using:
- `RemoveMessage` with `MessagesState`
- Filtering messages
- Trimming messages
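A sketch of both techniques; the model name and token budget are illustrative:

```python
from langchain_core.messages import RemoveMessage, trim_messages
from langchain_openai import ChatOpenAI
from langgraph.graph import MessagesState

llm = ChatOpenAI(model="gpt-4o")

def filter_messages(state: MessagesState):
    # delete all but the two most recent messages via RemoveMessage
    delete_messages = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"messages": delete_messages}

def chat_model_node(state: MessagesState):
    # trim the history to a token budget before calling the model
    messages = trim_messages(
        state["messages"],
        max_tokens=100,
        strategy="last",        # keep the most recent messages
        token_counter=llm,
        allow_partial=False,
    )
    return {"messages": [llm.invoke(messages)]}
```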
Let's create a simple Chatbot with conversation summary. We'll equip that Chatbot with memory, supporting long-running conversations.
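A sketch of the summarization node, assuming a `State` that extends `MessagesState` with a `summary` key and an `llm` defined as above:

```python
from langchain_core.messages import HumanMessage, RemoveMessage

def summarize_conversation(state: State):
    summary = state.get("summary", "")
    if summary:
        prompt = f"This is the summary of the conversation to date: {summary}\n\nExtend the summary with the new messages above:"
    else:
        prompt = "Create a summary of the conversation above:"
    response = llm.invoke(state["messages"] + [HumanMessage(content=prompt)])
    # once summarized, keep only the two most recent messages
    delete_messages = [RemoveMessage(id=m.id) for m in state["messages"][:-2]]
    return {"summary": response.content, "messages": delete_messages}
```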
Let's upgrade our Chatbot with conversation summary and external memory (SqliteSaver checkpointer), supporting long-running conversations and chat persistence.
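A sketch of the swap to external memory (the database path is illustrative; requires the `langgraph-checkpoint-sqlite` package):

```python
import sqlite3
from langgraph.checkpoint.sqlite import SqliteSaver

# a file-backed checkpointer, so chat history survives process restarts
conn = sqlite3.connect("state_db/example.db", check_same_thread=False)
memory = SqliteSaver(conn)
graph = builder.compile(checkpointer=memory)
```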
`.stream` and `.astream` are sync and async methods for streaming back results.
- Streaming graph state using the streaming modes 'updates' and 'values'
- `.astream_events` streams events, where each event is a dict with a few keys:
  - `event`: the type of event being emitted
  - `name`: the name of the event
  - `data`: the data associated with the event
  - `metadata`: contains `langgraph_node`, the node emitting the event
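A sketch of the three styles, assuming the chatbot graph above (the `astream_events` portion must run inside an async function):

```python
config = {"configurable": {"thread_id": "1"}}
inputs = {"messages": [("user", "hi!")]}

# 'updates' streams only what each node returned
for chunk in graph.stream(inputs, config, stream_mode="updates"):
    print(chunk)

# 'values' streams the full state after each step
for event in graph.stream(inputs, config, stream_mode="values"):
    event["messages"][-1].pretty_print()

# astream_events emits fine-grained events, e.g. per LLM token
async for event in graph.astream_events(inputs, config, version="v2"):
    if event["event"] == "on_chat_model_stream":
        print(event["data"]["chunk"].content, end="")
```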
For a breakpoint, we simply compile the graph with `interrupt_before=["tools"]`, where `tools` is our tools node. This means that execution will be interrupted before the `tools` node, which executes the tool call.
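A sketch of the approval flow, assuming the ReAct builder from earlier:

```python
from langgraph.checkpoint.memory import MemorySaver

graph = builder.compile(interrupt_before=["tools"], checkpointer=MemorySaver())
thread = {"configurable": {"thread_id": "1"}}

# runs until it hits the breakpoint before "tools"
inputs = {"messages": [("user", "Multiply 2 and 3")]}
for event in graph.stream(inputs, thread, stream_mode="values"):
    event["messages"][-1].pretty_print()

user_approval = input("Do you want to call the tool? (yes/no): ")
if user_approval.lower() == "yes":
    # passing None as input resumes from the saved checkpoint
    for event in graph.stream(None, thread, stream_mode="values"):
        event["messages"][-1].pretty_print()
```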
Using breakpoints to modify the graph state.
This is an internal breakpoint that allows the graph to interrupt itself dynamically!
This has a few specific benefits:
- Can do it conditionally (from inside a node based on developer-defined logic).
- Can communicate to the user why it's interrupted (by passing whatever you want to the `NodeInterrupt`).
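A sketch, assuming a state with an `input` key; the length check stands in for arbitrary developer-defined logic:

```python
from langgraph.errors import NodeInterrupt

def my_node(state: State):
    # dynamic breakpoint: raised conditionally, with a message for the user
    if len(state["input"]) > 5:
        raise NodeInterrupt(f"Received input longer than 5 characters: {state['input']}")
    return state
```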
Time travel in LangGraph supports debugging by viewing, re-playing, and even forking from past states.
We can do this by browsing the thread's checkpoint history with `get_state_history`, re-playing from a chosen checkpoint, and forking by editing a past state with `update_state`.
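A sketch against the ReAct graph above (the choice of checkpoint is illustrative):

```python
thread = {"configurable": {"thread_id": "1"}}

# view: browse every saved checkpoint for the thread
all_states = list(graph.get_state_history(thread))
to_replay = all_states[-2]

# re-play: stream with None, using that checkpoint's config
for event in graph.stream(None, to_replay.config, stream_mode="values"):
    event["messages"][-1].pretty_print()

# fork: edit a past state, creating a new branch of history
fork_config = graph.update_state(
    to_replay.config,
    {"messages": [("user", "Multiply 5 and 3")]},
)
```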
We can execute nodes in parallel (as required by the situation) using:
- Fan-in and Fan-out
- Waiting for other parallel nodes to finish
- Setting the order of the state updates
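A minimal fan-out / fan-in sketch; the `operator.add` reducer is what lets the parallel writes to `state` combine instead of collide:

```python
import operator
from typing import Annotated, TypedDict
from langgraph.graph import StateGraph, START, END

class State(TypedDict):
    # a reducer is required because parallel nodes write to the same key
    state: Annotated[list, operator.add]

def make_node(name: str):
    def node(state: State):
        return {"state": [name]}
    return node

builder = StateGraph(State)
for name in ["a", "b", "c", "d"]:
    builder.add_node(name, make_node(name))
builder.add_edge(START, "a")
builder.add_edge("a", "b")   # fan-out: b and c both follow a
builder.add_edge("a", "c")
builder.add_edge("b", "d")   # fan-in: d waits for b and c to finish
builder.add_edge("c", "d")
builder.add_edge("d", END)
graph = builder.compile()

# updates within the parallel step are applied in a deterministic order
graph.invoke({"state": []})  # -> {'state': ['a', 'b', 'c', 'd']}
```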
Sub-graphs allow you to create and manage different states in different parts of your graph. This is particularly useful for multi-agent systems, with teams of agents that each have their own state.
Let's consider an example:
- I have a system that accepts logs
- It performs two separate sub-tasks by different agents (summarize logs, find failure modes)
- I want to perform these two operations in two different sub-graphs.
The most critical thing to understand is how the graphs communicate!
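A pared-down sketch with a single sub-graph (the log-summarization one); the sub-graph and parent communicate through the overlapping `cleaned_logs` and `summary` keys:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, START, END

# sub-graph with its own state
class SummaryState(TypedDict):
    cleaned_logs: list
    summary: str

def summarize(state: SummaryState):
    return {"summary": f"Summarized {len(state['cleaned_logs'])} logs"}

sub = StateGraph(SummaryState)
sub.add_node("summarize", summarize)
sub.add_edge(START, "summarize")
sub.add_edge("summarize", END)

# parent state: only the overlapping keys are shared with the sub-graph
class ParentState(TypedDict):
    raw_logs: list
    cleaned_logs: list
    summary: str

def clean_logs(state: ParentState):
    return {"cleaned_logs": state["raw_logs"]}

parent = StateGraph(ParentState)
parent.add_node("clean_logs", clean_logs)
parent.add_node("summarization", sub.compile())  # compiled sub-graph as a node
parent.add_edge(START, "clean_logs")
parent.add_edge("clean_logs", "summarization")
parent.add_edge("summarization", END)
graph = parent.compile()
```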
Map-reduce operations are essential for efficient task decomposition and parallel processing.
It has two phases:
- `Map` - Break a task into smaller sub-tasks, processing each sub-task in parallel.
- `Reduce` - Aggregate the results across all of the completed, parallelized sub-tasks.
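A sketch using the `Send` API, with stub functions standing in for LLM calls:

```python
import operator
from typing import Annotated, TypedDict
from langgraph.constants import Send
from langgraph.graph import StateGraph, START, END

class OverallState(TypedDict):
    topic: str
    subjects: list
    jokes: Annotated[list, operator.add]  # reduce: aggregates parallel results

class JokeState(TypedDict):
    subject: str

def generate_subjects(state: OverallState):
    return {"subjects": ["lions", "elephants"]}  # stand-in for an LLM call

def generate_joke(state: JokeState):
    return {"jokes": [f"A joke about {state['subject']}"]}

def continue_to_jokes(state: OverallState):
    # map: dispatch one generate_joke invocation per subject, in parallel
    return [Send("generate_joke", {"subject": s}) for s in state["subjects"]]

builder = StateGraph(OverallState)
builder.add_node("generate_subjects", generate_subjects)
builder.add_node("generate_joke", generate_joke)
builder.add_edge(START, "generate_subjects")
builder.add_conditional_edges("generate_subjects", continue_to_jokes, ["generate_joke"])
builder.add_edge("generate_joke", END)
graph = builder.compile()
```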
Research is often laborious work offloaded to analysts. AI has considerable potential to assist with this.
However, research demands customization: raw LLM outputs are often poorly suited for real-world decision-making workflows.
Customized, AI-based research and report generation workflows are a promising way to address this.
Let's build a chatbot that uses both short-term (within-thread) and long-term (across-thread) memory.
We'll focus on long-term memory, which will be facts about the user. These long-term memories will be used to create a personalized chatbot that can remember facts about the user.
It will save memories as the user chats with it.
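A sketch of the across-thread store alongside the within-thread checkpointer (the namespace layout and memory contents are illustrative):

```python
import uuid
from langgraph.checkpoint.memory import MemorySaver
from langgraph.graph import MessagesState
from langgraph.store.base import BaseStore
from langgraph.store.memory import InMemoryStore

store = InMemoryStore()

def call_model(state: MessagesState, config, *, store: BaseStore):
    user_id = config["configurable"]["user_id"]
    namespace = (user_id, "memories")
    memories = store.search(namespace)  # read long-term memories
    # ... include memories in the system prompt, call the model ...
    store.put(namespace, str(uuid.uuid4()), {"memory": "User likes pizza"})
    return {"messages": []}  # placeholder; a real node returns the model reply

# checkpointer = short-term (within-thread), store = long-term (across-thread)
graph = builder.compile(checkpointer=MemorySaver(), store=store)
```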
Our chatbot saved memories as a string. In practice, we often want memories to have a structure.
In our case, we want this to be a single user profile. We'll extend our chatbot to save semantic memories to a single user profile.
We'll also use a library, `Trustcall`, to update this schema with new information.
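A sketch of how Trustcall might be wired up for this, based on its documented `create_extractor` API (the profile schema and sample data are illustrative):

```python
from pydantic import BaseModel, Field
from trustcall import create_extractor
from langchain_openai import ChatOpenAI

class UserProfile(BaseModel):
    """Profile of the user."""
    name: str | None = Field(None, description="The user's name")
    interests: list[str] = Field(default_factory=list)

llm = ChatOpenAI(model="gpt-4o")
extractor = create_extractor(llm, tools=[UserProfile], tool_choice="UserProfile")

# pass the existing profile so Trustcall patches it instead of starting over
existing_profile = {"name": "Lance", "interests": []}  # e.g. loaded from the store
result = extractor.invoke({
    "messages": [("user", "Hi, I'm Lance and I like biking.")],
    "existing": {"UserProfile": existing_profile},
})
updated_profile = result["responses"][0]
```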
Sometimes we want to save memories to a collection rather than a single profile. Let's update our chatbot to save memories to a collection.
We'll also show how to use `Trustcall` to update this collection.
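A sketch continuing from the profile example; `enable_inserts` and the `(id, tool_name, value)` format for existing items follow Trustcall's documented collection-update pattern, while `store`, `namespace`, and `messages` are assumed from earlier sketches:

```python
from pydantic import BaseModel, Field
from trustcall import create_extractor

class Memory(BaseModel):
    """A single memory about the user."""
    content: str = Field(description="The memory, e.g. 'User likes biking.'")

extractor = create_extractor(
    llm,                  # the chat model from earlier
    tools=[Memory],
    tool_choice="Memory",
    enable_inserts=True,  # allow brand-new items, not just patches
)

# existing items go in as (id, tool_name, value) tuples so Trustcall
# can patch them or insert new memories alongside
existing = [(item.key, "Memory", item.value) for item in store.search(namespace)]
result = extractor.invoke({"messages": messages, "existing": existing})
```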
Let's pull together the pieces learned to build an agent with long-term memory.
The following information should be provided to create a LangGraph Platform deployment:
- A LangGraph API configuration file - `langgraph.json`
- The graphs that implement the logic of the application - e.g., `task_maistro.py`
- A file that specifies the dependencies required to run the application - `requirements.txt`
- Environment variables needed for the application to run - `.env` or `docker-compose.yml`
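For example, a `langgraph.json` along these lines (the graph name and file paths are illustrative):

```json
{
  "dependencies": ["."],
  "graphs": {
    "task_maistro": "./task_maistro.py:graph"
  },
  "env": ".env"
}
```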
We can access the deployment through:
- Docs: http://localhost:8123/docs
- LangGraph Studio: https://smith.langchain.com/studio/?baseUrl=http://127.0.0.1:8123
LangGraph Server exposes many API endpoints for interacting with the deployed agent. These endpoints are grouped into a few common agent needs:
- Runs: Atomic agent executions
- Threads: Multi-turn interactions or human-in-the-loop
- Store: Long-term memory
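A sketch using the `langgraph_sdk` client against the local deployment; the assistant name follows the `langgraph.json` example above, and the store method signature is my reading of the SDK, so treat the details as assumptions:

```python
from langgraph_sdk import get_client

client = get_client(url="http://localhost:8123")

async def demo():
    # Threads: create a container for multi-turn state
    thread = await client.threads.create()

    # Runs: execute the agent once on that thread
    run = await client.runs.create(
        thread["thread_id"],
        "task_maistro",  # graph name from langgraph.json
        input={"messages": [{"role": "user", "content": "Add a task."}]},
    )
    await client.runs.join(thread["thread_id"], run["run_id"])

    # Store: read/write long-term memory across threads
    await client.store.put_item(("users", "lance"), key="profile", value={"likes": "biking"})
```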
Seamless handling of double texting is important for handling real-world usage scenarios, especially in chat applications.
Users can send multiple messages in a row before the prior run(s) complete, and we want to ensure that we handle this gracefully.
We can follow the approaches below to handle the different scenarios:
- Reject: A simple approach is to reject any new runs until the current run completes.
- Enqueue: Queue any new runs until the current run completes.
- Interrupt: Interrupt the current run, but save all of the work that has been done so far up to that point.
- Rollback: Interrupt the prior run of the graph, delete it, and start a new run with the double-texted input.
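Continuing the SDK sketch above, the strategy is selected per run via the `multitask_strategy` parameter:

```python
# the multitask_strategy on a new run decides what happens to the prior run
run = await client.runs.create(
    thread["thread_id"],
    "task_maistro",
    input={"messages": [{"role": "user", "content": "Actually, never mind!"}]},
    multitask_strategy="interrupt",  # or "reject", "enqueue", "rollback"
)
```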