Multi-agent workflow using LangGraph with Llama3 via Ollama takes far longer per user query than running the agents on an individual basis. #23679
Unanswered · Ikshan-Tango asked this question in Q&A · Replies: 1 comment 1 reply
@Ikshan-Tango Hello! I'm here to assist you with any bugs, questions, or contributions you might have. To optimize the response time for subsequent queries in your multi-agent workflow using LangGraph for Llama3 with Ollama, you can consider the following strategies:

- Cache LLM responses so repeated prompts are answered without re-running the model.
- Use the `keep_alive` parameter so Ollama keeps the model loaded in memory between calls instead of reloading it for each query.

Here is an example of how to set these up:

```python
from langchain_community.llms import Ollama
from langchain_community.cache import InMemoryCache
from langchain.globals import set_llm_cache

# Initialize the cache
set_llm_cache(InMemoryCache())

# Initialize the LLM with the keep_alive parameter
ollama = Ollama(
    model="llama3",
    temperature=0,
    num_gpu=1,
    top_k=20,
    top_p=0.6,
    keep_alive="10m",  # Keep the model loaded for 10 minutes
    verbose=True,
)

# Example prompt
prompt = "Tell me a joke"

# First call, not yet in cache
response = ollama.invoke(prompt)
print(response)

# Second call, should be faster as it is cached
response = ollama.invoke(prompt)
print(response)
```

By implementing these strategies, you can optimize the response time for subsequent queries in your multi-agent workflow using LangGraph for Llama3 with Ollama [1][2][3][4].
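The effect of response caching can be illustrated with a plain-Python sketch that needs no running Ollama server. Here `slow_model_call` is a hypothetical stand-in for the real LLM call, and the dictionary plays the role of an in-memory cache keyed on the exact prompt:

```python
import time

def slow_model_call(prompt: str) -> str:
    # Hypothetical stand-in for real LLM inference; sleep mimics latency.
    time.sleep(0.2)
    return f"response to: {prompt}"

_cache: dict[str, str] = {}

def cached_call(prompt: str) -> str:
    # Serve from the cache when this exact prompt was seen before,
    # mirroring what an in-memory LLM cache does for identical prompts.
    if prompt not in _cache:
        _cache[prompt] = slow_model_call(prompt)
    return _cache[prompt]

start = time.perf_counter()
cached_call("Tell me a joke")  # first call: pays the full latency
first = time.perf_counter() - start

start = time.perf_counter()
cached_call("Tell me a joke")  # second call: served from the cache
second = time.perf_counter() - start

print(f"first={first:.3f}s second={second:.3f}s")
```

Note that a prompt cache only helps for exact-duplicate prompts; for new prompts it is `keep_alive` that avoids the cost of reloading the model.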
Checked other resources
Commit to Help
Example Code
OUTPUT :-
Description
PROBLEM STATEMENT
I'm trying to create a multi-agent workflow using LangGraph. When I run the agents individually (one script per agent), the first query takes around 15-20 seconds, but subsequent queries are answered within 3-5 seconds. This is the case for all of my agents.
Now that I've created a LangGraph workflow containing a supervisor agent that routes each query to the correct agent, every query takes more than 100 seconds to produce a response, which should not be the case. I am willing to accept a much higher response time for the first query, but for subsequent queries I need the latency to be as low as possible.
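One common cause of this latency pattern (an assumption, not confirmed from the code in this question) is that each agent node constructs its own LLM client, so every hop through the graph pays the model-load cost again. A minimal stdlib-only sketch of the alternative, sharing one client across all agents, with hypothetical agent names and a trivial routing rule standing in for the supervisor:

```python
class SharedLLM:
    # Hypothetical model client; tracks how often the expensive load runs.
    def __init__(self):
        self.loads = 0
        self._loaded = False

    def invoke(self, prompt: str) -> str:
        if not self._loaded:  # expensive model load happens only once
            self.loads += 1
            self._loaded = True
        return f"answer: {prompt}"

def make_agent(name: str, llm: SharedLLM):
    # Every agent closes over the SAME client instead of building its own.
    def agent(query: str) -> str:
        return f"[{name}] {llm.invoke(query)}"
    return agent

llm = SharedLLM()  # constructed once, reused by every node
agents = {
    "research": make_agent("research", llm),
    "coding": make_agent("coding", llm),
}

def supervisor(query: str) -> str:
    # Trivial keyword routing standing in for the supervisor agent.
    route = "coding" if "code" in query else "research"
    return agents[route](query)

print(supervisor("write some code"))
print(supervisor("find a paper"))
print("model loads:", llm.loads)
```

The design point is simply that the model-load counter stays at 1 no matter how many agents handle queries; if each `make_agent` call created its own `SharedLLM`, it would grow with every route.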
Directory structure :-
Important things to note :-
Example Agent Code
Graph Workflow :-
nodes.py file
state.py
Workflow output when executed :-
I believe my PC specs are capable enough to run the model, so it should not be a hardware issue but rather a software one.
Hardware Specs
System Info
System Information
Package Information
Packages not installed (Not Necessarily a Problem)
The following packages were not found: