Letting an agent view an image returned from a tool -- what format to use? #25881

Answered by drop-table-bugs
oe-andreas asked this question in Q&A

@oe-andreas I had a similar issue while using create_react_agent() from langgraph/prebuilt.
I found that in my case create_react_agent() creates a graph with a ToolNode (which is responsible for calling tools: https://langchain-ai.github.io/langgraph/reference/prebuilt/#toolnode), and that this ToolNode converts the content of the message returned by the tool call to str.

langgraph/prebuilt/tool_node.py

As you can see, the content of the ToolMessage is converted to str. If your .content was a dict describing an image, it gets converted to str with json.dumps(), and as a result the LLM/chat model treats this ToolMessage as a plain text reply.
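To make the effect concrete, here is a minimal sketch in plain Python, with no langgraph import. It assumes the tool returns an OpenAI-style image content block (the `image_url` shape is an assumption about your tool's output), and `stringify_like_tool_node` is a hypothetical stand-in for the conversion described above, not the actual ToolNode code.

```python
import base64
import json


def image_block(image_bytes: bytes) -> dict:
    """Build an OpenAI-style image content block (assumed format)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:image/png;base64,{encoded}"},
    }


def stringify_like_tool_node(content) -> str:
    """Hypothetical stand-in for the str conversion described above."""
    # Strings pass through; anything structured is serialized to JSON text.
    return content if isinstance(content, str) else json.dumps(content)


# Placeholder bytes standing in for a real PNG returned by a tool.
fake_png = b"\x89PNG\r\n\x1a\n"

structured = [image_block(fake_png)]               # what the tool produced
flattened = stringify_like_tool_node(structured)   # what the model would receive

print(type(structured).__name__)
print(type(flattened).__name__)

# The data is still present in the string, but only as JSON text.
recovered = json.loads(flattened)
```

Once the content is a str, a chat model has no structured image block to interpret; it only sees the JSON (including the base64 payload) as ordinary text.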

I fixed that by literally copying the t…

Replies: 1 comment 14 replies

Answer selected by oe-andreas
6 participants