[Feature] Agent output parser #984

lucagobbi · 2024-12-08T14:39:01Z

Description

I'd like to propose a new hook, agent_output_parser, which lets developers replace the default output parser (StrOutputParser) used by the Cat agent.

Similar to rabbithole_instantiates_splitter, when multiple plugins implement this hook, the last parser defined takes priority.
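To make the "last one wins" behaviour concrete, here is a minimal, framework-free sketch. It assumes hooks are run as a chain where each hook receives the previous hook's return value; `execute_hook` and the two plugin functions are hypothetical names for illustration, not the Cat's actual implementation.

```python
from typing import Any, Callable, List

# Hypothetical stand-in for a plugin hook chain (NOT the Cat's real code).
Hook = Callable[[Any, Any], Any]


def execute_hook(hooks: List[Hook], default: Any, cat: Any = None) -> Any:
    """Pipe a value through every hook in order; each hook sees the previous result."""
    value = default
    for hook_fn in hooks:
        value = hook_fn(value, cat)
    return value


def plugin_a_parser(current_parser, cat):
    # Ignores whatever came before and returns its own parser.
    return "parser_from_plugin_a"


def plugin_b_parser(current_parser, cat):
    # Runs later, so its return value is the one the agent ends up with.
    return "parser_from_plugin_b"


chosen = execute_hook([plugin_a_parser, plugin_b_parser], default="StrOutputParser")
fallback = execute_hook([], default="StrOutputParser")
print(chosen, fallback)
```

With no plugin implementing the hook, the default is returned unchanged, which is why the feature would be non-breaking under this assumption.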

Why?

Currently, the agent’s output is a simple string, which is limiting for certain use cases. Some would argue that it is possible to enrich CatMessage using the before_cat_sends_message hook, but that approach requires a fair amount of custom code and, for structured outputs, may involve an additional LLM call, which increases latency and cost.

Agents are often used beyond conversational contexts, especially in deterministic workflows where structured outputs are central, as we've seen recently with frameworks like PydanticAI.

Example of Usage

from cat.mad_hatter.decorators import hook
from langchain_core.output_parsers.pydantic import PydanticOutputParser
from pydantic import BaseModel
from typing import Optional
from enum import Enum


class CustomerMood(str, Enum):
    happy = "happy"
    neutral = "neutral"
    frustrated = "frustrated"


class SuggestedAction(str, Enum):
    transfer_to_human = "transfer_to_agent"
    get_more_info = "get_more_info"
    close_ticket = "close_ticket"


class CustomerSupportResponse(BaseModel):
    answer: str
    mood: CustomerMood
    escalation_needed: bool
    suggested_action: Optional[SuggestedAction]


@hook
def agent_output_parser(agent_output_parser, cat):
    return PydanticOutputParser(pydantic_object=CustomerSupportResponse)


@hook
def agent_prompt_prefix(prefix, cat):
    prefix = """You are a customer support agent. Answer the customer's question in a friendly, empathetic way."""
    return prefix


@hook
def agent_prompt_suffix(prompt_suffix, cat):
    prompt_suffix += "\n\n{format_instructions}"
    return prompt_suffix


@hook
def before_agent_starts(agent_input, cat):
    agent_input["format_instructions"] = PydanticOutputParser(pydantic_object=CustomerSupportResponse).get_format_instructions()
    return agent_input


@hook
def before_cat_sends_message(message, cat):
    message.content = str(message.content)
    return message

As you can see, agent_output_parser does not work by itself: you also need to give the agent instructions on how to format the output of the chain.

However, with just a few lines of code you can build a support agent with structured outputs to be used in your own application flow.
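As a rough illustration of what "structured output in your own application flow" could look like, the sketch below parses a JSON string into a typed object and routes on it. It is a stdlib-only stand-in for the validation that PydanticOutputParser would perform, not the parser itself; `SupportResponse`, `parse_support_response`, `route`, the queue names, and the sample LLM output are all hypothetical. The enum values are copied from the example above.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CustomerMood(str, Enum):
    happy = "happy"
    neutral = "neutral"
    frustrated = "frustrated"


class SuggestedAction(str, Enum):
    transfer_to_human = "transfer_to_agent"
    get_more_info = "get_more_info"
    close_ticket = "close_ticket"


@dataclass
class SupportResponse:
    answer: str
    mood: CustomerMood
    escalation_needed: bool
    suggested_action: Optional[SuggestedAction] = None


def parse_support_response(raw: str) -> SupportResponse:
    """Validate the model's JSON text; raises ValueError on unexpected values."""
    data = json.loads(raw)
    if not isinstance(data["escalation_needed"], bool):
        raise ValueError("escalation_needed must be a JSON boolean")
    action = data.get("suggested_action")
    return SupportResponse(
        answer=str(data["answer"]),
        mood=CustomerMood(data["mood"]),
        escalation_needed=data["escalation_needed"],
        suggested_action=SuggestedAction(action) if action is not None else None,
    )


def route(response: SupportResponse) -> str:
    """Hypothetical downstream logic that branches on fields, not on free text."""
    if response.escalation_needed or response.suggested_action is SuggestedAction.transfer_to_human:
        return "human_queue"
    if response.suggested_action is SuggestedAction.close_ticket:
        return "closed"
    return "bot_queue"


# Made-up example of what the LLM might return when given format instructions.
raw_llm_output = (
    '{"answer": "Sorry about that, let me connect you with a colleague.", '
    '"mood": "frustrated", "escalation_needed": true, '
    '"suggested_action": "transfer_to_agent"}'
)
parsed = parse_support_response(raw_llm_output)
print(route(parsed))
```

The point of the sketch is that the calling application can branch on typed fields directly, without a second LLM call to extract them from prose.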

Doubts

The agent_output_parser hook works within the MemoryAgent flow. Some would argue that MemoryAgent should strictly handle memory-related tasks, but since it is the final step in our flow, it is a logical place to apply this hook.
I know it's time to pause a little and build more instead of overthinking and accumulating too many features. This is why the PR is a draft. Let's see if it gathers interest.

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to not work as expected)
This change requires a documentation update

Checklist:

My code follows the style guidelines of this project
I have performed a self-review of my own code
I have commented my code, particularly in hard-to-understand areas

pieroit · 2024-12-08T15:07:07Z

Thank you for thinking so carefully about this.
Here are a few observations:

more and more standards related to LLM agents are emerging (chat completions, MCP, and JSON output are a few examples)
there are many libs creating nice primitives for agents (langchain, crewai, autogen, pydanticai)
no agent flow is going to satisfy every need; the combinatorics of this are astronomical

In the use case you describe, I see more of a need for a custom endpoint or a custom agent (and agent router) than for a custom parser.

I propose to focus on giving modular and independent primitives to build custom agents instead of plugging more hooks in the default cat agent.

For example, MemoryAgent will not improve by adding more hooks, but by improving the recall function and the prompt-filling methods, so that with a few lines of code you can write the full agent, able to retrieve memories and fill the prompt however you want, without touching 3 or 4 hooks.
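A purely illustrative sketch of that idea: a custom agent composed from two small, independent primitives instead of several hooks. Every name here (`recall`, `fill_prompt`, `run_agent`, the template, the fake LLM) is hypothetical and does not correspond to any existing Cat API; the keyword-overlap recall is a toy stand-in for a real vector search.

```python
from typing import Callable, List


def recall(query: str, memories: List[str], k: int = 2) -> List[str]:
    """Toy recall: rank memories by shared lowercase words with the query."""
    query_words = set(query.lower().split())
    scored = [(len(query_words & set(m.lower().split())), m) for m in memories]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [memory for score, memory in ranked[:k] if score > 0]


def fill_prompt(template: str, **slots: str) -> str:
    """Toy prompt filling: plain str.format over named slots."""
    return template.format(**slots)


def run_agent(question: str, memories: List[str], llm: Callable[[str], str]) -> str:
    """The 'full agent' in a few lines: recall, fill the prompt, call the model."""
    context = "\n".join(recall(question, memories))
    prompt = fill_prompt(
        "Context:\n{context}\n\nQuestion: {question}",
        context=context,
        question=question,
    )
    return llm(prompt)


memories = [
    "The refund policy lasts 30 days",
    "Our office is in Milan",
    "Refund requests need an order number",
]
# A fake LLM that echoes its prompt, so the composed prompt can be inspected.
answer = run_agent("refund policy", memories, llm=lambda prompt: prompt)
print(answer)
```

Under this design, swapping the output format would mean passing a different `llm` callable or post-processing step to the custom agent, rather than registering another hook on the default one.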

Let's see if other contributors can enrich the discussion further. Thanks again, @lucagobbi.

lucagobbi added 2 commits December 7, 2024 18:38

agent output parser hook

32d6219

added docstring to agent_output_parser hook

5939027

lucagobbi added the agent (Related to cat agent (reasoner / prompt engine)) and hooks (Make the cat extensible via plugins hooks) labels Dec 8, 2024
