[Feature] Agent output parser #984

Draft: wants to merge 2 commits into base: develop

lucagobbi
Collaborator

Description

I'd like to propose this new hook agent_output_parser, which allows developers to replace the default output parser (StrOutputParser) used by the Cat agent.

Similar to rabbithole_instantiates_splitter, this hook prioritizes the last parser defined if multiple plugins use the same hook.
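As a rough illustration of the "last parser wins" behaviour described above, here is a minimal sketch. It assumes hooks are run as a chain in which each hook receives the previous hook's return value; `run_hook_chain` and the three parser classes are hypothetical stand-ins, not the Cat's actual internals.

```python
from typing import Any, Callable, List


# Hypothetical stand-ins for real output parsers (illustrative only).
class DefaultStrParser:
    def parse(self, text: str) -> str:
        return text


class UppercaseParser:
    def parse(self, text: str) -> str:
        return text.upper()


class WordListParser:
    def parse(self, text: str) -> list:
        return text.split()


def run_hook_chain(hooks: List[Callable[[Any], Any]], default: Any) -> Any:
    """Pipe the default value through every hook, in order.

    Each hook receives the value returned by the previous one, so when
    every hook returns a brand-new parser the last hook's parser is used.
    """
    value = default
    for hook_fn in hooks:
        value = hook_fn(value)
    return value


# Two plugins implement the same hook; both ignore the incoming parser.
def plugin_a_hook(parser):
    return UppercaseParser()


def plugin_b_hook(parser):
    return WordListParser()


parser = run_hook_chain([plugin_a_hook, plugin_b_hook], DefaultStrParser())
result = parser.parse("hello cat")
```

With no plugin hooks registered, the chain simply returns the default parser, which would match the current string-only behaviour.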

Why?

Currently, the agent’s output is a simple string, which is limiting for certain use cases. Some would argue that it is possible to enrich CatMessage using the before_cat_sends_message hook, but this approach requires a fair amount of custom code and, for structured outputs, may involve an additional LLM call, leading to increased latency and costs.

Agents are often used beyond conversational contexts, especially in deterministic workflows where structured outputs are central, as we've seen recently with frameworks like PydanticAI.

Example of Usage

from cat.mad_hatter.decorators import hook
from langchain_core.output_parsers.pydantic import PydanticOutputParser
from pydantic import BaseModel
from typing import Optional
from enum import Enum


class CustomerMood(str, Enum):
    happy = "happy"
    neutral = "neutral"
    frustrated = "frustrated"


class SuggestedAction(str, Enum):
    transfer_to_human = "transfer_to_human"
    get_more_info = "get_more_info"
    close_ticket = "close_ticket"


class CustomerSupportResponse(BaseModel):
    answer: str
    mood: CustomerMood
    escalation_needed: bool
    suggested_action: Optional[SuggestedAction] = None


@hook
def agent_output_parser(agent_output_parser, cat):
    return PydanticOutputParser(pydantic_object=CustomerSupportResponse)


@hook
def agent_prompt_prefix(prefix, cat):
    prefix = """You are a customer support agent. Answer the customer's question in a friendly, empathetic way."""
    return prefix


@hook
def agent_prompt_suffix(prompt_suffix, cat):
    prompt_suffix += "\n\n{format_instructions}"
    return prompt_suffix


@hook
def before_agent_starts(agent_input, cat):
    agent_input["format_instructions"] = PydanticOutputParser(pydantic_object=CustomerSupportResponse).get_format_instructions()
    return agent_input


@hook
def before_cat_sends_message(message, cat):
    message.content = str(message.content)
    return message

As you can see, the agent_output_parser hook does not work by itself: you also need to give the agent instructions on how to format the output of the chain (here through agent_prompt_suffix and before_agent_starts).

However, with just a few lines of code you can build a support agent with structured outputs to be used in your own application flow.
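To make "used in your own application flow" a bit more concrete, here is a minimal sketch of how calling code might branch on the structured reply. It uses only the standard library: `SupportReply`, `parse_reply` and `route` are hypothetical names standing in for the Pydantic model and parser above, and the routing rules are invented for illustration.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class SupportReply:
    # Stdlib stand-in for CustomerSupportResponse (assumption, not the PR's code).
    answer: str
    mood: str
    escalation_needed: bool
    suggested_action: Optional[str] = None


def parse_reply(raw: str) -> SupportReply:
    """Turn the LLM's JSON text into a typed object; raises on missing keys."""
    data = json.loads(raw)
    return SupportReply(
        answer=data["answer"],
        mood=data["mood"],
        escalation_needed=bool(data["escalation_needed"]),
        suggested_action=data.get("suggested_action"),
    )


def route(reply: SupportReply) -> str:
    """Hypothetical routing rules: pick the next step of the workflow."""
    if reply.escalation_needed or reply.mood == "frustrated":
        return "handoff_queue"
    if reply.suggested_action == "close_ticket":
        return "close_ticket"
    return "continue_chat"


raw_output = (
    '{"answer": "Your refund is on its way.", "mood": "happy", '
    '"escalation_needed": false, "suggested_action": "close_ticket"}'
)
reply = parse_reply(raw_output)
next_step = route(reply)
```

The point of the sketch is that the branching needs no second LLM call: every field the workflow depends on arrives in the agent's single structured reply.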

Doubts

  1. The agent_output_parser hook works within the MemoryAgent flow. Some would argue that MemoryAgent should strictly handle memory-related tasks, but since it is the final step in the flow, it is a logical place to apply this hook.

  2. I know it's time to pause a little and build more, rather than overthink and accumulate too many features. That is why this is a draft PR. Let's see if it gathers interest.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas

@lucagobbi added the agent and hooks labels on Dec 8, 2024
@pieroit
Member

pieroit commented Dec 8, 2024

Thank you for thinking so carefully about this.
Here are a few observations:

  • more and more standards related to LLM agents are emerging (chat completions, MCP and JSON output are a few examples)
  • there are many libs creating nice primitives for agents (langchain, crewai, autogen, pydanticai)
  • no agent flow is going to satisfy every need; the number of possible combinations is astronomical

In the use case you describe, I see the need for a custom endpoint or a custom agent (and agent router) more than the need for a custom parser.

I propose to focus on giving modular and independent primitives to build custom agents instead of plugging more hooks in the default cat agent.

For example, MemoryAgent will not improve by adding more hooks, but by improving the recall function and the prompt-filling methods, so that with a few lines of code you can write the full agent, retrieving and filling the prompt however you want, without touching 3 or 4 hooks.
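A minimal sketch of what "a full agent in a few lines" could look like if recall and prompt filling were exposed as independent primitives. Everything here is hypothetical: `recall`, `fill_prompt` and `custom_agent` are illustrative names, the recall is a toy word-overlap ranking rather than a vector search, and the LLM is a stub.

```python
import json
from typing import Any, Callable, List


def recall(query: str, memory: List[str], k: int = 2) -> List[str]:
    """Toy recall primitive: rank memories by words shared with the query."""
    query_words = set(query.lower().split())
    ranked = sorted(
        memory,
        key=lambda m: len(query_words & set(m.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def fill_prompt(template: str, **slots: str) -> str:
    """Prompt-filling primitive: substitute named slots into a template."""
    return template.format(**slots)


def custom_agent(
    query: str,
    memory: List[str],
    llm: Callable[[str], str],
    parse: Callable[[str], Any] = str,
) -> Any:
    """A whole agent composed from the primitives, with the parser as a plain argument."""
    context = "\n".join(recall(query, memory))
    prompt = fill_prompt(
        "Context:\n{context}\n\nQuestion: {question}",
        context=context,
        question=query,
    )
    return parse(llm(prompt))


memory = [
    "refunds take five days",
    "the office is in Rome",
    "refunds need a receipt",
]


def stub_llm(prompt: str) -> str:
    # Stand-in for a real model call: always returns the same JSON reply.
    return '{"answer": "five days"}'


output = custom_agent("how long do refunds take", memory, stub_llm, parse=json.loads)
```

In this shape the output parser is just a function argument of the custom agent, so no dedicated hook is needed to swap it.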

Let's see if other contributors can enrich the discussion further. Thanks again, @lucagobbi

Labels
agent: Related to cat agent (reasoner / prompt engine)
hooks: Make the cat extensible via plugins hooks

2 participants