Retry Handling for 429 Errors in Pydantic AI Agent Tools Not Working #928

Open
iacobellisdylog opened this issue Feb 14, 2025 · 1 comment
Labels: Feature request

Comments

@iacobellisdylog

Hello!
I am attempting to handle 429 Too Many Requests errors when executing tools in a pydantic-ai Agent using the tenacity library for retries. However, the retry mechanism only works when applied to the function that runs the entire agent (e.g., run_agent). This approach is not ideal because restarting the agent from scratch for each 429 error is inefficient.

When I attempt to apply the @retry decorator to individual tools (such as fetch_website_content and analyze_competition), the retry logic does not trigger as expected. Instead, the agent fails immediately when encountering a 429 error within a tool.

Code Snippets

  1. Retry Decorator Implementation

    from tenacity import retry, stop_after_attempt, wait_fixed, retry_if_exception, before_sleep_log
    import logging
    
    MAX_RETRY_ATTEMPTS = 3
    RETRY_WAIT_SECONDS = 60
    RETRY_STATUS_CODES = {429, 503, 504, 500}
    
    logger = logging.getLogger(__name__)
    
    def is_retryable_exception(exception):
        """Return True if the exception looks like a retryable HTTP failure."""
        # Preferred check: a status code on the attached response (e.g. httpx.HTTPStatusError).
        if hasattr(exception, "response") and getattr(exception.response, "status_code", None) in RETRY_STATUS_CODES:
            logger.warning("Retrying due to status code...")
            return True
        # Fallback: look for a retryable status code anywhere in the error message.
        elif any(str(code) in str(exception) for code in RETRY_STATUS_CODES):
            logger.warning("Retrying due to matching error message...")
            return True
        return False

  2. Applying Retry to Tools

    import httpx
    from pydantic_ai import RunContext

    # swot_agent and SwotAgentDeps are defined elsewhere in the project (not shown here).
    @retry(
        stop=stop_after_attempt(MAX_RETRY_ATTEMPTS),
        wait=wait_fixed(RETRY_WAIT_SECONDS),
        retry=retry_if_exception(is_retryable_exception),
        reraise=True,
        before_sleep=before_sleep_log(logger, logging.WARNING),
    )
    @swot_agent.tool(retries=3)
    async def fetch_website_content(_ctx: RunContext[SwotAgentDeps], url: str) -> str:
        """Fetches the HTML content of the given URL."""
        async with httpx.AsyncClient(follow_redirects=True) as http_client:
            try:
                response = await http_client.get(url)
                # Raises httpx.HTTPStatusError on 4xx/5xx responses, including 429.
                response.raise_for_status()
                return response.text
            except httpx.HTTPError as e:
                logger.info("Request failed: %s", e)
                raise  # Re-raise with the original traceback.

  3. Retry Works Only on the Agent Run

    @retry(
        stop=stop_after_attempt(MAX_RETRY_ATTEMPTS),
        wait=wait_fixed(RETRY_WAIT_SECONDS),
        retry=retry_if_exception(is_retryable_exception),
        reraise=True,
        before_sleep=before_sleep_log(logger, logging.WARNING),
    )
    async def run_agent(url: str = ANALYZE_URL, deps: SwotAgentDeps = SwotAgentDeps()) -> SwotAnalysis:
        """Runs the SWOT analysis agent."""
        try:
            result = await swot_agent.run(
                f"Perform a comprehensive SWOT analysis for this product: {url}",
                deps=deps,
            )
            return result.data
        except Exception:
            # logger.exception records the traceback; re-raise so tenacity can retry.
            logger.exception("Error during agent run")
            raise

Questions

  1. Does pydantic-ai handle exceptions raised within tools in a way that interferes with tenacity’s retry mechanism?
  2. Is there a recommended way to handle retries for individual tool executions without restarting the whole agent?
  3. Are there any known limitations or interactions between pydantic-ai's tool execution and external retry decorators like tenacity?
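
On questions 1 and 3, one possible explanation (an assumption, not something confirmed in this thread) is decorator order. If a tool decorator registers the function it receives and returns it unchanged, a retry decorator stacked above it only wraps the module-level name, while the agent keeps calling the unwrapped function. The sketch below reproduces that effect with a hypothetical `ToyAgent` registry and a hand-rolled `simple_retry` decorator; neither is pydantic-ai or tenacity API.

```python
import asyncio
import functools


class ToyAgent:
    """Hypothetical stand-in for an agent: stores tools by name."""

    def __init__(self):
        self.tools = {}

    def tool(self, func):
        # Assumption: register the function as received and return it unchanged.
        self.tools[func.__name__] = func
        return func


def simple_retry(attempts):
    """Hand-rolled stand-in for a retry decorator (not tenacity)."""

    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return await func(*args, **kwargs)
                except RuntimeError:
                    if attempt == attempts:
                        raise

        return wrapper

    return decorator


agent = ToyAgent()
calls = {"retry_outside": 0, "retry_inside": 0}


async def fail_twice(name):
    """Raise a fake 429 on the first two calls for `name`, then succeed."""
    calls[name] += 1
    if calls[name] <= 2:
        raise RuntimeError("429 Too Many Requests")
    return "ok"


# Order used in the issue: retry on top, so the agent registers the raw function.
@simple_retry(attempts=3)
@agent.tool
async def retry_outside():
    return await fail_twice("retry_outside")


# Swapped order: the agent registers the retry-wrapped function.
@agent.tool
@simple_retry(attempts=3)
async def retry_inside():
    return await fail_twice("retry_inside")


async def call_registered(name):
    """Call a tool the way the toy agent would: through its registry."""
    try:
        return await agent.tools[name]()
    except RuntimeError as exc:
        return f"failed: {exc}"


outside_result = asyncio.run(call_registered("retry_outside"))
inside_result = asyncio.run(call_registered("retry_inside"))
print(outside_result)
print(inside_result)
```

If the real tool decorator behaves like `ToyAgent.tool`, placing `@retry(...)` directly above the `async def` and the tool decorator above that would register the retrying wrapper instead. That is worth verifying against the installed pydantic-ai version before relying on it.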

Additional Context

  • The agent is built using pydantic-ai with a VertexAIModel backend.
  • Tools are defined using @swot_agent.tool() and are expected to handle HTTP requests and external API calls.
  • The issue occurs when executing tools that make requests to external services like Gemini AI or HTTP endpoints.
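
For question 2, one option that does not depend on how the agent treats decorators is to retry only the HTTP call inside the tool body. The following is a minimal sketch using only the standard library; `call_with_retry`, `FakeHTTPError`, and the delay values are hypothetical names and settings chosen for illustration, with `FakeHTTPError` standing in for an HTTP client error that carries a status code.

```python
import asyncio

RETRY_STATUS_CODES = {429, 500, 503, 504}


class FakeHTTPError(Exception):
    """Hypothetical error carrying a status code (stand-in for a client library error)."""

    def __init__(self, status_code):
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


async def call_with_retry(operation, attempts=3, base_delay=0.01):
    """Await `operation()` up to `attempts` times, backing off after retryable failures."""
    for attempt in range(1, attempts + 1):
        try:
            return await operation()
        except FakeHTTPError as exc:
            retryable = exc.status_code in RETRY_STATUS_CODES
            if not retryable or attempt == attempts:
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))


async def demo():
    """Simulate a request that returns 429 twice and then succeeds."""
    seen = []

    async def fake_get():
        seen.append(len(seen) + 1)
        if len(seen) < 3:
            raise FakeHTTPError(429)
        return "<html>ok</html>"

    body = await call_with_retry(fake_get)
    return body, len(seen)


body, request_count = asyncio.run(demo())
print(body, request_count)
```

Inside a tool, the same helper would wrap only the request, so a 429 repeats that single call rather than the whole agent run. In real use the delay would likely need to be much longer than the value used here.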

Any guidance or recommended fixes would be appreciated!

sydney-runkle added the Feature request label on Feb 14, 2025
@sydney-runkle
Member

This should be solved with #516 support!

sydney-runkle self-assigned this on Feb 17, 2025