
Weird tool call arguments, resulting in UnexpectedModelBehavior / validation error #81

Open
intellectronica opened this issue Nov 21, 2024 · 4 comments

Comments

@intellectronica

See https://github.com/intellectronica/pydantic-ai-experiments/blob/main/scratch.ipynb

Traceback (most recent call last):
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/_result.py", line 189, in validate
    result = self.type_adapter.validate_json(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic/type_adapter.py", line 425, in validate_json
    return self.validator.validate_json(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 2 validation errors for Question
reflection
  Field required [type=missing, input_value={'_': {'reflection': "The...n': 'Is it an animal?'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing
question
  Field required [type=missing, input_value={'_': {'reflection': "The...n': 'Is it an animal?'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.10/v/missing

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/agent.py", line 654, in _handle_model_response
    result_data = result_tool.validate(call)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/_result.py", line 203, in validate
    raise ToolRetryError(m) from e
pydantic_ai._result.ToolRetryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/agent.py", line 181, in run
    either = await self._handle_model_response(model_response, deps)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/agent.py", line 657, in _handle_model_response
    self._incr_result_retry()
  File "/usr/local/python/3.12.1/lib/python3.12/site-packages/pydantic_ai/agent.py", line 751, in _incr_result_retry
    raise exceptions.UnexpectedModelBehavior(
pydantic_ai.exceptions.UnexpectedModelBehavior: Exceeded maximum retries (1) for result validation

See the prefixed "_"? It's not there on earlier calls. Possibly a hallucination.
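
A possible defensive workaround is a plain-Pydantic before-validator on the result model that unwraps a single stray wrapper key before field validation runs. Untested sketch, not a pydantic-ai feature:

from pydantic import BaseModel, Field, model_validator


class Question(BaseModel):
    reflection: str = Field(..., description='Considering the questions and answers so far, what are things we can ask next?')
    question: str = Field(..., description='The question to ask the other player')

    @model_validator(mode='before')
    @classmethod
    def unwrap_stray_key(cls, data):
        # If the model nested the real arguments under a single spurious
        # key (here '_'), unwrap them before field validation runs.
        if isinstance(data, dict) and set(data) == {'_'} and isinstance(data['_'], dict):
            return data['_']
        return data

Raising the agent's result_retries (if the installed version supports that option) would also give the model more than one chance to self-correct.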

I think this can be avoided with strict mode. It would be great to have it as an option for OpenAI calls.
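
pydantic-ai doesn't expose this yet, but for reference, a strict tool definition against the raw OpenAI client would look roughly like the sketch below (the tool name and description are illustrative). With 'strict': True the model's arguments are constrained to exactly this JSON Schema, which requires every property to be listed in 'required' and 'additionalProperties' to be false, so a wrapper key like '_' can't appear:

from openai import OpenAI

client = OpenAI()

tools = [
    {
        'type': 'function',
        'function': {
            'name': 'final_result',  # illustrative name
            'description': 'The structured question to ask next',
            'strict': True,
            'parameters': {
                'type': 'object',
                'properties': {
                    'reflection': {'type': 'string'},
                    'question': {'type': 'string'},
                },
                'required': ['reflection', 'question'],
                'additionalProperties': False,  # mandatory in strict mode
            },
        },
    }
]

response = client.chat.completions.create(
    model='gpt-4o',
    messages=[{'role': 'user', 'content': 'Ask the next question'}],
    tools=tools,
    tool_choice={'type': 'function', 'function': {'name': 'final_result'}},
)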

@intellectronica (Author)

@samuelcolvin ^^^^^

@samuelcolvin (Member)

Weird, not sure what's going on; I ran your code and it worked first time.

I ran it directly as a script, and it worked fine:

from enum import Enum
from textwrap import dedent
from typing import List

from pydantic import BaseModel, Field
from pydantic_ai import Agent, CallContext


class Question(BaseModel):
    reflection: str = Field(..., description='Considering the questions and answers so far, what are things we can ask next?')
    question: str = Field(..., description='The question to ask the other player')


asking_agent = Agent('openai:gpt-4o', result_type=Question)


@asking_agent.system_prompt
async def asking_agent_system_prompt(ctx: CallContext[List[str]]) -> str:
    turns = ctx.deps
    prompt = dedent("""
        You are playing a game of 20 questions.
        You are trying to guess the object the other player is thinking of.
        In each turn, you can ask a yes or no question.
        The other player will answer with "yes", "no".
    """).strip()
    if len(turns) > 0:
        prompt += "\nHere are the questions you have asked so far and the answers you have received:\n"
        prompt += '\n'.join([' * ' + turn for turn in turns])
    return prompt


class Answer(str, Enum):
    YES = 'yes'
    NO = 'no'
    YOU_WIN = 'you win'


class AnswerResponse(BaseModel):
    reflection: str = Field(..., description=(
        'Considering the question, what is the answer? '
        'Is it "yes" or "no"? Or did they guess the '
        'object and the answer is "you win"?'))
    answer: Answer = Field(..., description='The answer to the question - "yes", "no", or "you win"')


answering_agent = Agent('openai:gpt-4o', result_type=AnswerResponse)


@answering_agent.system_prompt
async def answering_agent_system_prompt(ctx: CallContext[str]) -> str:
    prompt = dedent(f"""
        You are playing a game of 20 questions.
        The other player is trying to guess the object you are thinking of.
        The object you are thinking of is: {ctx.deps}.
        Answer with "yes" or "no", or "you win" if the other player has guessed the object.
    """).strip()
    return prompt


def twenty_questions(mystery_object):
    turns = []
    while True:
        question = asking_agent.run_sync('Ask the next question', deps=turns).data.question
        answer = answering_agent.run_sync(question, deps=mystery_object).data.answer.value
        if answer == Answer.YOU_WIN:  # Answer subclasses str, so the raw value compares equal
            print('You Win!')
            break
        elif len(turns) >= 20:
            print('You Lose!')
            break
        else:
            turns.append(f'{question} - {answer}')
            print(f'{len(turns)}. QUESTION: {question}\nANSWER: {answer}\n')

twenty_questions('a cat')

output:

1. QUESTION: Is it something commonly found indoors?
ANSWER: yes

2. QUESTION: Does it use electricity?
ANSWER: no

3. QUESTION: Is it used for storage?
ANSWER: no

4. QUESTION: Is it used for entertainment purposes?
ANSWER: no

5. QUESTION: Is it used for cleaning?
ANSWER: no

6. QUESTION: Is it a piece of furniture?
ANSWER: no

7. QUESTION: Is it used for writing or drawing?
ANSWER: no

8. QUESTION: Is it used for personal grooming or hygiene?
ANSWER: no

9. QUESTION: Is it used in the kitchen?
ANSWER: no

10. QUESTION: Is it related to health or safety?
ANSWER: no

11. QUESTION: Is it used for decoration?
ANSWER: no

12. QUESTION: Is it used for organizing?
ANSWER: no

13. QUESTION: Is it used for communication?
ANSWER: no

14. QUESTION: Is it used for comfort or relaxation?
ANSWER: yes

15. QUESTION: Is it something you can wear indoors?
ANSWER: no

16. QUESTION: Is it something you can sit or lie on?
ANSWER: no

17. QUESTION: Is it something you can hold or carry? 
ANSWER: yes

18. QUESTION: Is it a textile item like a pillow or a blanket?
ANSWER: no

19. QUESTION: Is it used to provide warmth?
ANSWER: no

20. QUESTION: Is it something you use to hold or support things?
ANSWER: no

You Lose!

@intellectronica (Author)

intellectronica commented Nov 21, 2024 via email

@samuelcolvin (Member)

Thanks, yup, I'll look into it.
