Structured JSON output key ordering issue causes incorrect reasoning steps #354

Open
yuan-li opened this issue Dec 16, 2024 · 5 comments
Labels
status:pending implementation (Feature request pending implementation from the Eng team)
type:feature request (New feature request/enhancement)

Comments

@yuan-li

yuan-li commented Dec 16, 2024

Description of the bug:

I’ve encountered a bug when using the structured output feature with the Gemini model. Specifically, the JSON keys in the returned response appear to be sorted alphabetically rather than following the order defined in the schema I provided. Because the model generates the fields in that order, this interferes with the chain-of-thought reasoning steps.

Actual vs expected behavior:

I copied the chain-of-thought example from OpenAI's structured outputs guide (https://platform.openai.com/docs/guides/structured-outputs#chain-of-thought) and ran it against Gemini:

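import google.generativeai as genai
from pydantic import BaseModel

# Assumes the API key is already configured (e.g. via genai.configure).

# One reasoning step: the explanation, then the intermediate result.
class Step(BaseModel):
    explanation: str
    output: str

# The steps are declared before the final answer on purpose.
class MathReasoning(BaseModel):
    steps: list[Step]
    final_answer: str

model = genai.GenerativeModel("gemini-1.5-flash-8b")
result = model.generate_content(
    "how can I solve 8x + 7 = -23",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=MathReasoning,
        temperature=0,
    ),
)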

The response is

{'final_answer': '-5',
 'steps': [{'explanation': 'Subtract 7 from both sides of the equation to isolate the term with x.',
   'output': '8x = -30'},
  {'explanation': 'Divide both sides of the equation by 8 to solve for x.',
   'output': 'x = -30/8'},
  {'explanation': 'Simplify the fraction.', 'output': 'x = -15/4'}]}
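
As a quick sanity check on the values in this response, the equation can be solved exactly with Python's standard `fractions` module. This is only an illustrative sketch; the helper name `solve_linear` is hypothetical and has nothing to do with the Gemini SDK.

```python
from fractions import Fraction

def solve_linear(a, b, c):
    """Solve a*x + b = c exactly for x (assumes a != 0)."""
    return Fraction(c - b, a)

# 8x + 7 = -23
x = solve_linear(8, 7, -23)
print(x)  # -15/4
```

The last reasoning step in the response reaches the same value, -15/4, while `final_answer` reports -5.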

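Here, final_answer appears before steps, so the model commits to an answer before it has produced any reasoning. The result is an incorrect final answer: -5, even though the steps themselves arrive at -15/4. This suggests that the keys might be sorted alphabetically, in which case renaming the first field works around the problem: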

import google.generativeai as genai
from pydantic import BaseModel

class Step(BaseModel):
    explanation: str
    output: str

# "steps" is renamed to "calculation_steps" so that it sorts
# alphabetically before "final_answer".
class MathReasoning(BaseModel):
    calculation_steps: list[Step]
    final_answer: str

model = genai.GenerativeModel("gemini-1.5-flash-8b")
result = model.generate_content(
    "how can I solve 8x + 7 = -23",
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=MathReasoning,
        temperature=0,
    ),
)

And the response is as expected this time:

{'calculation_steps': [{'explanation': 'Subtract 7 from both sides of the equation.',
   'output': '8x + 7 - 7 = -23 - 7'},
  {'explanation': 'Simplify both sides of the equation.',
   'output': '8x = -30'},
  {'explanation': 'Divide both sides of the equation by 8.',
   'output': '8x / 8 = -30 / 8'},
  {'explanation': 'Simplify both sides of the equation.',
   'output': 'x = -30/8'},
  {'explanation': 'Simplify the fraction.', 'output': 'x = -15/4'}],
 'final_answer': '-15/4'}
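
To confirm the order in which the keys were actually emitted, one option is to compare the top-level key order of the raw JSON text with the field order declared in the schema. The sketch below uses only the standard library; the function names are hypothetical, and the two JSON strings are abbreviated stand-ins for the responses in this issue, not real API output.

```python
import json

def emitted_key_order(raw_json: str) -> list[str]:
    """Return the top-level keys in the order they appear in the JSON text."""
    # json.loads fills the dict in document order, and dicts keep insertion order.
    return list(json.loads(raw_json).keys())

def follows_schema_order(raw_json: str, schema_fields: list[str]) -> bool:
    """True if the emitted keys match the declared field order exactly."""
    return emitted_key_order(raw_json) == schema_fields

# Abbreviated stand-ins for the two responses shown in this issue.
first = '{"final_answer": "-5", "steps": []}'
second = '{"calculation_steps": [], "final_answer": "-15/4"}'

print(follows_schema_order(first, ["steps", "final_answer"]))               # False
print(follows_schema_order(second, ["calculation_steps", "final_answer"]))  # True
```

The check only says something about generation order if it runs on the raw response text; a dict that has been re-serialized or pretty-printed by another tool may already have been reordered.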

Any other information you'd like to share?

This behavior suggests that the keys may be sorted alphabetically internally, rather than following the schema order. It would be helpful if the model could preserve the original field order to maintain the intended reasoning flow.
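
A small sketch of the alphabetical hypothesis: a plain sort of the field names reproduces both key orders observed above. The helper `prefix_for_order` is hypothetical; it generalizes the rename workaround by adding an indexed prefix so that the sorted order equals the intended order. Whether the API really sorts alphabetically is an assumption (a later comment in this thread reports an order that is not alphabetical), so treat this as unverified.

```python
def prefix_for_order(fields):
    """Prefix names with 's<index>_' so alphabetical order equals the given order."""
    width = len(str(len(fields)))  # zero-pad so that s02 sorts before s10
    return [f"s{i:0{width}d}_{name}" for i, name in enumerate(fields, start=1)]

original = ["steps", "final_answer"]
renamed = ["calculation_steps", "final_answer"]

print(sorted(original))  # ['final_answer', 'steps']
print(sorted(renamed))   # ['calculation_steps', 'final_answer']

prefixed = prefix_for_order(original)
print(prefixed)          # ['s1_steps', 's2_final_answer']
```

The prefixed names stay valid Python identifiers, so they could be used as Pydantic field names, at the cost of less readable keys in the output.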

@Giom-V
Collaborator

Giom-V commented Dec 17, 2024

Hello @yuan-li,
Thank you for the feedback. I'm routing it internally to the folks in charge of structured output.

@yuan-li
Author

yuan-li commented Dec 17, 2024

Thank you @Giom-V

@seniorb

seniorb commented Dec 19, 2024

I'm having this issue as well: the JSON is output with its keys arranged alphabetically, not in the order specified.

Giom-V added the type:feature request and status:pending implementation labels on Jan 9, 2025
@Giom-V
Collaborator

Giom-V commented Jan 9, 2025

The feature is still in the backlog, but in the meantime, have you tried the thinking model?

@benhylak

Having this exact issue, and it's not alphabetical. It's a pre-defined, seemingly random order. It's a very bad bug. Like the original post said, it makes it impossible to enforce a specific reasoning flow.
