Example Code
In the example outputting JSON-formatted objects on this page:
https://python.langchain.com/docs/how_to/structured_output/#custom-parsing
the following code is used:

```python
from pydantic import BaseModel, Field
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.prompts import ChatPromptTemplate


class Joke(BaseModel):
    """Joke to tell user."""

    setup: str = Field(description="The setup of the joke")
    punchline: str = Field(description="The punchline to the joke")
    rating: int = Field(description="How funny the joke is, from 1 to 10")


parser = PydanticOutputParser(pydantic_object=Joke)

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the user query. Wrap the output in `json` tags\n{format_instructions}",
        ),
        ("human", "{query}"),
    ]
).partial(format_instructions=parser.get_format_instructions())

structured_llm = prompt | llm | parser  # `llm` is a chat model (not defined in the snippet)
structured_llm.invoke({"query": "Tell me a joke about cats"})
```
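Note that `llm` is not defined in the snippet above; it can be any chat model. Since the results below use local Ollama models, here is a minimal sketch of how it might be wired up and how a parsing failure actually surfaces (the `langchain-ollama` package and the model name are my assumptions, not part of the original example):

```python
from langchain_core.exceptions import OutputParserException
from langchain_ollama import ChatOllama  # assumed package; any chat model works

llm = ChatOllama(model="gemma2")  # illustrative local model under 9B parameters

structured_llm = prompt | llm | parser
try:
    joke = structured_llm.invoke({"query": "Tell me a joke about cats"})
    print(joke)  # a validated Joke instance on success
except OutputParserException as e:
    # PydanticOutputParser raises this when the reply is not valid JSON
    # matching the Joke schema (the failure mode discussed below)
    print(f"Non-conforming output: {e}")
```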
Description
When I try the example above, which requests JSON output via an output parser, it often fails compared to putting the format instructions in the user prompt instead, as in the following code:
```python
prompt_user_format = ChatPromptTemplate.from_template(
    "{input} \n{format_instructions}"
).partial(format_instructions=parser.get_format_instructions())

structured_llm = prompt_user_format | llm | parser
structured_llm.invoke({"input": "Tell me a joke about cats"})
```
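One structural difference worth noting: `ChatPromptTemplate.from_template` produces a single human message, so here the format instructions travel in the user turn rather than in a separate system turn. A quick way to confirm this (my own check, not from the original post):

```python
messages = prompt_user_format.invoke({"input": "Tell me a joke about cats"}).to_messages()
print([m.type for m in messages])  # ['human']: one user message holding the instructions
```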
My question is: why do most models fail to produce JSON output when the format instructions are in the system prompt (as on the reference page), but succeed when they are in the user prompt? Is there something I'm missing in my code?
From my testing on Ollama models (see table below), Gemma2 works better with a system prompt, while other models fail to output JSON when the instructions are in the system prompt.
The table below shows the percentage of 20 runs that returned conforming JSON in LangChain. For reference, I'm using local Ollama models, all smaller than 9B parameters.
Full code to produce the table is available here:
https://github.com/adocherty/mastering-structured-output/blob/main/2-langchain-structured-output-evaluation.ipynb
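For reference, here is a minimal sketch of the kind of evaluation loop behind the table. It builds on `prompt`, `prompt_user_format`, and `parser` defined above; the helper function and model name are illustrative, not the notebook's exact code:

```python
from langchain_core.exceptions import OutputParserException
from langchain_ollama import ChatOllama


def conformance_rate(chain, inputs: dict, n_runs: int = 20) -> float:
    """Percentage of runs whose output parses into the Joke schema."""
    successes = 0
    for _ in range(n_runs):
        try:
            chain.invoke(inputs)
            successes += 1
        except OutputParserException:
            pass  # the model did not return conforming JSON on this run
    return 100 * successes / n_runs


llm = ChatOllama(model="gemma2")  # repeat for each local model under test
system_chain = prompt | llm | parser            # instructions in the system prompt
user_chain = prompt_user_format | llm | parser  # instructions in the user prompt

print(conformance_rate(system_chain, {"query": "Tell me a joke about cats"}))
print(conformance_rate(user_chain, {"input": "Tell me a joke about cats"}))
```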