
Remove markdown code block markers from LLM's response #1641

Closed
wants to merge 5 commits

Conversation

ddragosd

@roribio and I debugged today why it takes so long (60s+) for a task to complete when using output_pydantic.

It turns out that it was caused by gpt-4o insisting on wrapping the JSON in markdown code block markers when returning it.

We traced the problem to the Converter throwing a JSONDecodeError and falling back to handle_partial_json, which, for a simple JSON payload (120 lines, 7KB), takes 60s+ to complete.
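For context, the failure mode looks roughly like this; strip_code_fences is a hypothetical helper sketching the kind of fence stripping this PR proposes, not the actual patch:

```python
import json
import re

def strip_code_fences(text: str) -> str:
    # Hypothetical helper: remove the leading/trailing markdown code fences
    # (e.g. ```json ... ```) that gpt-4o often wraps around JSON output.
    match = re.search(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    return match.group(1) if match else text

raw = '```json\n{"title": "demo", "summary": "ok"}\n```'

# Without stripping, json.loads raises JSONDecodeError and the Converter
# falls back to the slow handle_partial_json path.
data = json.loads(strip_code_fences(raw))
print(data)  # {'title': 'demo', 'summary': 'ok'}
```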

@bhancockio
Collaborator

Hey @ddragosd !

Thank you for creating this PR!

Root issue:

  • The models defined in output_pydantic and output_json are not passed into the LLM calls. As a result, the output of a task does not match the specified output format, which forces us to make another LLM call just to reformat the original output.

Fix:

  • When output_pydantic or output_json is present, append a stringified version of the schema to the prompt (see the sketch below).
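A minimal sketch of that idea, assuming a Pydantic v2 model and a hypothetical build_prompt helper (the actual change landed in #1651):

```python
import json
from pydantic import BaseModel

class TaskOutput(BaseModel):
    # Example of a model a user might pass as output_pydantic.
    title: str
    summary: str

def build_prompt(base_prompt: str, model: type[BaseModel]) -> str:
    # Hypothetical helper: append the stringified JSON schema so the first
    # LLM call already returns output in the expected structure.
    schema = json.dumps(model.model_json_schema(), indent=2)
    return (
        f"{base_prompt}\n\n"
        "Return only valid JSON matching this schema, with no markdown "
        "code fences:\n"
        f"{schema}"
    )

print(build_prompt("Summarize the article.", TaskOutput))
```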

Improvement stats:

  • When running a simple crew with 1 task and an output_pydantic, our old approach took around 30 seconds to complete the crew kickoff.
  • Half of this time was spent on a follow-up LLM call to convert the task's output into the specified Pydantic model.
  • Now the task output is generated in the correct format on the first call, so the kickoff only takes about 15 seconds, a 50% reduction in runtime.
  • We also make one fewer LLM call, which saves users money as well.

Closed by #1651

bhancockio closed this Nov 26, 2024
@ddragosd
Author

@bhancockio, I'm glad you caught a deeper issue with crewAI making two LLM calls instead of one, and improved the response time overall. This is awesome. I'm testing it right now. Thanks so much for the quick fix!

Meanwhile, what do you think about support for [structured outputs](https://openai.com/index/introducing-structured-outputs-in-the-api/)? I understand not all models have it today, but would you have any recommendations for how we could leverage this right now? I was going to look for a way to support this, and may even work on a PR. Thanks!
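For reference, a sketch of what structured outputs look like with the OpenAI Python SDK (>= 1.40) and a model that supports them, e.g. gpt-4o-2024-08-06. This is one possible direction, not part of this PR:

```python
from openai import OpenAI
from pydantic import BaseModel

class TaskOutput(BaseModel):
    title: str
    summary: str

client = OpenAI()

# The SDK converts the Pydantic model into a strict JSON schema, and the API
# guarantees the response conforms to it -- no fence stripping or retry needed.
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "Summarize the article."}],
    response_format=TaskOutput,
)

result: TaskOutput = completion.choices[0].message.parsed
print(result)
```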
