
Support structured output in ChatDatabricks #28

Merged · 1 commit merged on Oct 18, 2024

Conversation

@B-Step62 (Collaborator) commented on Oct 17, 2024

Resolve #19

Implements the with_structured_output() method in ChatDatabricks to support a structured output feature compatible with OpenAI's. The code mostly copies the ChatOpenAI implementation, with slight adjustments, e.g. FMAPI does not support the parallel_tool_calls parameter.
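For illustration, the request shape this kind of structured-output binding produces can be sketched in plain Python. This is a hypothetical sketch, not the actual ChatDatabricks code; the helper name `build_structured_output_request` and the schema are invented for the example, assuming an OpenAI-style tool-calling API:

```python
# Sketch of how a structured-output request might be assembled, assuming an
# OpenAI-compatible tool-calling API. Not the actual ChatDatabricks code.
def build_structured_output_request(schema_name: str, json_schema: dict) -> dict:
    """Build request kwargs that force the model to call the formatting tool."""
    tool = {
        "type": "function",
        "function": {"name": schema_name, "parameters": json_schema},
    }
    return {
        "tools": [tool],
        # Forcing tool_choice to this single tool guarantees at most one
        # tool call in the response. Note there is no parallel_tool_calls
        # key here, since FMAPI does not support that parameter.
        "tool_choice": {"type": "function", "function": {"name": schema_name}},
    }

req = build_structured_output_request(
    "AnswerWithJustification",
    {"type": "object", "properties": {"answer": {"type": "string"}}},
)
```

Because `tool_choice` names exactly one tool, the model has no way to return calls to any other tool, which is what makes the single-tool assumptions later in the thread safe.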

Test

  • Unit test
  • Integration test via local machine.
  • Tested on Databricks:

[Screenshot: tested on Databricks, 2024-10-17]

@B-Step62 (Collaborator, Author) commented:

@harupy @BenWilson2 Could you review this change, which mostly copies what OAI does? I'd like to close the FR before releasing the next version by the end of this week.

```python
    )
else:
    output_parser = JsonOutputKeyToolsParser(
        key_name=tool_name, first_tool_only=True
    )
```

Collaborator commented on this excerpt:

In the OAI implementation, they're doing the same thing?
IIRC their tool requests are sequential, but it will be good to double check.

@B-Step62 (Collaborator, Author) replied on Oct 18, 2024:

Yes https://github.com/langchain-ai/langchain/blob/4fab8996cf3a5a34bd5333c6848b0bccf798a6a0/libs/partners/openai/langchain_openai/chat_models/base.py#L1234-L1236

I think the rationale is that it is guaranteed that only one tool matches the tool_name we pass here.
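The `first_tool_only=True` behavior under discussion can be illustrated with a small stand-in. This is not the actual `JsonOutputKeyToolsParser` implementation from langchain, just a minimal sketch of the matching logic being relied on:

```python
# Illustrative stand-in for JsonOutputKeyToolsParser(key_name=..., first_tool_only=True);
# not the real langchain implementation, only the behavior being discussed.
def parse_first_tool_call(tool_calls: list[dict], key_name: str):
    """Return the args of the first tool call whose name matches key_name."""
    for call in tool_calls:
        if call.get("name") == key_name:
            return call.get("args")
    return None

calls = [
    {"name": "AnswerWithJustification", "args": {"answer": "They weigh the same"}},
]
result = parse_first_tool_call(calls, "AnswerWithJustification")
# result == {'answer': 'They weigh the same'}
```

Since the structured-output path constrains the model to a single formatting tool, taking only the first match cannot silently drop other meaningful tool calls.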

Collaborator replied:

SGTM! Let's merge :)

@BenWilson2 (Collaborator) left a comment:

overall LGTM - let's verify that the tool response requests from OpenAI are always provided as sequential iterations (I BELIEVE that is how their APIs work, but let's be sure we're not inadvertently dropping multiple tool requests; for context, Anthropic adheres to a similar response structure but their ToolRequestFormat is a collection instead of sequential calls)

@B-Step62 (Collaborator, Author) commented on Oct 18, 2024:

Reading through the code closely, the more direct reason is that the LLM can only return a single tool call for structured output, because we (and OAI/Anthropic) set the tool_choice parameter to include only the formatting tool 🙂

https://github.com/langchain-ai/langchain/blob/4fab8996cf3a5a34bd5333c6848b0bccf798a6a0/libs/partners/openai/langchain_openai/chat_models/base.py#L1226

It's good to understand the rationale behind it (rather than copy-pasting without thought). Great callout, Ben!

@B-Step62 B-Step62 merged commit ff1a60b into langchain-ai:main Oct 18, 2024
8 checks passed
@B-Step62 B-Step62 deleted the structured-output branch October 18, 2024 02:06
Successfully merging this pull request may close these issues.

Add structured_output method support