-
Notifications
You must be signed in to change notification settings - Fork 159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vertexai: refactor: simplify content processing in anthropic formatter #601
vertexai: refactor: simplify content processing in anthropic formatter #601
Conversation
@jfypk Let's update the description a bit more with what this is actually fixing, including some details. In my opinion, this is a bugfix, not just a refactor. We noticed that when we were switching from using Langchain+Anthropic to Langchain+Anthropic-on-Vertex, that we'd sometimes get failures like this:
We noticed this in this Merge Request: https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/merge_requests/59. We worked around this by patching This should fix that discrepancy that we think is a bug. |
@@ -205,14 +205,18 @@ def _format_params( | |||
|
|||
def _format_output(self, data: Any, **kwargs: Any) -> ChatResult: | |||
data_dict = data.model_dump() | |||
content = [c for c in data_dict["content"] if c["type"] != "tool_use"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the main difference, here, we would only have elements in content
for which the type was not tool_use
. This would break when there is only one element, and that element had the this type set. Causing content
to be empty on L215
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
awesome -- thanks for calling this out @Reprazent! updated the description as well.
@@ -205,14 +205,18 @@ def _format_params( | |||
|
|||
def _format_output(self, data: Any, **kwargs: Any) -> ChatResult: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shall we add a test for _format_output
similar to the test that we have in the main lanchain library: https://github.com/langchain-ai/langchain/blob/f1222739f88bfdf37513af146da6b9dbf2a091c4/libs/partners/anthropic/tests/unit_tests/test_chat_models.py#L87-L109
We should base the input for the test here on what we noticed was causing the errors. So instead of passing only a TextBlock
, we should only pass a ToolUseBlock
like we do in our test. I suspect this will fail in the previous implementation.
What do you think?
hi @lkuligin -- Do you know how I can resolve the following errors? FAILED tests/integration_tests/test_maas.py::test_stream[mistral-nemo@2407] They seem unrelated to my changes and related to the environment. Thanks! |
@jfypk please, look at failing unit tests:
|
@@ -1067,3 +1068,49 @@ def test_init_client_with_custom_api() -> None: | |||
transport = mock_prediction_service.call_args.kwargs["transport"] | |||
assert client_options.api_endpoint == "https://example.com" | |||
assert transport == "rest" | |||
|
|||
@pytest.mark.requires("anthropic") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure we support requires marker. Probably, it's easier just to add it to the test group (we need to install it anyway to run the whole test suite).
5c74842
to
e3906b8
Compare
e3906b8
to
ff2f9c0
Compare
hi @lkuligin -- thanks for the help on the failing unit test. For these integration tests, is there a gcloud environment that I can use to test these integration tests locally? Spinning up my own is proving to take a long time. It looks like the failing tests revolve around the structure of the outputs from the mistral and meta llms. appreciate any pointers! |
@jfypk just ignore failing integration tests for maas models, please. Something is happening there, I need to investigate, but it's not related to your PR (and these tests are not part of the release tests). You can call linters and unit testing locally though as described in the contribution guide: |
@lkuligin Thanks for merging! Is there a cycle at which you release these? So we know when we can update our dependency in https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/work_items/155. cc @jfypk |
let me fix some tests and release in the beginning of the next week, would that work for you? or do you need it urgently? |
@lkuligin That sounds great! We can wait until next week, thank you. |
PR Description
This PR improves content processing in the Anthropic formatter to handle content blocks more elegantly, while also fixing a bug we noticed from switching from Langchain+Anthropic to Langchain+Anthropic-on-Vertex:
{'type': 'invalid_request_error', 'message': 'messages.16:
tool_result
block(s) provided when previous message does not contain anytool_use
blocks'}We noticed this in this Merge Request: https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/merge_requests/59. We worked around this by patching ChatAnthropicVertex._format_output, with the way ChatAnthropic._format_output is implemented in the main langchain library.
Key changes:
Preserves content structure for multi-block responses
Simplified logic for single text content handling
Maintains tool calls functionality with cleaner code
No breaking changes to external interfaces
All existing tests pass without modification
The changes support better flexibility in content handling while keeping code maintainable.
Current tests already verify core functionality.
Relevant issues
Type
🐛 Bug Fix
🧹 Refactoring
Changes(optional)
Modifies the
_format_output
method to be more in line with the Anthropic implementation https://github.com/langchain-ai/langchain/blob/e317d457cfe8586e4006b5f41e2c4f1e18a4d58c/libs/partners/anthropic/langchain_anthropic/chat_models.py#L753C5-L778C10Testing(optional)
Current tests already verify core functionality.
Note(optional)