vertexai: refactor: simplify content processing in anthropic formatter #601

jfypk · 2024-11-14T17:29:05Z

PR Description

This PR improves content processing in the Anthropic formatter to handle content blocks more elegantly, while also fixing a bug we noticed from switching from Langchain+Anthropic to Langchain+Anthropic-on-Vertex:

{'type': 'invalid_request_error', 'message': 'messages.16: tool_result block(s) provided when previous message does not contain any tool_use blocks'}

We noticed this in this Merge Request: https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/merge_requests/59. We worked around this by patching ChatAnthropicVertex._format_output, with the way ChatAnthropic._format_output is implemented in the main langchain library.

Key changes:

Preserves content structure for multi-block responses
Simplified logic for single text content handling
Maintains tool calls functionality with cleaner code
No breaking changes to external interfaces
All existing tests pass without modification

The changes support better flexibility in content handling while keeping code maintainable.

Current tests already verify core functionality.

Relevant issues

Type

🐛 Bug Fix
🧹 Refactoring

Changes(optional)

Modifies the _format_output method to be more in line with the Anthropic implementation https://github.com/langchain-ai/langchain/blob/e317d457cfe8586e4006b5f41e2c4f1e18a4d58c/libs/partners/anthropic/langchain_anthropic/chat_models.py#L753C5-L778C10

Testing(optional)

Current tests already verify core functionality.

Note(optional)

Reprazent · 2024-11-14T18:43:38Z

@jfypk Let's update the description a bit more with what this is actually fixing, including some details. In my opinion, this is a bugfix, not just a refactor.

We noticed that when we were switching from using Langchain+Anthropic to Langchain+Anthropic-on-Vertex, that we'd sometimes get failures like this:

{'type': 'invalid_request_error', 'message': 'messages.16: `tool_result` block(s) provided when previous message does not contain any `tool_use` blocks'}

We noticed this in this Merge Request: https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/merge_requests/59. We worked around this by patching ChatAnthropicVertex._format_output, with the way ChatAnthropic._format_output is implemented in the main langchain library.

This should fix that discrepancy that we think is a bug.

Reprazent · 2024-11-14T18:45:21Z

libs/vertexai/langchain_google_vertexai/model_garden.py

@@ -205,14 +205,18 @@ def _format_params(

    def _format_output(self, data: Any, **kwargs: Any) -> ChatResult:
        data_dict = data.model_dump()
-        content = [c for c in data_dict["content"] if c["type"] != "tool_use"]


This is the main difference, here, we would only have elements in content for which the type was not tool_use. This would break when there is only one element, and that element had the this type set. Causing content to be empty on L215

awesome -- thanks for calling this out @Reprazent! updated the description as well.

Reprazent · 2024-11-14T18:55:52Z

libs/vertexai/langchain_google_vertexai/model_garden.py

@@ -205,14 +205,18 @@ def _format_params(

    def _format_output(self, data: Any, **kwargs: Any) -> ChatResult:


Shall we add a test for _format_output similar to the test that we have in the main lanchain library: https://github.com/langchain-ai/langchain/blob/f1222739f88bfdf37513af146da6b9dbf2a091c4/libs/partners/anthropic/tests/unit_tests/test_chat_models.py#L87-L109

We should base the input for the test here on what we noticed was causing the errors. So instead of passing only a TextBlock, we should only pass a ToolUseBlock like we do in our test. I suspect this will fail in the previous implementation.

What do you think?

jfypk · 2024-11-16T04:14:09Z

hi @lkuligin -- Do you know how I can resolve the following errors?

FAILED tests/integration_tests/test_maas.py::test_stream[mistral-nemo@2407]
FAILED tests/integration_tests/test_maas.py::test_stream[mistral-large@2407]
FAILED tests/integration_tests/test_maas.py::test_astream[mistral-nemo@2407]
FAILED tests/integration_tests/test_maas.py::test_astream[mistral-large@2407]
FAILED tests/integration_tests/test_maas.py::test_tools[meta/llama-3.2-90b-vision-instruct-maas]
FAILED tests/integration_tests/test_vectorstores.py::test_document_storage[datastore_document_storage]

They seem unrelated to my changes and related to the environment.

Thanks!

lkuligin · 2024-11-19T12:24:13Z

@jfypk please, look at failing unit tests:

I assume we need to add a missing dependency to the test group
if the model hits Google APIs to validate for credentials during initialization, then we need to mock these calls since unit tests runner doesn't have required auth

lkuligin · 2024-11-19T18:30:14Z

libs/vertexai/tests/unit_tests/test_chat_models.py

@@ -1067,3 +1068,49 @@ def test_init_client_with_custom_api() -> None:
        transport = mock_prediction_service.call_args.kwargs["transport"]
        assert client_options.api_endpoint == "https://example.com"
        assert transport == "rest"
+
+@pytest.mark.requires("anthropic")


I'm not sure we support requires marker. Probably, it's easier just to add it to the test group (we need to install it anyway to run the whole test suite).

jfypk · 2024-11-20T02:05:14Z

hi @lkuligin -- thanks for the help on the failing unit test.

For these integration tests, is there a gcloud environment that I can use to test these integration tests locally? Spinning up my own is proving to take a long time. It looks like the failing tests revolve around the structure of the outputs from the mistral and meta llms.

appreciate any pointers!

lkuligin · 2024-11-20T05:33:48Z

@jfypk just ignore failing integration tests for maas models, please. Something is happening there, I need to investigate, but it's not related to your PR (and these tests are not part of the release tests).

You can call linters and unit testing locally though as described in the contribution guide:
https://github.com/langchain-ai/langchain-google?tab=readme-ov-file#local-development-dependencies

Reprazent · 2024-11-20T15:18:19Z

@lkuligin Thanks for merging! Is there a cycle at which you release these? So we know when we can update our dependency in https://gitlab.com/gitlab-org/duo-workflow/duo-workflow-service/-/work_items/155.

cc @jfypk

lkuligin · 2024-11-20T16:21:26Z

let me fix some tests and release in the beginning of the next week, would that work for you? or do you need it urgently?

Reprazent · 2024-11-20T16:38:49Z

@lkuligin That sounds great! We can wait until next week, thank you.

lkuligin approved these changes Nov 14, 2024

View reviewed changes

Reprazent reviewed Nov 14, 2024

View reviewed changes

lkuligin approved these changes Nov 16, 2024

View reviewed changes

lkuligin reviewed Nov 19, 2024

View reviewed changes

jfypk force-pushed the jfypk/main/update-format-output branch from 5c74842 to e3906b8 Compare November 19, 2024 19:58

refactor: simplify content processing in anthropic formatter

ff2f9c0

jfypk force-pushed the jfypk/main/update-format-output branch from e3906b8 to ff2f9c0 Compare November 19, 2024 20:03

chore: updating poetry.lock

f3173c4

lkuligin merged commit 0b5d16a into langchain-ai:main Nov 20, 2024
14 of 15 checks passed

jfypk deleted the jfypk/main/update-format-output branch November 20, 2024 14:53

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

vertexai: refactor: simplify content processing in anthropic formatter #601

vertexai: refactor: simplify content processing in anthropic formatter #601

jfypk commented Nov 14, 2024 •

edited

Loading

Reprazent commented Nov 14, 2024

Reprazent Nov 14, 2024

jfypk Nov 14, 2024 •

edited

Loading

Reprazent Nov 14, 2024

jfypk commented Nov 16, 2024

lkuligin commented Nov 19, 2024

lkuligin Nov 19, 2024

jfypk commented Nov 20, 2024

lkuligin commented Nov 20, 2024

Reprazent commented Nov 20, 2024

lkuligin commented Nov 20, 2024

Reprazent commented Nov 20, 2024

		@@ -205,14 +205,18 @@ def _format_params(

		def _format_output(self, data: Any, **kwargs: Any) -> ChatResult:

vertexai: refactor: simplify content processing in anthropic formatter #601

vertexai: refactor: simplify content processing in anthropic formatter #601

Conversation

jfypk commented Nov 14, 2024 • edited Loading

PR Description

Relevant issues

Type

Changes(optional)

Testing(optional)

Note(optional)

Reprazent commented Nov 14, 2024

Reprazent Nov 14, 2024

Choose a reason for hiding this comment

jfypk Nov 14, 2024 • edited Loading

Choose a reason for hiding this comment

Reprazent Nov 14, 2024

Choose a reason for hiding this comment

jfypk commented Nov 16, 2024

lkuligin commented Nov 19, 2024

lkuligin Nov 19, 2024

Choose a reason for hiding this comment

jfypk commented Nov 20, 2024

lkuligin commented Nov 20, 2024

Reprazent commented Nov 20, 2024

lkuligin commented Nov 20, 2024

Reprazent commented Nov 20, 2024

jfypk commented Nov 14, 2024 •

edited

Loading

jfypk Nov 14, 2024 •

edited

Loading