
together, standard-tests: specify tool_choice in standard tests #25548

Merged: 9 commits into master, Aug 19, 2024

Conversation

ccurme (Collaborator) commented Aug 19, 2024

Here we allow standard tests to specify a value for tool_choice via a tool_choice_value property, which defaults to None.

Chat models available in Together have issues passing the standard tool-calling tests.

Specifying tool_choice also lets us remove an existing xfail and use a smaller model in Groq tests.
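For illustration, here is a minimal sketch of what an integration's standard test class could look like with the new property overridden. The base class and package paths follow the langchain_standard_tests conventions of the time, but the model name and the specific return value are assumptions for illustration, not copied from this PR's diff.

```python
from typing import Optional, Type

from langchain_core.language_models import BaseChatModel
from langchain_standard_tests.integration_tests import ChatModelIntegrationTests

from langchain_together import ChatTogether


class TestTogetherStandard(ChatModelIntegrationTests):
    @property
    def chat_model_class(self) -> Type[BaseChatModel]:
        return ChatTogether

    @property
    def chat_model_params(self) -> dict:
        # Illustrative model choice; not necessarily the one used in the PR.
        return {"model": "mistralai/Mixtral-8x7B-Instruct-v0.1"}

    @property
    def tool_choice_value(self) -> Optional[str]:
        # The hook added by this PR. It defaults to None in the base class,
        # in which case tool_choice is not passed to bind_tools at all.
        # Returning a value here asks the tool-calling tests to force a call
        # to the requested tool.
        return "tool_name"
```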

@efriis efriis added the partner label Aug 19, 2024

@efriis efriis self-assigned this Aug 19, 2024
@@ -28,11 +28,22 @@ class TestGroqLlama(BaseTestGroq):
     @property
     def chat_model_params(self) -> dict:
         return {
-            "model": "llama-3.1-70b-versatile",
+            "model": "llama-3.1-8b-instant",
Collaborator (review comment):

Out of curiosity, why go smaller instead of bigger?

Collaborator Author (reply):

We can do 70b (I changed it from 8b to 70b this morning). My thought is that if they support the same features, we should prefer smaller models for tests, in the spirit of testing functionality rather than benchmarking.

@@ -170,7 +170,14 @@ def test_stop_sequence(self, model: BaseChatModel) -> None:
     def test_tool_calling(self, model: BaseChatModel) -> None:
         if not self.has_tool_calling:
             pytest.skip("Test requires tool calling.")
-        model_with_tools = model.bind_tools([magic_function])
+        if self.tool_choice_value == "dict":
Collaborator (review comment):

This format is OpenAI-specific (e.g., I don't think Anthropic supports it). I actually think more models support just passing the tool name as a string to bind_tools, and that's what we'd want to standardize on from a devx perspective anyway.

Collaborator Author (reply):

Thanks, updated this to use the tool name.
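For context, a rough sketch of how the updated test might bind the tool and pass its name as tool_choice, per the discussion above. The class name below is a simplified stand-in for the real standard-tests mixin, and the exact control flow is an assumption, not the merged diff.

```python
from typing import Optional

import pytest
from langchain_core.language_models import BaseChatModel
from langchain_core.messages import AIMessage
from langchain_core.tools import tool


@tool
def magic_function(input: int) -> int:
    """Applies a magic function to an input."""
    return input + 2


class ToolCallingTestSketch:  # hypothetical stand-in for the standard-tests mixin
    has_tool_calling: bool = True
    tool_choice_value: Optional[str] = "magic_function"

    def test_tool_calling(self, model: BaseChatModel) -> None:
        if not self.has_tool_calling:
            pytest.skip("Test requires tool calling.")
        if self.tool_choice_value is None:
            # Default: don't pass tool_choice; let the model decide.
            model_with_tools = model.bind_tools([magic_function])
        else:
            # Pass the tool's name as a plain string, which more providers
            # accept than the OpenAI-specific {"type": "function", ...} dict.
            model_with_tools = model.bind_tools(
                [magic_function], tool_choice=self.tool_choice_value
            )
        result = model_with_tools.invoke("What is the value of magic_function(3)?")
        assert isinstance(result, AIMessage)
        assert result.tool_calls
```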

@dosubot dosubot bot added the lgtm PR looks good. Use to confirm that a PR is ready for merging. label Aug 19, 2024
@ccurme ccurme merged commit c5bf114 into master Aug 19, 2024
105 checks passed
@ccurme ccurme deleted the cc/standard_tests branch August 19, 2024 20:37
Labels
🤖:improvement (Medium size change to existing code to handle new use-cases), lgtm (PR looks good; ready for merging), partner, size:M (This PR changes 30-99 lines, ignoring generated files)
3 participants