Clarification on the system prompt for custom tool use #36

ricklamers · 2024-08-02T11:12:23Z

Awesome work! Just a quick question about the correct system prompt:

in the docs https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1#user-defined-custom-tool-calling this is used:

If a you choose to call a function ONLY reply in the following format:
<{start_tag}={function_name}>{parameters}{end_tag}
where

start_tag => `<function`
parameters => a JSON dict with the function argument name as key and function argument value as value.
end_tag => `</function>`

Here is an example,
<function=example_function_name>{"example_name": "example_value"}</function>

Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line"
- Always add your sources when using search results to answer the user query

You are a helpful Assistant.

While in the repo this is used:

Think very carefully before calling functions.
If you choose to call a function ONLY reply in the following format with no prefix or suffix:

<function=example_function_name>{{"example_name": "example_value"}}</function>

Reminder:
- If looking for real time information use relevant functions before falling back to brave_search
- Function calls MUST follow the specified format, start with <function= and end with </function>
- Required parameters MUST be specified
- Only call one function at a time
- Put the entire function call reply on one line

Furthermore, could you clarify if the "Only call one function at a time" implies parallel tool use is not intended to be used for these instruction tuned models (Llama 3.1 family)?

e.g. "Please get the weather for San Francisco and Tokyo" can't generate:

<|start_header_id|>assistant<|end_header_id|>

<function=get_weather>{"location": "San Francisco"}</function>
<function=get_weather>{"location": "Tokyo"}</function><|eot_id|>

Thanks for clarifying!

Rick Lamers
AI Researcher at Groq

The text was updated successfully, but these errors were encountered:

HamidShojanazeri · 2024-08-05T19:41:42Z

cc: @ashwinb

ashwinb · 2024-08-05T20:13:19Z

@ricklamers thanks for pointing out the discrepancy. Please use the version as specified in the code / this repo. We will update our documentation to match the version from the code.

Re: parallel tool calling, we are doing a couple quick experiments and will get back to you on that ASAP.

ricklamers · 2024-08-06T09:54:19Z

Awesome, thanks!

ricklamers · 2024-08-07T17:36:19Z

@ashwinb FYI in HF's chat template yet another prompt is used:
https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/blob/8c22764a7e3675c50d4c7c9a4edb474456022b16/tokenizer_config.json#L2053

Is that wrong? Should it follow the one in this repo?

ashwinb · 2024-08-07T19:09:40Z

@ricklamers :( not happy with these inconsistencies. it is hard to say something is wrong given the general stochasticity with tool calling unfortunately.

all I will say is that this is the reason we put llama model template --name <...> as part of the llama CLI. so that's the definitive source our researchers generally recommend. Given rapid iteration times, sometimes these recommendations don't reach across all the folks that need to see it.

ricklamers · 2024-08-07T21:40:09Z

No worries, as long as we know the correct system prompt (this repo) we can all adjust to converge to the same correct version. Any updates on parallel calls?

ricklamers · 2024-08-07T21:42:39Z

I've put out a note for them https://huggingface.co/meta-llama/Meta-Llama-3.1-8B-Instruct/discussions/90

Rocketknight1 · 2024-08-08T12:57:53Z

Hey, Matt from Hugging Face here. Just to clarify, the HF template was written following the "JSON based tool calling" template in this doc, and the prompt we used was also copied from the example prompt there.

Based on this discussion, am I correct that the <function> format in this repo is preferred, and we shouldn't use JSON tool calling? If so, I should rewrite the whole template to use that instead, rather than just updating the system prompt.

Imbernoulli · 2024-08-11T04:46:31Z

So which template is preferred? The function one or the json one? They are both at https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1/

hardikjshah · 2024-08-12T17:22:59Z

Both the json version and function version work reasonably well. We observed that the json one tends to over steer to using tools even when one is not asked for while with <function we were able to control that a bit more. Json version had higher recall but high false positives while <function had lower recall with higher precision. So tbh its a bit use case specific and I'd suggest you try both and see which works best for you.

Rocketknight1 · 2024-08-12T17:35:04Z

Unfortunately, we kind of have to pick one for the template! One thing we noticed is that with the current JSON template, 8B makes tool calls correctly, but sometimes fails to use the results correctly in chat - not sure if this is an issue with the system message we used, since it was all copied from the doc.

My suspicion is that an alternate prompt would fix a lot of this, and we'd prefer to have a clear answer on the best way to do things rather than several options!

hardikjshah · 2024-08-15T20:37:13Z

We updated the default tool format to be json based and recommend following that.

#45
meta-llama/llama-stack#29
meta-llama/llama-models#110

The code also supports the <function> format and can be extended to support other formats in the future if needed.
Working with other teams to reconcile and update the website to reflect these changes.

Use this command to get the latest recommended format,

llama model template --name system-custom-tools-only

Some caveats,

The model sometimes responds with multiple tool calls albeit inconsistently, so for now we only support one tool call at the system level. We are working on making both the models and system better to support this use case.
With the json format, at times the model tends to be overly steered to always make tool calls even when not asked for.

Hope this helps resolve the confusion. Again, thanks for raising these issues, it helps us get better and improve with each version.

Watebear · 2024-08-19T15:37:11Z

Hello, I attempted to use the prompt concatenation method mentioned above to test BFCL, but the AST SUMMARY only achieved 50.82%. Below is an example of the input I constructed.

Could you provide an input format that can reproduce the evaluation results from the report?
Thanks！

el-hash-1 · 2024-08-23T21:21:31Z

@hardikjshah I think the llama model template --name assistant-custom-tool-call would also need to be updated to json format.

pcuenca · 2024-10-09T17:23:14Z

Hello, this is Pedro from Hugging Face. I've been trying today to verify the tool calling template that is in use for the Llama 3.2 models. My approach was to start with the documentation provided by llama model prompt-format -m Llama3.2-1B-Instruct, but also debug the llama stack run server and examine the actual inputs that go into the model. I have a couple of questions:

The output from a tool calling step is (for example) [get_weather(city="San Francisco", metric="celsius")]<|eot_id|>. This is appended to the previous prompt with an ipython role, but the token <|python_tag|> is also prepended to it, even though it was not generated by the model. Is this intended? To be clear, this goes into the model: <|python_tag|>[get_weather(city="San Francisco", metric="celsius")]<|eot_id|>
The system instructions (for example, "You are a helpful assistant") are present in the first turn of the conversation after the tool definitions, but they are skipped in subsequent turns. I used the mesop-based chat UI with a slightly modified version of chat_with_custom_tools.py.
llama model prompt-format -m Llama3.2-1B-Instruct uses this to finalize the custom tool definition in the system prompt and start the user turn: ]<|eot_id|><|start_header_id|>user<|end_header_id|>. However, I'm seeing a newline in actual server use:

]
<|eot_id|><|start_header_id|>user<|end_header_id|>

or the following, when the additional instructions are honored:

]
You are a helpful assistant<|eot_id|><|start_header_id|>user<|end_header_id|>

Should a newline separator be used after the tool definitions?

Based on existing tool examples, I'm assuming the output must be a JSON string. If this is the case, in my tests the model sees this input: "\"25 C\"" rather than the "25 C" indicated in the prompt-format documentation. Are there any other serialization examples?

The reason I'm asking these questions is I've found the use of tools fragile in the small 3.2 text models. We'd like to reduce the ambiguity as much as possible, and provide a validated chat template to the community so developers can experiment with confidence.

zoubaobao · 2024-10-15T01:50:53Z

Hello, I attempted to use the prompt concatenation method mentioned above to test BFCL, but the AST SUMMARY only achieved 50.82%. Below is an example of the input I constructed. Could you provide an input format that can reproduce the evaluation results from the report? Thanks！

Same. I tried this model on BFCL simple test, and I only got 40% accuracy. Llama-3.1-8B-Instruct tool-use ability is not as good as they show.

edmcman · 2024-11-13T16:42:13Z

@hardikjshah It doesn't seem like this command is valid anymore: llama model template --name system-custom-tools-only

See https://colab.research.google.com/drive/1JCPiY8pvP6ZGG2xGvJzrnx7ntfrPeCHX?usp=sharing

init27 mentioned this issue Aug 12, 2024

Prompt in jinja and interface.py inconsistent meta-llama/llama-models#55

Open

pcuenca mentioned this issue Oct 11, 2024

Tool calling huggingface/huggingface-llama-recipes#29

Open

3 tasks

edmcman mentioned this issue Nov 13, 2024

Tools support is not working right ollama/ollama#6980

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarification on the system prompt for custom tool use #36

Clarification on the system prompt for custom tool use #36

ricklamers commented Aug 2, 2024 •

edited

Loading

HamidShojanazeri commented Aug 5, 2024

ashwinb commented Aug 5, 2024

ricklamers commented Aug 6, 2024

ricklamers commented Aug 7, 2024

ashwinb commented Aug 7, 2024

ricklamers commented Aug 7, 2024

ricklamers commented Aug 7, 2024

Rocketknight1 commented Aug 8, 2024

Imbernoulli commented Aug 11, 2024

hardikjshah commented Aug 12, 2024

Rocketknight1 commented Aug 12, 2024

hardikjshah commented Aug 15, 2024

Watebear commented Aug 19, 2024

el-hash-1 commented Aug 23, 2024 •

edited

Loading

pcuenca commented Oct 9, 2024 •

edited

Loading

zoubaobao commented Oct 15, 2024

edmcman commented Nov 13, 2024

Clarification on the system prompt for custom tool use #36

Clarification on the system prompt for custom tool use #36

Comments

ricklamers commented Aug 2, 2024 • edited Loading

HamidShojanazeri commented Aug 5, 2024

ashwinb commented Aug 5, 2024

ricklamers commented Aug 6, 2024

ricklamers commented Aug 7, 2024

ashwinb commented Aug 7, 2024

ricklamers commented Aug 7, 2024

ricklamers commented Aug 7, 2024

Rocketknight1 commented Aug 8, 2024

Imbernoulli commented Aug 11, 2024

hardikjshah commented Aug 12, 2024

Rocketknight1 commented Aug 12, 2024

hardikjshah commented Aug 15, 2024

Watebear commented Aug 19, 2024

el-hash-1 commented Aug 23, 2024 • edited Loading

pcuenca commented Oct 9, 2024 • edited Loading

zoubaobao commented Oct 15, 2024

edmcman commented Nov 13, 2024

ricklamers commented Aug 2, 2024 •

edited

Loading

el-hash-1 commented Aug 23, 2024 •

edited

Loading

pcuenca commented Oct 9, 2024 •

edited

Loading