is tool_calls from the README.md supposed to work? #121

Open
a2f0 opened this issue Feb 13, 2025 · 4 comments
a2f0 commented Feb 13, 2025

This screenshot is from dbca475 on main of this repo. I'm unable to get TypeScript to destructure the result of await context?.completion(...) to obtain either text or tool_calls. I originally tried this on our own repo's implementation of llama.rn, but eventually tried it on the llama.rn repo and saw the same result.
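For reference, here is roughly what I'm attempting, paraphrased from the README tool-calling example (modelPath stands in for our local model file, and the tool definition below is illustrative, not our exact config):

  import { initLlama } from 'llama.rn'

  const context = await initLlama({ model: modelPath })

  // TypeScript errors on this destructuring: tool_calls is not
  // on the completion result type
  const { text, tool_calls } = await context.completion({
    jinja: true, // use the model's chat template
    messages: [{ role: 'user', content: 'Print hello world' }],
    tools: [
      {
        type: 'function',
        function: {
          name: 'ipython',
          description: 'Runs code in an ipython interpreter.',
          parameters: {
            type: 'object',
            properties: { code: { type: 'string' } },
            required: ['code'],
          },
        },
      },
    ],
  })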

I suppose my question / issue is: is the config in the README.md here supposed to work?

Here is what I see in the llama.rn repo, as well as our own:

[Screenshot: TypeScript error on the destructured completion result]

jhen0409 (Member) commented

The type definition is missing and should be easy to fix.
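For reference, the missing field on NativeCompletionResult looks roughly like this (a sketch inferred from the runtime result; the actual definition may differ slightly):

  // Sketch: the missing tool_calls field, inferred from the runtime
  // output; not the final definition.
  interface NativeCompletionResult {
    text: string
    // ...existing fields...
    tool_calls?: Array<{
      type: 'function'
      function: {
        name: string
        // conventionally a JSON-encoded string in OpenAI-style tool calls
        arguments: string
      }
      id?: string | null
    }>
  }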

Whether the README example works for tool_calls depends on the model and its chat_template. Can you share the model you use?

a2f0 (Author) commented Feb 14, 2025

> The type definition is missing and should be easy to fix.
>
> Whether the README example works for tool_calls depends on the model and its chat_template. Can you share the model you use?

Thanks for the speedy reply @jhen0409 (and for this great project!). We use a few different models:

  1. tinyllama 1.1 - this doesn't work very well; we only use it on devices with 4GB or less of memory
  2. deepseek r1 qwen distill 1.5b
  3. Phi 3 mini 4k instruct q4

I'm currently testing on an iPhone 16 Pro with 8GB of memory. I see what you're describing about the model-specific considerations for the chat_template, and how it might impact this functionality. If you have a model in mind that you think should 'just work' for tool calling on a device with this much memory, I'd be happy to evaluate that instead of something on the list above.

jhen0409 (Member) commented

I've tested tinyllama 1.1 and the Phi 3 model in the example, and they work as expected:

 LOG  completionResult:  {"completion_probabilities": [], "stopped_eos": true, "stopped_limit": false, "stopped_word": true, "stopping_word": "</s>", "text": "{
  \"tool_call\": {
    \"name\": \"ipython\",
    \"arguments\": {
      \"code\": \"print('hello world!')\"
    }
  }
}
", "timings": {"predicted_ms": 976.374, "predicted_n": 46, "predicted_per_second": 47.11309395784812, "predicted_per_token_ms": 21.225521739130436, "prompt_ms": 408.51300000000003, "prompt_n": 196, "prompt_per_second": 479.78889288712963, "prompt_per_token_ms": 2.0842500000000004}, "tokens_cached": 143, "tokens_evaluated": 196, "tokens_predicted": 46, "tool_calls": [{"function": [Object], "id": null, "type": "function"}], "truncated": true}

These two models don't natively support tool calls, so the generic tool call method is used.

The deepseek r1 model natively supports tool calls, so it's better at deciding when to use a tool_call, but it currently has some issues. After ggml-org/llama.cpp#11607 (which needs to be synced later), I think it should work. I'll do more tests for that.
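On the app side, consuming the result can look roughly like this for both the generic and native paths (a sketch; runPython is a hypothetical handler in your app, and arguments is assumed to be a JSON-encoded string as in OpenAI-style tool calls):

  const result = await context.completion({ /* messages, tools, ... */ })

  for (const call of result.tool_calls ?? []) {
    if (call.type !== 'function') continue
    // arguments is assumed to be a JSON-encoded string here
    const args = JSON.parse(call.function.arguments)
    if (call.function.name === 'ipython') {
      await runPython(args.code) // hypothetical app-side handler
    }
  }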

a2f0 (Author) commented Feb 17, 2025

@jhen0409 thanks for your eyes on this. I confirmed that I am able to receive the tool_call property in the completionResult using Phi 3. In case anybody is interested, here is the prompt that I used: "give me an example of something that I can use on an iPython interpreter". It produced the following completionResult:

  \"tool_call\": {
    \"name\": \"ipython\",
    \"arguments\": {
      \"code\": \"import numpy as np\\n\\narr = np.array([1, 2, 3, 4, 5])\\nprint('Mean:', np.mean(arr))\\nprint('Standard Deviation:', npstdev(arr))\"
    }
  }
} "

I see that you also recently added the missing TS definition for tool_calls in the NativeCompletionResult in 6572fc7. I appreciate your help with this.
