Tracking token usage? #29
I see that the API supports `.message_token_len()` for an individual `ChatMessage`; it would be nice to be able to query total token usage over the course of a conversation for cost-tracking purposes. I'm not entirely sure of the best way to handle it - maybe a `.next_message_tokens_cost(message: ChatMessage)` that would return the total prompt tokens (system + function defs + chat history) plus the tokens in `message` that would be incurred? If it could be done over the course of a chat (accumulating after each full round), maybe something like `.conversation_history_total_prompt_tokens()` and `.conversation_history_total_response_tokens()` so a user could compute a running chat cost?

Thanks for considering, and for developing Kani! It really is the 'right' API interface to tool-enabled LLMs, in my opinion :)
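For illustration, a rough sketch of what those proposed helpers could look like on top of the existing public API; the method names follow the proposal above and are hypothetical, and the use of `always_included_messages` for the system prompt is my assumption (engine-specific function-definition overhead is ignored):

```python
from kani import ChatMessage, ChatRole, Kani

class CostTrackingKani(Kani):
    """Hypothetical sketch of the proposed helpers; not part of Kani's API."""

    def next_message_tokens_cost(self, message: ChatMessage) -> int:
        # prompt = system/always-included messages + chat history + the new message
        # (function-definition token overhead is engine-specific and ignored here)
        prompt = self.always_included_messages + self.chat_history
        return sum(self.message_token_len(m) for m in prompt) + self.message_token_len(message)

    def conversation_history_total_response_tokens(self) -> int:
        # tokens across all assistant replies recorded so far
        return sum(
            self.message_token_len(m)
            for m in self.chat_history
            if m.role == ChatRole.ASSISTANT
        )
```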
Comments

Thanks for the kind words! I should note that oftentimes the internals of the LLM providers (in particular OpenAI) are a bit of a mystery, so Kani's token counting is really just a best guess to within a couple of percent. You have a couple of options if you want to track tokens as accurately as possible.

I'll have to think a bit more about how to implement an official token counting interface if we decide to add one.
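One post-hoc approach (a sketch of my own, not necessarily the exact suggestion referenced below; the engine, API key, and model are placeholders) is to sum `message_token_len()` over the recorded history after a round completes:

```python
import asyncio

from kani import Kani
from kani.engines.openai import OpenAIEngine

async def main():
    engine = OpenAIEngine(api_key="sk-...", model="gpt-4")  # placeholder credentials/model
    ai = Kani(engine, system_prompt="You are a helpful assistant.")
    await ai.chat_round("What is the airspeed velocity of an unladen swallow?")
    # best-guess post-hoc estimate: sum token lengths over the recorded history
    total = sum(ai.message_token_len(m) for m in ai.chat_history)
    print(f"~{total} tokens in chat history")

asyncio.run(main())
```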
Wonderful, thank you! Post-hoc is fine for my case; I used your first suggestion and it works great. I did need to remember to update the counts manually when calling out to sub-kanis. (Maybe engine-level counting?)
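As an illustration of that manual bookkeeping for sub-kanis (the `total_tokens` counter and the `delegate` function are hypothetical, not part of Kani):

```python
from kani import Kani, ai_function

class ParentKani(Kani):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.total_tokens = 0  # hypothetical running counter

    @ai_function()
    async def delegate(self, instructions: str):
        """Send a task to a sub-kani and return its answer."""
        sub = Kani(self.engine)
        replies = [msg.text async for msg in sub.full_round(instructions)]
        # fold the sub-kani's usage back into the parent's count by hand
        self.total_tokens += sum(sub.message_token_len(m) for m in sub.chat_history)
        return "\n".join(r for r in replies if r)
```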
Hello again :) I've started playing with the newer streaming functionality, and it's been nice! However, it looks like the token counts aren't as easy to get at when streaming. Thanks as always!
Good call-out - this is a little less elegant with streaming. In the current version (1.0.1), your best option is probably to look at the completion returned by the stream once it finishes. For example:

```python
stream = ai.chat_round_stream("What is the airspeed velocity of an unladen swallow?")
async for token in stream:
    print(token, end="")
completion = await stream.completion()
prompt_tokens = completion.prompt_tokens
completion_tokens = completion.completion_tokens
# ...
# msg = await stream.message()
```

In v1.0.0 I added a private `add_completion_to_history` hook; I'll update this thread with new code snippets (probably later today?) once that's done.
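Building on that snippet, a running total across rounds could be kept by hand; the `usage` dict here is illustrative:

```python
usage = {"prompt": 0, "completion": 0}  # illustrative running totals

async def counted_round(ai, query):
    stream = ai.chat_round_stream(query)
    async for token in stream:
        print(token, end="")
    completion = await stream.completion()
    # guard against engines that don't report usage (counts may be None)
    usage["prompt"] += completion.prompt_tokens or 0
    usage["completion"] += completion.completion_tokens or 0
    return await stream.message()
```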
As of the latest version, you can override `add_completion_to_history`:

```python
class TokenCountingKani(Kani):
    # ...
    async def add_completion_to_history(self, completion):
        prompt_tokens = completion.prompt_tokens
        completion_tokens = completion.completion_tokens
        # ...
        return await super().add_completion_to_history(completion)
```

Note that `prompt_tokens` and `completion_tokens` can be `None` if the engine doesn't report usage for a given completion.
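Filling in the elisions, a complete version of that override might look like the following; the counter attributes are my own addition for illustration, while `add_completion_to_history` and the completion's token fields come from the snippet above:

```python
from kani import Kani

class TokenCountingKani(Kani):
    """Accumulates reported token usage over the whole conversation."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.total_prompt_tokens = 0
        self.total_completion_tokens = 0

    async def add_completion_to_history(self, completion):
        # counts may be None if the engine doesn't report usage
        self.total_prompt_tokens += completion.prompt_tokens or 0
        self.total_completion_tokens += completion.completion_tokens or 0
        return await super().add_completion_to_history(completion)
```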
Wonderful, I will give it a try, thank you!