Catch token count issue while streaming with customized models #3241

BeibinLi · 2024-07-28T23:11:59Z

If llama, llava, phi, or other customized models are used for streaming (with stream=True), the current design crashes on the token count after fetching the response.

A warning is enough in this case, as in the non-streaming use cases.
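
A minimal sketch of the idea, not the actual code in autogen/oai/client.py: the names `count_tokens`, `safe_count_tokens`, and `KNOWN_MODELS` are hypothetical, and whitespace splitting stands in for a real tokenizer. It assumes the counter raises NotImplementedError for models it does not recognize, and shows that error being downgraded to a warning while other errors still propagate.

```python
import logging

logger = logging.getLogger(__name__)

# Hypothetical set of models the toy counter understands.
KNOWN_MODELS = {"gpt-4", "gpt-3.5-turbo"}


def count_tokens(text: str, model: str) -> int:
    """Toy token counter that raises NotImplementedError for unknown models."""
    if model not in KNOWN_MODELS:
        raise NotImplementedError(
            f"Token counting is not implemented for model {model!r}."
        )
    # Whitespace splitting stands in for a real tokenizer (assumption).
    return len(text.split())


def safe_count_tokens(text: str, model: str) -> int:
    """Return a token count, or 0 with a warning if the model is unsupported."""
    try:
        return count_tokens(text, model)
    except NotImplementedError as exc:
        # Only this exception is swallowed; any other error still propagates.
        logger.warning("%s Token usage will be reported as 0.", exc)
        return 0
```

With this shape, a streamed response from an unrecognized model would still be returned to the caller, and only the usage numbers would be missing. Catching only NotImplementedError, rather than every exception, keeps unrelated bugs visible.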

Checks

I've included any doc changes needed for https://microsoft.github.io/autogen/. See https://microsoft.github.io/autogen/docs/Contribute#documentation to build and test documentation locally.
I've added tests (if relevant) corresponding to the changes introduced in this PR.
I've made sure all auto checks have passed.


codecov-commenter · 2024-07-28T23:19:15Z

Codecov Report

Attention: Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Project coverage is 21.29%. Comparing base (6aaa238) to head (7d1a110).
Report is 4 commits behind head on main.

Files with missing lines	Patch %	Lines
autogen/oai/client.py	0.00%	5 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #3241       +/-   ##
===========================================
- Coverage   33.24%   21.29%   -11.95%     
===========================================
  Files          99       99               
  Lines       11016    11020        +4     
  Branches     2365     2537      +172     
===========================================
- Hits         3662     2347     -1315     
- Misses       7026     8507     +1481     
+ Partials      328      166      -162

Flag	Coverage Δ
unittests	21.26% <0.00%> (-11.99%) ⬇️


autogen/oai/client.py

Catch token count issue while streaming with customized models

a40aa56

If llama, llava, phi, or some other models are used for streaming (with stream=True), the current design would crash after fetching the response. A warning is enough in this case, just like the non-streaming use cases.

BeibinLi had a problem deploying to openai1 July 28, 2024 23:12 — with GitHub Actions Failure

sonichi reviewed Jul 29, 2024

Review comment on autogen/oai/client.py (outdated, resolved)

sonichi requested review from olgavrou, yiranwu0 and marklysze July 29, 2024 14:43

Only catch not implemented error

c27f0a9

BeibinLi had a problem deploying to openai1 July 29, 2024 18:26 — with GitHub Actions Failure

jackgerrits had a problem deploying to openai1 September 25, 2024 14:23 — with GitHub Actions Failure

jackgerrits enabled auto-merge September 25, 2024 14:23

jackgerrits had a problem deploying to openai1 September 25, 2024 14:23 — with GitHub Actions Failure

Merge branch 'main' into stream-token-count

86b9089

jackgerrits had a problem deploying to openai1 September 25, 2024 14:56 — with GitHub Actions Failure

jackgerrits added this pull request to the merge queue Sep 25, 2024

Merged via the queue into main with commit ece6924 Sep 25, 2024
145 of 157 checks passed

jackgerrits deleted the stream-token-count branch September 25, 2024 15:20
