Catch token count issue while streaming with customized models #3241

Merged
6 commits merged on Sep 25, 2024

Commits on Jul 28, 2024

  1. Catch token count issue while streaming with customized models

    If llama, llava, phi, or some other customized models are used for streaming (with stream=True), the current design crashes after fetching the response because token usage cannot be counted for these models.

    A warning is enough in this case, just as in the non-streaming use cases.
    BeibinLi authored Jul 28, 2024
    Commit a40aa56
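
A minimal sketch of the pattern this commit describes: when token usage cannot be computed for a customized model after a streamed response, log a warning instead of raising. The helper names (count_streamed_tokens, finalize_streamed_response) and the known-model set are hypothetical stand-ins to illustrate the idea, not the repository's actual API.

```python
# Sketch only: assumes a token counter that raises for models it does not know.
import logging

logger = logging.getLogger(__name__)


def count_streamed_tokens(model: str, chunks: list[str]) -> int:
    """Hypothetical token counter; customized models (llama, llava, phi, ...)
    may not be in its tables and raise KeyError here."""
    known_models = {"gpt-4o", "gpt-4", "gpt-3.5-turbo"}
    if model not in known_models:
        raise KeyError(f"Unknown model for token counting: {model}")
    return sum(len(chunk.split()) for chunk in chunks)  # crude stand-in


def finalize_streamed_response(model: str, chunks: list[str]) -> dict:
    """Assemble a response from streamed chunks without crashing on token counting."""
    content = "".join(chunks)
    try:
        total_tokens = count_streamed_tokens(model, chunks)
    except (KeyError, ValueError) as err:
        # Customized models may not be supported by the token counter:
        # warn and report zero usage instead of failing after the stream finished,
        # mirroring the non-streaming behavior described in the commit message.
        logger.warning("Cannot count tokens for model %s: %s", model, err)
        total_tokens = 0
    return {"content": content, "total_tokens": total_tokens}


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(finalize_streamed_response("phi-3", ["Hello, ", "world!"]))
```

Catching only narrow exception types keeps genuine bugs visible while letting unknown models stream to completion with a warning.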

Commits on Jul 29, 2024

  1. Commit c27f0a9
  2. Commit beb105a

Commits on Aug 6, 2024

  1. Commit 286d647

Commits on Sep 25, 2024

  1. Commit 7d1a110
  2. Commit 86b9089