How do we feel about changing a utility function's return type from `bool` to `int`? This should not be a breaking change for any actual code, modulo type annotations complaining, perhaps?

This PR fixes the fact that we were looking for end-of-sentence patterns anywhere in a string, but not splitting the string at the match. This worked well for GPT-4 models, which seem to always send chunks that break cleanly on sentence boundaries. But other models don't necessarily do that, so we were sending the first word, or even a fragment of the first word, of the next sentence through to the TTS service, playing havoc with prosody.
I also changed the `_push_tts_frames()` code so that we no longer strip whitespace from the beginning and end of each chunk we send to the TTS. That was fine when all TTS models were non-stateful invocations via HTTP. But now that we can stream to several of the TTS services, whitespace can be important for prosody.
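For reviewers, here is a minimal sketch of the idea, not the code in this PR. The names `find_sentence_end` and `SentenceAggregator` and the regex are hypothetical, chosen only for illustration. It shows why an `int` return (the index just past the sentence terminator) lets the caller split the buffer, while still working in existing `if ...:` truthiness checks, and how leading whitespace can be carried through unstripped.

```python
import re
from typing import List

# Hypothetical end-of-sentence pattern: '.', '!' or '?' followed by
# whitespace or the end of the string. A real pattern would likely need
# to handle abbreviations, decimals, and other languages.
_EOS = re.compile(r"[.!?](?=\s|$)")


def find_sentence_end(text: str) -> int:
    """Return the index just past the first sentence terminator, or 0 if none.

    Because 0 is falsy, callers that only test truthiness keep working.
    """
    match = _EOS.search(text)
    return match.end() if match else 0


class SentenceAggregator:
    """Buffers streamed LLM chunks and emits complete sentences."""

    def __init__(self) -> None:
        self._buffer = ""

    def feed(self, chunk: str) -> List[str]:
        self._buffer += chunk
        sentences = []
        # Split at the match instead of flushing the whole buffer, so the
        # start of the next sentence stays buffered.
        while True:
            end = find_sentence_end(self._buffer)
            if not end:
                break
            # No strip(): leading and trailing whitespace is preserved.
            sentences.append(self._buffer[:end])
            self._buffer = self._buffer[end:]
        return sentences


agg = SentenceAggregator()
first = agg.feed("Hel")              # no terminator yet
second = agg.feed("lo there. How")   # "How" stays in the buffer
third = agg.feed(" are you?")        # leading space is kept
```

In this sketch the second call emits only `"Hello there."` and holds back `" How"` until the rest of that sentence arrives, rather than sending the fragment on to the TTS.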