Preemptive request for a roadmap for audio streaming with gpt-4o when OpenAI releases API access to it #21788
Duncan-Haywood started this conversation in Ideas
Replies: 1 comment
- I want it
Feature request
When API access to the audio input and output interface for the gpt-4o model is released, I would appreciate support for audio input and streaming audio output.
Motivation
This is a needed use case for our company, and it is significant enough that we will implement it in our own code base outside of LangChain if LangChain cannot support it. The motivation is to use one cheaper model for audio in and audio out rather than three (speech-to-text, a text-to-text LLM, and text-to-speech). The new gpt-4o can act as a single audio-to-audio model, cutting latency for such requests from a few seconds (around 3 s) to a few hundred milliseconds (around 200 ms). This latency improvement is the primary reason for the request. We would still like to stay in the LangChain ecosystem for traceability, integration with other tools, and so on.
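For context, the three-model pipeline described above can be sketched with the OpenAI Python SDK (the `transcriptions`/`chat.completions`/`speech` calls are the SDK's existing interface; the wiring is illustrative and is not an existing or proposed LangChain API):

```python
def audio_round_trip(client, audio_file):
    """Current three-model pipeline: speech-to-text -> LLM -> text-to-speech.

    `client` is assumed to be an `openai.OpenAI` instance. Each stage is a
    separate network round trip, which is where the multi-second latency
    mentioned in the motivation comes from.
    """
    # 1. Speech-to-text (Whisper)
    transcript = client.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )
    # 2. Text-to-text (chat completion)
    reply = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": transcript.text}],
    )
    # 3. Text-to-speech
    speech = client.audio.speech.create(
        model="tts-1", voice="alloy",
        input=reply.choices[0].message.content,
    )
    return speech.content
```

A native audio-to-audio endpoint would collapse these three round trips into one, which is the latency win the request is after.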
Proposal (If applicable)
No response