[design] Speech to text and text to speech nodes in pipelines #1136

snopoke · 2025-02-04T13:42:36Z

Related to #1137

Write up a proposal for how to handle TTS and STT in pipelines.

In legacy bots it is handled in the Channels (https://github.com/dimagi/open-chat-studio/blob/08b37e392b229a4d6ff432347da403679881def0/apps/chat/channels.py#L430 and line 396 of open-chat-studio/apps/chat/channels.py in 08b37e3: def _reply_voice_message(self, text: str):).

This approach still works with pipelines, but the pipeline itself should be responsible for handling this to make it more flexible: maybe you want to use Whisper, or maybe you are using a multi-modal model that will accept the attachment directly.
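
To make the idea concrete, here is a rough sketch of what pipeline-level STT and TTS nodes could look like. Everything in it is hypothetical and for discussion only: Message, SpeechToTextNode, TextToSpeechNode and run_pipeline are not existing Open Chat Studio classes, and the transcribe / synthesize callables stand in for whatever backend is chosen (for example a wrapper around Whisper).

```python
from dataclasses import dataclass, replace
from typing import Callable, Optional, Protocol


@dataclass(frozen=True)
class Message:
    """Hypothetical pipeline payload: text plus an optional audio attachment."""

    text: str = ""
    audio: Optional[bytes] = None


class Node(Protocol):
    def process(self, message: Message) -> Message: ...


@dataclass
class SpeechToTextNode:
    # Any callable mapping audio bytes to text (assumed, e.g. a Whisper wrapper).
    transcribe: Callable[[bytes], str]

    def process(self, message: Message) -> Message:
        if message.audio is None:
            return message  # plain text input passes through untouched
        # Replace the attachment with its transcript for downstream nodes.
        return replace(message, text=self.transcribe(message.audio), audio=None)


@dataclass
class TextToSpeechNode:
    # Any callable mapping text to audio bytes (assumed backend).
    synthesize: Callable[[str], bytes]

    def process(self, message: Message) -> Message:
        # Keep the text and attach the synthesized audio alongside it.
        return replace(message, audio=self.synthesize(message.text))


def run_pipeline(nodes: list[Node], message: Message) -> Message:
    """Run the nodes in order, feeding each node's output into the next."""
    for node in nodes:
        message = node.process(message)
    return message


if __name__ == "__main__":
    # Stand-in backends so the sketch runs without any speech service.
    stt = SpeechToTextNode(transcribe=lambda audio: audio.decode())
    tts = TextToSpeechNode(synthesize=lambda text: text.upper().encode())
    print(run_pipeline([stt, tts], Message(audio=b"hello")))
```

With this shape, a pipeline built for a multi-modal model would simply omit the STT node, so the audio attachment reaches the model unchanged, while a text-only pipeline would put the STT node first.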

snopoke added the pipelines Issue is related to pipelines label Feb 4, 2025

snopoke moved this to Prioritized in OpenChatStudio Feb 4, 2025

snopoke added this to OpenChatStudio Feb 4, 2025

snopoke changed the title ~~Speech to text and text to speech nodes in pipelines~~ [design] Speech to text and text to speech nodes in pipelines Feb 7, 2025

snopoke mentioned this issue Feb 7, 2025

Support for multi-modal LLMs #1137

Open
