-
To convert voice to text using LangChain and the OpenAI API, you can send base64-encoded audio to the gpt-4o-audio-preview model through ChatOpenAI:

    import base64

    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        model="gpt-4o-audio-preview",
        temperature=0,
    )

    # Read the audio file and base64-encode it for the request payload.
    with open("audio_input.wav", "rb") as f:
        audio = f.read()
    audio_b64 = base64.b64encode(audio).decode()

    # Send a text instruction plus the audio in a single human message.
    output_message = llm.invoke(
        [
            (
                "human",
                [
                    {"type": "text", "text": "Transcribe the following:"},
                    {
                        "type": "input_audio",
                        "input_audio": {"data": audio_b64, "format": "wav"},
                    },
                ],
            ),
        ]
    )
    print(output_message.content)

This script reads an audio file, encodes it in base64, and sends it to the OpenAI API for transcription using LangChain [1].

If you're interested in using LangChain with Azure AI Services, you can use the [...]
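If you need to transcribe more than one file, the read-and-encode step from the script above can be pulled into a small helper. This is only a sketch: `build_transcription_message`, `transcribe`, and `SUPPORTED_FORMATS` are hypothetical names, and the list of accepted formats (wav and mp3) is an assumption about what the audio-preview model takes as input.

```python
import base64
from pathlib import Path

# Assumption: the audio-preview model accepts "wav" and "mp3" input.
SUPPORTED_FORMATS = {".wav": "wav", ".mp3": "mp3"}


def build_transcription_message(path, prompt="Transcribe the following:"):
    """Build the one-message list used by the script above, for any supported file."""
    suffix = Path(path).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported audio format: {suffix!r}")
    # Base64-encode the raw bytes so they can travel inside the JSON payload.
    audio_b64 = base64.b64encode(Path(path).read_bytes()).decode()
    return [
        (
            "human",
            [
                {"type": "text", "text": prompt},
                {
                    "type": "input_audio",
                    "input_audio": {
                        "data": audio_b64,
                        "format": SUPPORTED_FORMATS[suffix],
                    },
                },
            ],
        )
    ]


def transcribe(path):
    """Send the file to the model and return the transcription text."""
    # Imported lazily so the helper above works without LangChain installed.
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(model="gpt-4o-audio-preview", temperature=0)
    return llm.invoke(build_transcription_message(path)).content
```

Calling `transcribe("audio_input.wav")` should behave like the original script, provided `OPENAI_API_KEY` is set in the environment. Keeping the message construction separate from the API call makes it possible to check the payload shape without spending tokens.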
-
@dosu Can you also give me similar code for the other direction, converting text to voice files, please?
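For the text-to-voice direction, one possible approach is to ask the same gpt-4o-audio-preview model for audio output and write the returned base64 data to a file. Treat this as a rough sketch rather than a confirmed recipe: `save_audio_reply` and `text_to_voice` are hypothetical helper names, and it assumes that `ChatOpenAI` forwards the `modalities` and `audio` entries in `model_kwargs` to the API and exposes the generated audio under `additional_kwargs["audio"]["data"]`.

```python
import base64
from pathlib import Path


def save_audio_reply(audio_b64, out_path):
    """Decode base64 audio returned by the model and write it to disk."""
    data = base64.b64decode(audio_b64)
    Path(out_path).write_bytes(data)
    return len(data)  # number of bytes written


def text_to_voice(text, out_path="speech_output.wav"):
    """Ask the model to speak `text` and save the result as a WAV file."""
    # Imported lazily so save_audio_reply works without LangChain installed.
    from langchain_openai import ChatOpenAI

    llm = ChatOpenAI(
        model="gpt-4o-audio-preview",
        temperature=0,
        # Assumption: these keys are passed through to the OpenAI request.
        model_kwargs={
            "modalities": ["text", "audio"],
            "audio": {"voice": "alloy", "format": "wav"},
        },
    )
    reply = llm.invoke(
        [("human", f"Read the following text aloud exactly as written: {text}")]
    )
    # Assumption: the audio payload is exposed under additional_kwargs["audio"].
    return save_audio_reply(reply.additional_kwargs["audio"]["data"], out_path)
```

With `OPENAI_API_KEY` set, `text_to_voice("Hello from LangChain.")` would write `speech_output.wav` if the assumptions above hold. Because this is a chat model, it may paraphrase rather than read the text verbatim, so a dedicated speech endpoint may be a better fit when exact wording matters.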
-
In my POC, I need to convert voice to text. Can you give me some sample scripts using LangChain and the OpenAI API, or LangGraph?
Thanks for your help.