Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(drivers-prompt-openai):add audio input/output support to OpenAiChatPromptDriver #1617

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

collindutter
Copy link
Member

@collindutter collindutter commented Jan 24, 2025

Describe your changes

Added

  • Support for AudioArtifact inputs/outputs in OpenAiChatPromptDriver.

Issue ticket number and link

Closes #1601

Copy link

codecov bot commented Jan 29, 2025

Codecov Report

Attention: Patch coverage is 96.55172% with 3 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...iptape/drivers/prompt/openai_chat_prompt_driver.py 93.33% 0 Missing and 2 partials ⚠️
griptape/tasks/prompt_task.py 80.00% 0 Missing and 1 partial ⚠️

📢 Thoughts on this report? Let us know!

@collindutter collindutter force-pushed the feature/openai-audio branch 2 times, most recently from 5f9394f to 73e246a Compare January 30, 2025 00:03
@collindutter
Copy link
Member Author

collindutter commented Jan 30, 2025

Still need docs, code is ready for review though.

@collindutter collindutter requested a review from vachillo January 30, 2025 00:15
@collindutter collindutter marked this pull request as ready for review January 30, 2025 00:15
@collindutter collindutter force-pushed the feature/openai-audio branch 2 times, most recently from 442d68f to f26f6b3 Compare February 4, 2025 17:53
@collindutter collindutter force-pushed the feature/openai-audio branch 2 times, most recently from 37f793f to f023372 Compare February 5, 2025 02:02
@collindutter collindutter changed the title OpenAi Audio Inputs/Outputs feat(drivers-prompt-openai):add audio input/output support to OpenAiChatPromptDriver Feb 5, 2025
@collindutter collindutter force-pushed the feature/openai-audio branch 2 times, most recently from bc9a49c to 85e5497 Compare February 7, 2025 16:19
@@ -177,6 +180,8 @@ def __process_stream(self, prompt_stack: PromptStack) -> Message:
delta_contents[content.index] = [content]
if isinstance(content, TextDeltaMessageContent):
EventBus.publish_event(TextChunkEvent(token=content.text, index=content.index))
elif isinstance(content, AudioDeltaMessageContent) and content.data is not None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what would be the scenario where data is None?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenAi sometimes sends chunks with no content. For example, a final chunk that contains the usage.

Comment on lines 173 to 174
"modalities": self.modalities,
"audio": self.audio,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should these follow the same **(...) syntax?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They get ignored when using non-audio models. But I can see openai-compatible endpoints throwing a fit. I'll make conditional.

elif (
isinstance(content, AudioMessageContent)
and message.is_assistant()
and time.time() < content.artifact.meta.get("expires_at", float("inf"))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the scenario where there is no expires_at? are we just assuming no expires_at == try it anyway and see if its expired or not?

artifact = content.artifact

# We can't send the audio if it's expired.
if int(time.time()) > artifact.meta.get("expires_at", float("inf")):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opposite logic here. wondering if we should just always try to send the transcript if there is no expires_at. i also probably dont understand when certain fields are set or unset, and if they are allowed to be None or not

@@ -257,7 +265,7 @@ def __init_from_artifact(self, artifact: TextArtifact | ListArtifact) -> None:
None
"""
# When using native tools, we can assume that a TextArtifact is the LLM providing its final answer.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support OpenAi Audio Inputs/Outputs
2 participants