feat: Transcription Model support #322

yavens · 2025-02-25T02:48:15Z

Adds support for speech/audio-to-text generation models and integrates OpenAI's Whisper model.

Notes:

Enables reqwest multipart feature flag.

0xMochan

Looks pretty solid, just a couple of improvement suggestions. This also adds to the case for splitting openai into multiple files in a future PR ;)

0xMochan · 2025-02-25T04:54:21Z

rig-core/Cargo.toml

@@ -15,7 +15,7 @@ doctest = false
 # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

 [dependencies]
-reqwest = { version = "0.11.22", features = ["json", "stream"] }
+reqwest = { version = "0.11.22", features = ["json", "stream", "multipart"] }


Is there a case to make this optional? It would be good to include some analysis of what other deps this feature pulls in when it's enabled.

I'll look into what it adds

Looks like the multipart feature flag only adds the mime_guess and mime libraries. Their dependencies are already in the project from other crates.

0xMochan · 2025-02-25T08:41:44Z

rig-core/src/transcription.rs

+    /// The file data to be sent to the transcription model provider
+    pub data: Vec<u8>,
+    /// The file name to be used in the request
+    pub filename: String,


I think this should be Option<String>. I understand it's necessary for multipart, but in situations where you don't have files, it doesn't make sense to set a filename when building a transcription request. IMO, it's better to let the developer omit the filename (defaulting to None when building) and then fill in a default when actually executing the request.

Or, use #[serde(default)] so the filename defaults to "" in the request (so if the builder doesn't set a filename, the default applies). I'm not sure which is better, so just make a case :)

Went with the former here
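For readers following the thread, here is a minimal sketch of the first option (an optional filename that is only resolved when the request is executed). Only the `data` and `filename` fields come from the diff above; the `TranscriptionRequest` name, the builder-style methods, and the `DEFAULT_FILENAME` fallback are assumptions for illustration, not the code that was merged.

```rust
/// Hypothetical request type. Only the `data` and `filename` fields
/// appear in the PR; everything else here is illustrative.
#[derive(Debug, Default)]
pub struct TranscriptionRequest {
    /// The file data to be sent to the transcription model provider
    pub data: Vec<u8>,
    /// The file name to be used in the request, if the caller has one
    pub filename: Option<String>,
}

/// Hypothetical fallback used when the caller never set a filename.
pub const DEFAULT_FILENAME: &str = "file";

impl TranscriptionRequest {
    /// Start a request from raw bytes; no filename is required up front.
    pub fn new(data: Vec<u8>) -> Self {
        Self { data, filename: None }
    }

    /// Optionally attach a filename.
    pub fn filename(mut self, name: impl Into<String>) -> Self {
        self.filename = Some(name.into());
        self
    }

    /// Resolve the name to put in the multipart part at execution time.
    pub fn effective_filename(&self) -> &str {
        self.filename.as_deref().unwrap_or(DEFAULT_FILENAME)
    }
}
```

With this shape, callers that only have bytes in memory never need to invent a name, and the provider code decides on the default in one place.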

0xMochan · 2025-02-25T08:46:20Z

rig-core/src/transcription.rs

+    }
+
+    /// Load the specified file into data
+    pub fn load_file(self, path: &str) -> Self {


Suggested change

-    pub fn load_file(self, path: &str) -> Self {
+    pub fn load_file<P>(self, path: P) -> Self
+    where
+        P: AsRef<path::Path> {

This lets you pass anything that can be viewed as a path (&str, String, PathBuf, etc.), which is more flexible. Because it's generic, the compiler generates a separate copy of the function for each argument type you call it with. We could apply this technique to other places in the codebase, but I haven't fully explored the pros and cons (larger binaries and less readable code vs. flexibility and speed).

Many things in std do it this way!

Oh neat, didn't even think about that
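A small sketch of the `AsRef<Path>` pattern discussed above, for anyone who wants to try it in isolation. The builder type, its fields, and the `io::Result` return type are assumptions made so the example can stand alone; the PR's `load_file` returns `Self`, and its error handling is not shown in this thread. Filling in `filename` from the path is also an assumption. The non-generic inner function is the same trick `std::fs::read` uses to keep the per-type generated code small, which speaks to the binary-size concern.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical builder; the name and fields are illustrative only.
#[derive(Debug, Default)]
pub struct TranscriptionRequestBuilder {
    pub data: Vec<u8>,
    pub filename: Option<String>,
}

impl TranscriptionRequestBuilder {
    /// Load the specified file into `data`.
    ///
    /// Accepts `&str`, `String`, `&Path`, `PathBuf`, and anything else
    /// that implements `AsRef<Path>`.
    pub fn load_file<P: AsRef<Path>>(self, path: P) -> io::Result<Self> {
        // The generic wrapper only converts the argument; the real work
        // lives in a non-generic function that is compiled once.
        fn inner(
            mut this: TranscriptionRequestBuilder,
            path: &Path,
        ) -> io::Result<TranscriptionRequestBuilder> {
            this.data = fs::read(path)?;
            // Assumption: derive the filename from the path when present.
            this.filename = path
                .file_name()
                .map(|name| name.to_string_lossy().into_owned());
            Ok(this)
        }
        inner(self, path.as_ref())
    }
}
```

Call sites can then write `builder.load_file("audio.mp3")` or pass a `PathBuf` without converting first.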

yavens added 3 commits February 17, 2025 10:33

feat: transcription model support

844c28d

chore: Merge branch 'main' into feat/transcription

04fe1c1

chore: run cargo fmt

1137e10

yavens added the non-breaking label Feb 25, 2025

yavens requested a review from 0xMochan February 25, 2025 02:48

yavens added 2 commits February 24, 2025 21:55

fix: update doc strings and Transcription trait

77a3021

fix: &Vec<u8> -> &[u8]

bfb13d3

0xMochan requested changes Feb 25, 2025

fix: implement Mochan's suggestions

ecf06aa
