-
Notifications
You must be signed in to change notification settings - Fork 435
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Further consolidate service update settings into a single ServiceUpdateSettingsFrame class #527
Conversation
src/pipecat/frames/frames.py
Outdated
@@ -558,6 +558,8 @@ class TTSUpdateSettingsFrame(ControlFrame): | |||
style: Optional[str] = None | |||
style_degree: Optional[str] = None | |||
role: Optional[str] = None | |||
gender: Optional[str] = None | |||
google_style: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if this frame should just be:
@dataclass
class TTSUpdateSettingsFrame(ControlFrame):
settings: Mapping[str, Any]
and then use "gender"
, etc. where needed. Because we might have tons of other settings and by just looking at this frame we have no idea which setting corresponds to each service.
If we do that, maybe we can just have:
@dataclass
class ServiceUpdateSettingsFrame(ControlFrame):
settings: Mapping[str, Any]
which is generic for all services. No need for LLMUpdateSettingsFrame
or TTSUpdateSettingsFrame
.
I know we lose completions and types but another benefit is that these won't have to change at all and people writing their own services (and not sharing them) they can keep using the frame. Actually, the same idea with extra
, but we make all extra
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we did that, it could be more like the metrics and we could have defined types for each of the various known services. that said, this would be a pretty major breaking change, yeah?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i take that back. the metrics aren't really service specific. i mis-remembered. but there could still be some typing there eventually. maybe start with Any for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i take it back again, we could still have different types defined for llm vs tts vs stt. I'm not sure where the google specific goes, though. Is there a way google.py could expand? or maybe instead of having all settings by [str, Any], there's a custom_settings
that maps to any?
// in frames.py
@dataclass
class UpdateSettingsFrame(ControlFrame):
settings: SettingsUpdate
// new settingsUpdates.py ? (name tbd)
class SettingsUpdate(BaseModel):
model: Optional[str] = None
custom: Optional[Mapping[str, Any]]
class LLMSettingsUpdate(SettingsUpdate):
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw. working it through with mark, i realized that since these frames are created in client code, the typing described above didn't really make sense.
src/pipecat/frames/frames.py
Outdated
gender: Optional[str] = None | ||
google_style: Optional[str] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not a fan of having such custom service settings in the base class, but given the current way things are setup, it seems like this is the best option. we should consider if there's a better way to handle custom setting updates in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it's the same comment I posted. Would my proposal work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i think we were reviewing at the same time. I like your proposal and commented on how it might be fleshed out a bit more to maintain some typing.
226ad77
to
f98e3ca
Compare
src/pipecat/services/anthropic.py
Outdated
async def _update_settings(self, settings: Dict[str, Any]): | ||
for key, value in settings.items(): | ||
setter = getattr(self, f"set_{key}", None) | ||
if setter and callable(setter): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if it's worth it to have a self._settings: Dict[str, Any]
in the class instead if one field per setting.
Then you could just do:
for key, value in settings.items():
if key in settings:
self._settings[key] = value
else:
logger.warning(f"Unknown setting for Anthropic LLM service: {key}")
and in the constructor we initialize self._settings = { "key": value, .... }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's just that
setter = getattr(self, f"set_{key}", None)
if setter and callable(setter):
feels a bit wrong.
08d0738
to
91dfc0b
Compare
91dfc0b
to
9713d75
Compare
await self._disconnect() | ||
await self._connect() | ||
|
||
async def set_voice_settings( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how would this function have been used before? i don't see any usage of it in the codebase and i'm trying to think through if removing this custom-looking handling of the settings breaks things
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I misunderstood how this could be used. It wasn't accessible before! So, it should be safe to remove.
if self._websocket: | ||
msg = {"voice_settings": self._voice_settings} | ||
await self._websocket.send(json.dumps(msg)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to expand on above... maybe this needs to override _update_settings()
to be something like:
async def _update_settings(self, settings: Dict[str, Any]):
super()._update_settings(settings)
self._voice_settings = self._set_voice_settings()
if self._websocket:
msg = {"voice_settings": self._voice_settings}
await self._websocket.send(json.dumps(msg))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apparently this function was not needed to begin with and should be removed and functionality not preserved, so this can be ignored... but not the one above about set_voice
@@ -167,34 +180,6 @@ async def set_model(self, model: str): | |||
await self._disconnect() | |||
await self._connect() | |||
|
|||
async def set_voice(self, voice: str): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here. this has custom handling that's more than just setting the voice.... updating my theoretical code below to include a potential way to handle it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nevermind. that one can be ignored... i think this needs to be added to custom handle voice changes though:
async def _update_settings(self, settings: Dict[str, Any]):
prev_voice = self._settings["voice"]
super()._update_settings(settings)
if not prev_voice == self._settings["voice"]:
await self._disconnect()
await self._connect()
TBH. The fact that this behavior also happens when the model changes, I'm wondering if elevenlabs requires a disconnect/connect to apply ANY changes 🤔
async def _update_settings(self, settings: Dict[str, Any]):
prev_settings = self._settings
super()._update_settings(settings)
if not prev_settings == self._settings:
await self._disconnect()
await self._connect()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'll leave further review to aleix. but to me, other than the couple of places that need custom handling I pointed out, this looks like good cleanup!
LGTM! |
e815da8
to
c8b17c5
Compare
c8b17c5
to
406318e
Compare
406318e
to
28643b4
Compare
7527a26
to
d75a02d
Compare
LGTM! |
Hijacked this PR to make the change to simplify the service update settings. There's not a base class for service updates called
ServiceUpdateSettingsFrame
. There are subclasses for different service types:LLMUpdateSettingsFrame
,TTSUpdateSettingsFrame
,STTUpdateSettingsFrame
.When these frames are processed, there's an
_update_settings
handler in ai_services.py that is responsible for facilitating the update. For most values, this is a key lookup. There is special case handling formodel
andvoice
needed to ensure keys are mapped correctly.For each service that has InputParams, those params are part of a
_settings
object which is updated by the_update_settings
handlers.This change simplifies maintaining settings updates for services. Going forward, the keys are to:
voice_id
andmodel
are used uniformly for services. Previously, the same change was made forlanguage
._update_settings
method will do the trick unless special handling is required. You can see an example of that for Cartesia'sspeed
andemotion
params.Lastly, in this PR, I'm also applying the Language type across all services. Includes must be from the Language enum. Language strings are translated to the correct format for each service before applying them.