
Add LiveKit audio transport #467

Merged
7 commits merged into pipecat-ai:main on Sep 27, 2024

Conversation

Contributor

@joachimchauvet joachimchauvet commented Sep 17, 2024

This might not be perfect yet, but I would love to get some initial feedback.
It looks like there are quite a few people interested (see #325).

try:
    from livekit import rtc
except ModuleNotFoundError as e:
    logger.error(f"Exception: {e}")
    logger.error("In order to use LiveKit, you need to `pip install pipecat-ai[livekit]`.")

We probably want to add livekit to pyproject.toml with the right dependencies.

@aconchillo
Contributor

This looks really good! I'll take a final look later, thank you!!!

@cyrilS-dev
Contributor

This looks amazing, @joachimchauvet! I'm particularly interested in this PR as I'm using LiveKit.
I will give it a try as soon as possible and provide feedback. Looking forward to testing it out!
Many thanks

@cyrilS-dev
Contributor

@joachimchauvet I just tested it and I can't play the audio when I join the room using the LK playground. I join the same room as my agent, both participants are active in the room, and the room events are consistent, but the 'Agent connected' status on the playground remains pending, as does the audio track.

@joachimchauvet
Contributor Author

@joachimchauvet I just tested it and I can't play the audio when I join the room using the LK playground. I join the same room as my agent, both participants are active in the room, and the room events are consistent, but the 'Agent connected' status on the playground remains pending, as does the audio track.

It seems to work fine on my side.
However, I noticed that I forgot to adjust the sample rate in https://github.com/joachimchauvet/pipecat-livekit/blob/main/examples/foundational/01b-livekit-audio.py
Could it be that the voice was playing so fast that you did not really hear it if you were using that example?
Are you using a different token for the bot and the user?
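For reference, here is a minimal sketch of issuing separate tokens for the bot and the user with the livekit-api Python package (the identities, room name, and environment variables below are illustrative placeholders, not part of this PR):

# Sketch only: issue distinct LiveKit tokens for the bot and the human user.
# Assumes the livekit-api package; identities and room name are placeholders.
import os

from livekit import api


def make_token(identity: str, room: str) -> str:
    return (
        api.AccessToken(os.environ["LIVEKIT_API_KEY"], os.environ["LIVEKIT_API_SECRET"])
        .with_identity(identity)  # must differ between the bot and the user
        .with_grants(api.VideoGrants(room_join=True, room=room))
        .to_jwt()
    )


bot_token = make_token("pipecat-bot", "my-room")
user_token = make_token("playground-user", "my-room")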

@aconchillo
Contributor

This looks great! Please rebase and make sure the pipeline is green. We are now using Ruff as our formatter. Make sure everything is green and I think we can merge after that. Thank you! 🙏

@cyrilS-dev
Contributor

cyrilS-dev commented Sep 23, 2024

@joachimchauvet I encountered an error while using the LiveKitInputTransport in my application:

Error in audio input task: 'LiveKitInputTransport' object has no attribute '_internal_push_frame'

The _internal_push_frame method is not defined in the LiveKitInputTransport class.

@joachimchauvet
Contributor Author

@joachimchauvet I encountered an error while using the LiveKitInputTransport in my application:

Error in audio input task: 'LiveKitInputTransport' object has no attribute '_internal_push_frame'

The _internal_push_frame method is not defined in the LiveKitInputTransport class.

Were you doing something special or did you make changes to the LiveKitInputTransport? It inherits from BaseInputTransport, so _internal_push_frame should be defined.
The method works fine on my side 🤔

@cyrilS-dev
Contributor

cyrilS-dev commented Sep 24, 2024

@joachimchauvet I encountered an error while using the LiveKitInputTransport in my application:
Error in audio input task: 'LiveKitInputTransport' object has no attribute '_internal_push_frame'
The _internal_push_frame method is not defined in the LiveKitInputTransport class.

Were you doing something special or did you make changes to the LiveKitInputTransport? It inherits from BaseInputTransport, so _internal_push_frame should be defined. The method works fine on my side 🤔

There has been no _internal_push_frame method in BaseInputTransport since #436.

@cyrilS-dev
Contributor

@joachimchauvet I encountered an error in the _audio_in_task_handler method of the LiveKitInputTransport class, using Deepgram as STT:

AttributeError: 'AsyncListenWebSocketClient' object has no attribute '_socket'
2024-09-25 15:49:15.863 | ERROR    | pipecat.processors.frame_processor:push_frame:203 - Uncaught exception in LiveKitInputTransport#0: 'AsyncListenWebSocketClient' object has no attribute '_socket'

The issue arises when await self.push_frame(pipecat_audio_frame) attempts to push frames before the Deepgram WebSocket connection is fully initialized.

@aconchillo
Contributor

@joachimchauvet I encountered an error in the _audio_in_task_handler method of the LiveKitInputTransport class, using Deepgram as STT:

AttributeError: 'AsyncListenWebSocketClient' object has no attribute '_socket'
2024-09-25 15:49:15.863 | ERROR    | pipecat.processors.frame_processor:push_frame:203 - Uncaught exception in LiveKitInputTransport#0: 'AsyncListenWebSocketClient' object has no attribute '_socket'

The issue arises when await self.push_frame(pipecat_audio_frame) attempts to push frames before the Deepgram WebSocket connection is fully initialized.

In theory, all processors should receive the StartFrame first (this is now guaranteed in main) and should make sure any needed connection is established then, so it's safe to push frames afterwards.

It's possible that this branch hasn't been rebased, so it doesn't have that StartFrame guarantee. Would it be possible for you to rebase locally and try that? I could be wrong and maybe it's something else...
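To illustrate the pattern described above with a simplified, self-contained sketch (hypothetical names, not pipecat's actual transport code), an input transport can gate its audio loop on having handled the StartFrame so nothing is pushed downstream before connections are ready:

# Simplified illustration of the StartFrame-first pattern; class, method and
# frame names here are hypothetical, not pipecat's real API.
import asyncio


class SketchInputTransport:
    def __init__(self):
        self._started = asyncio.Event()

    async def on_start_frame(self):
        # Establish any needed connections (e.g. the STT websocket) here,
        # then signal that it is safe to push frames downstream.
        self._started.set()

    async def audio_in_task(self, audio_queue: asyncio.Queue):
        # Block until the StartFrame has been handled before pushing anything.
        await self._started.wait()
        while True:
            frame = await audio_queue.get()
            await self.push_frame(frame)

    async def push_frame(self, frame):
        print("pushing", frame)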

@cyrilS-dev
Contributor

cyrilS-dev commented Sep 25, 2024

@joachimchauvet I encountered an error in the _audio_in_task_handler method of the LiveKitInputTransport class, using Deepgram as STT:

AttributeError: 'AsyncListenWebSocketClient' object has no attribute '_socket'
2024-09-25 15:49:15.863 | ERROR    | pipecat.processors.frame_processor:push_frame:203 - Uncaught exception in LiveKitInputTransport#0: 'AsyncListenWebSocketClient' object has no attribute '_socket'

The issue arises when await self.push_frame(pipecat_audio_frame) attempts to push frames before the Deepgram WebSocket connection is fully initialized.

In theory, all processors should receive the StartFrame first (this is now guaranteed in main) and should make sure any needed connection is established then, so it's safe to push frames afterwards.

It's possible that this branch hasn't been rebased, so it doesn't have that StartFrame guarantee. Would it be possible for you to rebase locally and try that? I could be wrong and maybe it's something else...

You're absolutely right @aconchillo. Thanks for pointing that out!

Now, regarding this part:

async def send_metrics(self, frame: MetricsFrame):
    metrics = {}
    if frame.ttfb:
        metrics["ttfb"] = frame.ttfb
    if frame.processing:
        metrics["processing"] = frame.processing
    if hasattr(frame, "tokens"):
        metrics["tokens"] = frame.tokens
    if hasattr(frame, "characters"):
        metrics["characters"] = frame.characters

    message = LiveKitTransportMessageFrame(
        message={"type": "pipecat-metrics", "metrics": metrics}
    )
    await self._client.send_data(str(message.message).encode())

This needs to be updated due to changes introduced in #474

@cyrilS-dev
Contributor

Everything is running perfectly. Just one detail: LiveKitInputTransport stops when the room is disconnected, but LiveKitOutputTransport doesn't, which triggers an error from the TTS service after a while: no close frame received or sent.

@aconchillo
Contributor

Everything is running perfectly. Just one detail: LiveKitInputTransport stops when the room is disconnected, but LiveKitOutputTransport doesn't, which triggers an error from the TTS service after a while: no close frame received or sent.

I'll wait on @joachimchauvet to answer this before merging.

@aconchillo
Contributor

I'll go ahead and merge this one. If there's any error you can provide a fix later. Thank you!!!

@aconchillo aconchillo merged commit 4501dca into pipecat-ai:main Sep 27, 2024
3 checks passed
@joachimchauvet
Contributor Author

Everything is running perfectly. Just one detail: LiveKitInputTransport stops when the room is disconnected, but LiveKitOutputTransport doesn't, which triggers an error from the TTS service after a while: no close frame received or sent.

I don't have this error with my agents. I have an idea where that might come from, but I was not able to reproduce the no close frame received or sent error directly. @cyrilS-dev could you share a pipeline that triggers this error?

@cyrilS-dev
Contributor

cyrilS-dev commented Sep 28, 2024

@joachimchauvet The execution is blocked when calling await super().stop(frame) in the stop method of the LiveKitOutputTransport class.

The stop method in the BaseOutputTransport class is causing the execution to hang due to:

if self._sink_task:
    await self._sink_task
if self._sink_clock_task:
    await self._sink_clock_task

As a result, the await self._client.disconnect() call is never reached.

@aconchillo do you have any insights on this? Thanks
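For reference, one possible mitigation shape (a generic sketch with placeholder names, not pipecat's API and not the project's actual fix) is to bound the wait on the sink task so a task that never finishes after a room disconnect cannot block stop(), and then disconnect the client:

# Generic sketch: placeholder names, not pipecat's actual API.
import asyncio


async def stop_output(sink_task: asyncio.Task, disconnect, timeout: float = 5.0):
    try:
        # wait_for cancels the task if it does not finish within the timeout.
        await asyncio.wait_for(sink_task, timeout)
    except asyncio.TimeoutError:
        pass
    # The client disconnect is now always reached.
    await disconnect()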
