add background_noise service and example #536
Conversation
@@ -0,0 +1,120 @@

```python
import asyncio
```
Can we move the file to `processors/audio`, please? Also, instead of `background_noise.py` it could be called `background_audio.py`, since "noise" really has a negative connotation.
```python
class BackgroundNoiseEffect(FrameProcessor):
    def __init__(self, websocket_client, stream_sid, music_path):
```
I don't think we should pass a `websocket` or `stream_sid`. This class should ideally be generic so it works for every use case. I'll give some suggestions below.
Instead of `music_path`, should we pass an `AudioSegment`?
```python
if not self.emptied:
    self.emptied = True
    buffer_clear_message = {"event": "clear", "streamSid": self.stream_sid}
    await self.websocket_client.send_text(json.dumps(buffer_clear_message))
```
Here we could use the new processor event handlers. For example:

```python
self.call_event_handler("on_empty_audio")
```

Then, from the pipeline you would do:

```python
transport = FastAPIWebsocketTransport()
background_audio = BackgroundAudio("your_path_to_audio_in_format_pcm16000")

@background_audio.event_handler("on_empty_audio")
async def on_empty_audio(background_audio):
    transport.clear_something()
```
```python
while True:
    await sleep(0.005)
    if self._stop:
        break
```
We can use task cancelling instead of having a `_stop` variable.
```python
frame.audio = self._combine_with_music(frame)
```

```python
if isinstance(frame, EndFrame):
    self._stop = True
```
Here we could cancel the task. You can find many examples in other files.
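A minimal sketch of the task-cancellation pattern being suggested, with hypothetical names (`audio_loop`, `run_pipeline`) standing in for the real pipeline pieces — cancelling the task replaces polling a `_stop` flag:

```python
import asyncio

async def audio_loop(ticks):
    try:
        while True:
            await asyncio.sleep(0.005)
            ticks.append(1)  # stand-in for pushing the next background-audio chunk
    except asyncio.CancelledError:
        pass  # cancellation is the normal shutdown path, not an error

async def run_pipeline():
    ticks = []
    task = asyncio.create_task(audio_loop(ticks))
    await asyncio.sleep(0.03)  # the pipeline runs for a bit
    task.cancel()              # e.g. when an EndFrame arrives
    await asyncio.gather(task, return_exceptions=True)
    return ticks

ticks = asyncio.run(run_pipeline())
```

The loop no longer checks a boolean every iteration; the `CancelledError` raised inside `asyncio.sleep()` unwinds it immediately.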
```python
import loguru

from pipecat.processors.frame_processor import FrameProcessor, FrameDirection
from pydub import AudioSegment
```
This is introducing a new dependency, `pydub`. We should add that to `pyproject.toml` and make sure we advise the user to do `pip install pipecat[pydub]`.
```python
tma_in = LLMUserResponseAggregator(messages)
tma_out = LLMAssistantResponseAggregator(messages)
background_noise = BackgroundNoiseEffect(websocket_client, stream_sid, "your_path_to_audio_in_format_pcm16000")
```
It seems this could actually be any file format supported by pydub (mp3, ogg, mp4...).
```python
    Generator that yields chunks of background music audio.
    """
    music_audio = AudioSegment.from_wav(self.music_path)
    music_audio = music_audio - 15
```
What is `- 15`?
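For reference, pydub overloads subtraction as a gain change in decibels, so `music_audio - 15` lowers the track's volume by 15 dB. A quick sketch of the corresponding amplitude math (the helper name is illustrative):

```python
def db_to_amplitude_ratio(db: float) -> float:
    """Convert a decibel gain change to a linear amplitude ratio."""
    return 10 ** (db / 20)

# -15 dB scales sample amplitude to roughly 18% — quiet enough to sit
# under speech without disappearing.
ratio = db_to_amplitude_ratio(-15)
```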
```python
output_buffer = BytesIO()
try:
    music_chunk.export(output_buffer, format="raw")
    frame = OutputAudioRawFrame(audio=output_buffer.getvalue(), sample_rate=16000, num_channels=1)
```
Sample rate should probably be configurable from the constructor. And, to make it easy for everyone, we should probably resample. This way we can pass any audio file.
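In practice pydub's `set_frame_rate()` could handle the resampling. Purely to illustrate what resampling does to the raw PCM the processor handles, here is a hypothetical nearest-neighbor sketch (`resample_pcm16` is not part of any library):

```python
import struct

def resample_pcm16(data: bytes, src_rate: int, dst_rate: int) -> bytes:
    """Nearest-neighbor resample of mono 16-bit little-endian PCM."""
    n_in = len(data) // 2
    samples = struct.unpack(f"<{n_in}h", data[: n_in * 2])
    n_out = int(n_in * dst_rate / src_rate)
    out = [
        samples[min(i * src_rate // dst_rate, n_in - 1)]  # pick nearest source sample
        for i in range(n_out)
    ]
    return struct.pack(f"<{len(out)}h", *out)
```

Real code would want proper interpolation and anti-aliasing, which is exactly why delegating to the audio library is the better choice.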
```python
    Combines small raw audio segments from the frame with chunks of a music file.
    """
    small_audio_bytes = frame.audio
    music_audio = AudioSegment.from_wav(self.music_path)
```
We don't need to load the audio segment here since we could now pass it in the constructor. This way we can use `from_mp3`, for example.
```python
small_audio = AudioSegment(
    data=small_audio_bytes,
    sample_width=2,
    frame_rate=16000,
```
Sample rate should come from the constructor.
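Pulling the review suggestions together, the constructor might look something like this. This is a hypothetical sketch (`BackgroundAudio` and its members are illustrative, not the final API): it takes a preloaded audio segment and a sample rate instead of a file path, websocket, or `stream_sid`, and exposes event handlers so transport-specific actions stay out of the processor:

```python
class BackgroundAudio:
    """Sketch of a generic background-audio processor."""

    def __init__(self, audio, sample_rate: int = 16000, volume_db: float = -15.0):
        self._audio = audio              # e.g. a pydub AudioSegment, already loaded
        self._sample_rate = sample_rate  # no longer hard-coded to 16000
        self._volume_db = volume_db      # gain applied to the background track
        self._handlers: dict = {}

    def event_handler(self, name: str):
        """Decorator registering a handler, e.g. for "on_empty_audio"."""
        def wrapper(fn):
            self._handlers.setdefault(name, []).append(fn)
            return fn
        return wrapper

    async def _call_event_handler(self, name: str, *args):
        # Handlers receive the processor itself, so the pipeline can react
        # (e.g. clear a transport buffer) without the processor knowing how.
        for fn in self._handlers.get(name, []):
            await fn(self, *args)
```

With this shape, the Twilio-specific buffer clearing from the original diff becomes just one possible `on_empty_audio` handler.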
Thank you @Viking5274!!! And apologies for the delay. I added some comments to try to make this super cool feature a bit more generic.
This is superseded by #682. I ended up doing it differently and using other libraries, but thank you @Viking5274 for working on this in the first place!
Btw, I kept your original PR to appreciate the work you did!
Added background_noise service and example. The example currently works only with Twilio (with delays), and you need to add pydub==0.25.1 to requirements.
@aconchillo