wip proposal: initial commit
aconchillo committed May 12, 2024
1 parent 712a889 commit 588137d
Showing 130 changed files with 5,058 additions and 3,804 deletions.
4 changes: 2 additions & 2 deletions .github/workflows/publish.yaml
Original file line number Diff line number Diff line change
@@ -46,7 +46,7 @@ jobs:
needs: [ build ]
environment:
name: pypi
url: https://pypi.org/p/dailyai
url: https://pypi.org/p/pipecat
permissions:
id-token: write
steps:
@@ -67,7 +67,7 @@ jobs:
needs: [ build ]
environment:
name: testpypi
url: https://pypi.org/p/dailyai
url: https://pypi.org/p/pipecat
permissions:
id-token: write
steps:
2 changes: 1 addition & 1 deletion .github/workflows/publish_test.yaml
@@ -46,7 +46,7 @@ jobs:
needs: [ build ]
environment:
name: testpypi
url: https://pypi.org/p/dailyai
url: https://pypi.org/p/pipecat
permissions:
id-token: write
steps:
28 changes: 12 additions & 16 deletions README.md
@@ -1,18 +1,14 @@
[![PyPI](https://img.shields.io/pypi/v/dailyai)](https://pypi.org/project/dailyai)
[![PyPI](https://img.shields.io/pypi/v/pipecat)](https://pypi.org/project/pipecat)

> [!IMPORTANT]
> Hackathon attendees - getting started doc can be found [here](https://dailyco.notion.site/Daily-AI-ff356d3a799649e583fa91c1ccfe0d87)

# dailyai — an open source framework for real-time, multi-modal, conversational AI applications
# Pipecat — an open source framework for voice (and multimodal) assistants

Build things like this:

[![AI-powered voice patient intake for healthcare](https://img.youtube.com/vi/lDevgsp9vn0/0.jpg)](https://www.youtube.com/watch?v=lDevgsp9vn0)

[ [dailyai starter kits repository](https://github.com/daily-co/dailyai-examples) ]
[ [pipecat starter kits repository](https://github.com/daily-co/pipecat-examples) ]

**`dailyai` started as a toolkit for implementing generative AI voice bots.** Things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, and snarky social companions.
**`Pipecat` started as a toolkit for implementing generative AI voice bots.** Things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, and snarky social companions.

In 2023 a _lot_ of us got excited about the possibility of having open-ended conversations with LLMs. It became clear pretty quickly that we were all solving the same [low-level problems](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/):

@@ -24,7 +20,7 @@ In 2023 a _lot_ of us got excited about the possibility of having open-ended con

As our applications expanded to include additional things like image generation, function calling, and vision models, we started to think about what a complete framework for these kinds of apps could look like.

Today, `dailyai` is:
Today, `pipecat` is:

1. a set of code building blocks for interacting with generative AI services and creating low-latency, interruptible data pipelines that use multiple services
2. transport services that move audio, video, and events across the Internet
@@ -49,19 +45,19 @@ Currently implemented services:
- ElevenLabs
- Transport
- Daily
- Local (in progress, intended as a quick start example service)
- Local
- Vision
- Moondream

If you'd like to [implement a service](https://github.com/daily-co/daily-ai-sdk/tree/main/src/dailyai/services), we welcome PRs! Our goal is to support lots of services in all of the above categories, plus new categories (like real-time video) as they emerge.
If you'd like to [implement a service](https://github.com/daily-co/pipecat/tree/main/src/pipecat/services), we welcome PRs! Our goal is to support lots of services in all of the above categories, plus new categories (like real-time video) as they emerge.

## Getting started

Today, the easiest way to get started with `dailyai` is to use [Daily](https://www.daily.co/) as your transport service. This toolkit started life as an internal SDK at Daily and millions of minutes of AI conversation have been served using it and its earlier prototype incarnations. (The [transport base class](https://github.com/daily-co/daily-ai-sdk/blob/main/src/dailyai/transports/abstract_transport.py) is easy to extend, though, so feel free to submit PRs if you'd like to implement another transport service.)
Today, the easiest way to get started with `pipecat` is to use [Daily](https://www.daily.co/) as your transport service. This toolkit started life as an internal SDK at Daily and millions of minutes of AI conversation have been served using it and its earlier prototype incarnations.

```
# install the module
pip install dailyai
pip install pipecat
# set up an .env file with API keys
cp dot-env.template .env
```
@@ -71,7 +67,7 @@ By default, in order to minimize dependencies, only the basic framework function
dependencies that you can install with:

```
pip install "dailyai[option,...]"
pip install "pipecat[option,...]"
```

Your project may or may not need these, so they're made available as optional requirements. Here is a list:
@@ -83,8 +79,8 @@ Your project may or may not need these, so they're made available as optional re

There are two directories of examples:

- [foundational](https://github.com/daily-co/daily-ai-sdk/tree/main/examples/foundational) — demos that build on each other, introducing one or two concepts at a time
- [starter apps](https://github.com/daily-co/daily-ai-sdk/tree/main/examples/starter-apps) — complete applications that you can use as starting points for development
- [foundational](https://github.com/daily-co/pipecat/tree/main/examples/foundational) — examples that build on each other, introducing one or two concepts at a time
- [starter apps](https://github.com/daily-co/pipecat/tree/main/examples/starter-apps) — complete applications that you can use as starting points for development

Before running the examples you need to install the dependencies (which will install all the dependencies to run all of the examples):

12 changes: 6 additions & 6 deletions dev-requirements.txt
@@ -1,6 +1,6 @@
autopep8==2.0.4
build==1.0.3
pip-tools==7.4.1
pytest==8.1.1
setuptools==69.2.0
setuptools_scm==8.0.4
autopep8~=2.1.0
build~=1.2.1
pip-tools~=7.4.1
pytest~=8.2.0
setuptools~=69.5.1
setuptools_scm~=8.1.0
6 changes: 3 additions & 3 deletions docs/README.md
@@ -1,16 +1,16 @@
# Daily AI SDK Docs
# Pipecat Docs

## [Architecture Overview](architecture.md)

Learn about the thinking behind the SDK's design.
Learn about the thinking behind the framework's design.

## [A Frame's Progress](frame-progress.md)

See how a Frame is processed through a Transport, a Pipeline, and a series of Frame Processors.

## [Example Code](examples/)

The repo includes several example apps in the `examples` directory. The docs explain how they work.
The repository includes several example apps in the `examples` directory. The docs explain how they work.

## [API Reference](api/)

6 changes: 3 additions & 3 deletions docs/architecture.md
@@ -1,4 +1,4 @@
# Daily AI SDK Architecture Guide
# Pipecat architecture guide

## Frames

@@ -10,8 +10,8 @@ Frame processors operate on frames. Every frame processor implements a `process_

## Pipelines

Pipelines are lists of frame processors that read from a source queue and send the processed frames to a sink queue. A very simple pipeline might chain an LLM frame processor to a text-to-speech frame processor, with a transport's send queue as its sink. Placing LLM message frames on the pipeline's source queue will cause the LLM's response to be spoken. See example #2 for an implementation of this.
Pipelines are lists of frame processors linked together. Frame processors can push frames upstream or downstream to their peers. A very simple pipeline might chain an LLM frame processor to a text-to-speech frame processor, with a transport as an output.

## Transports

Transports provide a receive queue, which is input from "the outside world", and a sink queue, which is data that will be sent "to the outside world". The `LocalTransportService` does this with the local camera, mic, display and speaker. The `DailyTransportService` does this with a WebRTC session joined to a Daily.co room.
Transports provide input and output frame processors to receive or send frames respectively. For example, the `DailyTransport` does this with a WebRTC session joined to a Daily.co room.
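The frame/processor/pipeline model the new architecture doc describes (processors linked together, each pushing frames downstream to its peer, with a transport output at the end) can be sketched as a toy implementation. All class and method names below are illustrative stand-ins for the concepts, not the actual pipecat API:

```python
# Toy sketch of the frame/processor/pipeline architecture described above.
# Names are illustrative only; they are not the real pipecat classes.
from dataclasses import dataclass


@dataclass
class Frame:
    data: str


class FrameProcessor:
    """Processors are linked in a chain; each pushes frames downstream."""

    def __init__(self):
        self._next = None

    def link(self, nxt):
        self._next = nxt

    def process_frame(self, frame):
        # Default behavior: pass the frame through unchanged.
        self.push_frame(frame)

    def push_frame(self, frame):
        if self._next is not None:
            self._next.process_frame(frame)


class UppercaseProcessor(FrameProcessor):
    """Stand-in for a service processor (e.g. TTS) that transforms frames."""

    def process_frame(self, frame):
        self.push_frame(Frame(frame.data.upper()))


class CollectOutput(FrameProcessor):
    """Stand-in for a transport output: collects the frames it receives."""

    def __init__(self):
        super().__init__()
        self.received = []

    def process_frame(self, frame):
        self.received.append(frame)


class Pipeline:
    """Links a list of processors together, first to last."""

    def __init__(self, processors):
        self.processors = processors
        for upstream, downstream in zip(processors, processors[1:]):
            upstream.link(downstream)

    def queue_frame(self, frame):
        self.processors[0].process_frame(frame)


out = CollectOutput()
pipeline = Pipeline([UppercaseProcessor(), out])
pipeline.queue_frame(Frame("hello there"))
print(out.received[0].data)  # HELLO THERE
```

This mirrors the shape of the real examples in this commit, where `Pipeline([tts, transport.output()])` chains a service processor into a transport's output processor.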
46 changes: 24 additions & 22 deletions examples/foundational/01-say-one-thing.py
@@ -1,53 +1,55 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import asyncio
import aiohttp
import logging
import os
from dailyai.pipeline.frames import EndFrame, TextFrame
from dailyai.pipeline.pipeline import Pipeline
import sys

from dailyai.transports.daily_transport import DailyTransport
from dailyai.services.elevenlabs_ai_service import ElevenLabsTTSService
from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.task import PipelineTask
from pipecat.pipeline.runner import PipelineRunner
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.services.daily import DailyParams, DailyTransport

from runner import configure

from loguru import logger

from dotenv import load_dotenv
load_dotenv(override=True)

logging.basicConfig(format=f"%(levelno)s %(asctime)s %(message)s")
logger = logging.getLogger("dailyai")
logger.setLevel(logging.DEBUG)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


async def main(room_url):
async with aiohttp.ClientSession() as session:
transport = DailyTransport(
room_url,
None,
"Say One Thing",
mic_enabled=True,
)
room_url, None, "Say One Thing", DailyParams(audio_out_enabled=True))

tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)

pipeline = Pipeline([tts])
runner = PipelineRunner()

task = PipelineTask(Pipeline([tts, transport.output()]))

# Register an event handler so we can play the audio when the
# participant joins.
@transport.event_handler("on_participant_joined")
async def on_participant_joined(transport, participant):
if participant["info"]["isLocal"]:
return

async def on_new_participant_joined(transport, participant):
participant_name = participant["info"]["userName"] or ''
await pipeline.queue_frames([TextFrame("Hello there, " + participant_name + "!"), EndFrame()])

await transport.run(pipeline)
del tts
await task.queue_frames([TextFrame(f"Hello there, {participant_name}!"), EndFrame()])

await runner.run(task)

if __name__ == "__main__":
(url, token) = configure()
53 changes: 53 additions & 0 deletions examples/foundational/01a-local-audio.py
@@ -0,0 +1,53 @@
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import asyncio
import aiohttp
import os
import sys

from pipecat.frames.frames import EndFrame, TextFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.transports.base_transport import TransportParams
from pipecat.transports.local.audio import AudioLocalTransport

from loguru import logger

from dotenv import load_dotenv
load_dotenv(override=True)

logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


async def main():
async with aiohttp.ClientSession() as session:
transport = AudioLocalTransport(TransportParams(audio_out_enabled=True))

tts = ElevenLabsTTSService(
aiohttp_session=session,
api_key=os.getenv("ELEVENLABS_API_KEY"),
voice_id=os.getenv("ELEVENLABS_VOICE_ID"),
)

pipeline = Pipeline([tts, transport.output()])

task = PipelineTask(pipeline)

async def say_something():
await asyncio.sleep(1)
await task.queue_frames([TextFrame("Hello there!"), EndFrame()])

runner = PipelineRunner()

await asyncio.gather(runner.run(task), say_something())


if __name__ == "__main__":
asyncio.run(main())
38 changes: 0 additions & 38 deletions examples/foundational/01a-local-transport.py

This file was deleted.

45 changes: 27 additions & 18 deletions examples/foundational/02-llm-say-one-thing.py
@@ -1,23 +1,31 @@
import asyncio
import os
import logging
#
# Copyright (c) 2024, Daily
#
# SPDX-License-Identifier: BSD 2-Clause License
#

import asyncio
import aiohttp
import os
import sys

from dailyai.pipeline.frames import EndFrame, LLMMessagesFrame
from dailyai.pipeline.pipeline import Pipeline
from dailyai.transports.daily_transport import DailyTransport
from dailyai.services.elevenlabs_ai_service import ElevenLabsTTSService
from dailyai.services.open_ai_services import OpenAILLMService
from pipecat.frames.frames import EndFrame, LLMMessagesFrame
from pipecat.pipeline.pipeline import Pipeline
from pipecat.pipeline.runner import PipelineRunner
from pipecat.pipeline.task import PipelineTask
from pipecat.services.elevenlabs import ElevenLabsTTSService
from pipecat.services.openai import OpenAILLMService
from pipecat.transports.services.daily import DailyParams, DailyTransport

from runner import configure

from loguru import logger

from dotenv import load_dotenv
load_dotenv(override=True)

logging.basicConfig(format=f"%(levelno)s %(asctime)s %(message)s")
logger = logging.getLogger("dailyai")
logger.setLevel(logging.DEBUG)
logger.remove(0)
logger.add(sys.stderr, level="DEBUG")


async def main(room_url):
@@ -26,8 +34,7 @@ async def main(room_url):
room_url,
None,
"Say One Thing From an LLM",
mic_enabled=True,
)
DailyParams(audio_out_enabled=True))

tts = ElevenLabsTTSService(
aiohttp_session=session,
@@ -45,13 +52,15 @@ async def main(room_url):
"content": "You are an LLM in a WebRTC session, and this is a 'hello world' demo. Say hello to the world.",
}]

pipeline = Pipeline([llm, tts])
runner = PipelineRunner()

task = PipelineTask(Pipeline([llm, tts, transport.output()]))

@transport.event_handler("on_first_other_participant_joined")
async def on_first_other_participant_joined(transport, participant):
await pipeline.queue_frames([LLMMessagesFrame(messages), EndFrame()])
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
await task.queue_frames([LLMMessagesFrame(messages), EndFrame()])

await transport.run(pipeline)
await runner.run(task)


if __name__ == "__main__":