diff --git a/README.md b/README.md
index 331d27b2d..168666aac 100644
--- a/README.md
+++ b/README.md
@@ -1,79 +1,123 @@
-[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai)
+
+
+
-# Pipecat — an open source framework for voice (and multimodal) assistants
+# Pipecat
+
+[![PyPI](https://img.shields.io/pypi/v/pipecat-ai)](https://pypi.org/project/pipecat-ai) [![Discord](https://img.shields.io/discord/1239284677165056021)](https://discord.gg/pipecat)
+
+`pipecat` is a framework for building voice (and multimodal) conversational agents: things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, and snarky social companions.
Build things like this:
[![AI-powered voice patient intake for healthcare](https://img.youtube.com/vi/lDevgsp9vn0/0.jpg)](https://www.youtube.com/watch?v=lDevgsp9vn0)
-[ [pipecat starter kits repository](https://github.com/daily-co/pipecat-examples) ]
+## Getting started with voice agents
+
+You can get started with Pipecat running on your local machine, then move your agent processes to the cloud when you’re ready. You can also add a telephone number, image output, video input, use different LLMs, and more.
+
+```shell
+# install the module
+pip install pipecat-ai
+
+# set up an .env file with API keys
+cp dot-env.template .env
+```
-**`Pipecat` started as a toolkit for implementing generative AI voice bots.** Things like personal coaches, meeting assistants, story-telling toys for kids, customer support bots, and snarky social companions.
+By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional dependencies that you can install with:
-In 2023 a _lot_ of us got excited about the possibility of having open-ended conversations with LLMs. It became clear pretty quickly that we were all solving the same [low-level problems](https://www.daily.co/blog/how-to-talk-to-an-llm-with-your-voice/):
+```shell
+pip install "pipecat-ai[option,...]"
+```
-- low-latency, reliable audio transport
-- echo cancellation
-- phrase endpointing (knowing when the bot should respond to human speech)
-- interruptibility
-- writing clean code to stream data through "pipelines" of speech-to-text, LLM inference, and text-to-speech models
+Your project may or may not need these, so they're made available as optional requirements. Here is a list:
-As our applications expanded to include additional things like image generation, function calling, and vision models, we started to think about what a complete framework for these kinds of apps could look like.
+- **AI services**: `anthropic`, `azure`, `fal`, `moondream`, `openai`, `playht`, `silero`, `whisper`
+- **Transports**: `daily`, `local`, `websocket`
-Today, `pipecat` is:
+## A simple voice agent running locally
-1. a set of code building blocks for interacting with generative AI services and creating low-latency, interruptible data pipelines that use multiple services
-2. transport services that moves audio, video, and events across the Internet
-3. implementations of specific generative AI services
+If you’re doing AI-related stuff, you probably have an OpenAI API key.
-Currently implemented services:
+To generate voice output, one service that’s easy to get started with is ElevenLabs. If you don’t already have an ElevenLabs developer account, you can sign up for one [here].
-- Speech-to-text
- - Deepgram
- - Whisper
-- LLMs
- - Azure
- - Fireworks
- - OpenAI
-- Image generation
- - Azure
- - Fal
- - OpenAI
-- Text-to-speech
- - Azure
- - Deepgram
- - ElevenLabs
-- Transport
- - Daily
- - Local
-- Vision
- - Moondream
+So let’s run a really simple agent that’s just a GPT-4 prompt, wired up to voice input and speaker output.
-If you'd like to [implement a service](<(https://github.com/daily-co/pipecat/tree/main/src/pipecat/services)>), we welcome PRs! Our goal is to support lots of services in all of the above categories, plus new categories (like real-time video) as they emerge.
+You can change the prompt in the code. The current prompt is “Tell me something interesting about the Roman Empire.”
-## Getting started
+`cd examples/getting-started` to run the following examples …
-Today, the easiest way to get started with `pipecat` is to use [Daily](https://www.daily.co/) as your transport service. This toolkit started life as an internal SDK at Daily and millions of minutes of AI conversation have been served using it and its earlier prototype incarnations.
+```shell
+# Talk to a local pipecat process with your voice. Specify GPT-4 as the LLM.
+export OPENAI_API_KEY=...
+export ELEVENLABS_API_KEY=...
+python ./local-mic.py | ./pipecat-pipes-gpt-4.py | ./local-speaker.py
```
-# install the module
-pip install pipecat
-# set up an .env file with API keys
-cp dot-env.template .env
-```
+## WebSockets instead of pipes
+
+To run your agent in the cloud, you can switch the Pipecat transport layer to use a WebSocket instead of Unix pipes.
-By default, in order to minimize dependencies, only the basic framework functionality is available. Some third-party AI services require additional
-dependencies that you can install with:
+```shell
+# Talk to a local pipecat process with your voice. Specify GPT-4 as the LLM.
+export OPENAI_API_KEY=...
+export ELEVENLABS_API_KEY=...
+python ./local-mic-and-speaker-wss.py wss://localhost:8088
```
-pip install "pipecat[option,...]"
+
+## WebRTC for production use
+
+WebSockets are fine for server-to-server communication or for initial development. But for production use, client-server audio should use a protocol designed for real-time media transport, such as WebRTC. (For an explanation of the difference between WebSockets and WebRTC, see [this post.])
+
+One way to get up and running quickly with WebRTC is to sign up for a Daily developer account. Daily gives you SDKs and global infrastructure for audio (and video) routing. Every account gets 10,000 audio/video/transcription minutes free each month.
+
+Sign up [here](https://dashboard.daily.co/u/signup) and [create a room](https://docs.daily.co/reference/rest-api/rooms) in the developer Dashboard. Then run the examples, this time connecting via WebRTC instead of a WebSocket.
+
+```shell
+# 1. Run the pipecat process. Provide your Daily API key and a Daily room
+export DAILY_API_KEY=...
+export OPENAI_API_KEY=...
+export ELEVENLABS_API_KEY=...
+python pipecat-daily-gpt-4.py --daily-room https://example.daily.co/pipecat
+
+# 2. Visit the Daily room link in any web browser to talk to the pipecat process.
+# You'll want to use a Daily SDK to embed the client-side code into your own
+# app. But visiting the room URL in a browser is a quick way to start building
+# agents because you can focus on just the agent code at first.
+open -a "Google Chrome" https://example.daily.co/pipecat
```
-Your project may or may not need these, so they're made available as optional requirements. Here is a list:
+## Deploy your agent to the cloud
+Now that you’ve decoupled client and server, and have a Pipecat process that can run anywhere you can run Python, you can deploy this example agent to the cloud.
+
+`TBC`
+
+## Taking it further
+
+### Add a telephone number
+Daily supports telephone connections in addition to WebRTC streams. You can add a telephone number to your Daily room with the following REST API call. Once you’ve done that, you can call your agent on the phone.
+
+You’ll need to add a credit card to your Daily account to enable telephone numbers.
+
+`TBC`
+
+
+### Add image output
+
+`TBC`
+
+### Add video output
+
+
+`TBC`
-- **AI services**: `anthropic`, `azure`, `fal`, `moondream`, `openai`, `playht`, `silero`, `whisper`
-- **Transports**: `daily`, `local`, `websocket`
## Code examples
diff --git a/pipecat.png b/pipecat.png
new file mode 100644
index 000000000..912360f2c
Binary files /dev/null and b/pipecat.png differ
diff --git a/pyproject.toml b/pyproject.toml
index 49ac82492..33d61424e 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -33,12 +33,12 @@ Website = "https://pipecat.ai"
[project.optional-dependencies]
anthropic = [ "anthropic~=0.25.7" ]
-audio = [ "pyaudio~=0.2.0" ]
azure = [ "azure-cognitiveservices-speech~=1.37.0" ]
daily = [ "daily-python~=0.7.4" ]
examples = [ "python-dotenv~=1.0.0", "flask~=3.0.3", "flask_cors~=4.0.1" ]
fal = [ "fal-client~=0.4.0" ]
fireworks = [ "openai~=1.26.0" ]
+local = [ "pyaudio~=0.2.0" ]
moondream = [ "einops~=0.8.0", "timm~=0.9.16", "transformers~=4.40.2" ]
openai = [ "openai~=1.26.0" ]
playht = [ "pyht~=0.0.28" ]
diff --git a/src/pipecat/services/anthropic.py b/src/pipecat/services/anthropic.py
index 8632fdaf1..25620f783 100644
--- a/src/pipecat/services/anthropic.py
+++ b/src/pipecat/services/anthropic.py
@@ -15,7 +15,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use Anthropic, you need to `pip install pipecat[anthropic]`. Also, set `ANTHROPIC_API_KEY` environment variable.")
+ "In order to use Anthropic, you need to `pip install pipecat-ai[anthropic]`. Also, set `ANTHROPIC_API_KEY` environment variable.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/services/azure.py b/src/pipecat/services/azure.py
index d56058821..596e6726d 100644
--- a/src/pipecat/services/azure.py
+++ b/src/pipecat/services/azure.py
@@ -21,7 +21,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use Azure TTS, you need to `pip install pipecat[azure]`. Also, set `AZURE_SPEECH_API_KEY` and `AZURE_SPEECH_REGION` environment variables.")
+ "In order to use Azure TTS, you need to `pip install pipecat-ai[azure]`. Also, set `AZURE_SPEECH_API_KEY` and `AZURE_SPEECH_REGION` environment variables.")
raise Exception(f"Missing module: {e}")
from pipecat.services.openai_api_llm_service import BaseOpenAILLMService
diff --git a/src/pipecat/services/fal.py b/src/pipecat/services/fal.py
index 1049b4428..ca58f0337 100644
--- a/src/pipecat/services/fal.py
+++ b/src/pipecat/services/fal.py
@@ -23,7 +23,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use Fal, you need to `pip install pipecat[fal]`. Also, set `FAL_KEY` environment variable.")
+ "In order to use Fal, you need to `pip install pipecat-ai[fal]`. Also, set `FAL_KEY` environment variable.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/services/fireworks.py b/src/pipecat/services/fireworks.py
index 402384d0d..6d2d44e6c 100644
--- a/src/pipecat/services/fireworks.py
+++ b/src/pipecat/services/fireworks.py
@@ -13,7 +13,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use Fireworks, you need to `pip install pipecat[fireworks]`. Also, set the `FIREWORKS_API_KEY` environment variable.")
+ "In order to use Fireworks, you need to `pip install pipecat-ai[fireworks]`. Also, set the `FIREWORKS_API_KEY` environment variable.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/services/moondream.py b/src/pipecat/services/moondream.py
index f74ba828b..e069c98ed 100644
--- a/src/pipecat/services/moondream.py
+++ b/src/pipecat/services/moondream.py
@@ -19,7 +19,7 @@
from transformers import AutoModelForCausalLM, AutoTokenizer
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
- logger.error("In order to use Moondream, you need to `pip install pipecat[moondream]`.")
+ logger.error("In order to use Moondream, you need to `pip install pipecat-ai[moondream]`.")
raise Exception(f"Missing module(s): {e}")
diff --git a/src/pipecat/services/openai.py b/src/pipecat/services/openai.py
index b15d7950b..50b8b1478 100644
--- a/src/pipecat/services/openai.py
+++ b/src/pipecat/services/openai.py
@@ -32,7 +32,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use OpenAI, you need to `pip install pipecat[openai]`. Also, set `OPENAI_API_KEY` environment variable.")
+ "In order to use OpenAI, you need to `pip install pipecat-ai[openai]`. Also, set `OPENAI_API_KEY` environment variable.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/services/playht.py b/src/pipecat/services/playht.py
index 69c7bac9d..b2aa4e198 100644
--- a/src/pipecat/services/playht.py
+++ b/src/pipecat/services/playht.py
@@ -19,7 +19,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use PlayHT, you need to `pip install pipecat[playht]`. Also, set `PLAY_HT_USER_ID` and `PLAY_HT_API_KEY` environment variables.")
+ "In order to use PlayHT, you need to `pip install pipecat-ai[playht]`. Also, set `PLAY_HT_USER_ID` and `PLAY_HT_API_KEY` environment variables.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/services/whisper.py b/src/pipecat/services/whisper.py
index 768e689c8..e0d14e903 100644
--- a/src/pipecat/services/whisper.py
+++ b/src/pipecat/services/whisper.py
@@ -22,7 +22,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use Whisper, you need to `pip install pipecat[whisper]`.")
+ "In order to use Whisper, you need to `pip install pipecat-ai[whisper]`.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/transports/local/audio.py b/src/pipecat/transports/local/audio.py
index 32266444f..c0038d250 100644
--- a/src/pipecat/transports/local/audio.py
+++ b/src/pipecat/transports/local/audio.py
@@ -18,7 +18,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use local audio, you need to `pip install pipecat[audio]`. On MacOS, you also need to `brew install portaudio`.")
+ "In order to use local audio, you need to `pip install pipecat-ai[local]`. On MacOS, you also need to `brew install portaudio`.")
raise Exception(f"Missing module: {e}")
diff --git a/src/pipecat/transports/local/tk.py b/src/pipecat/transports/local/tk.py
index 6a05c9a63..3d5ea9650 100644
--- a/src/pipecat/transports/local/tk.py
+++ b/src/pipecat/transports/local/tk.py
@@ -22,7 +22,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
logger.error(
- "In order to use local audio, you need to `pip install pipecat[audio]`. On MacOS, you also need to `brew install portaudio`.")
+ "In order to use local audio, you need to `pip install pipecat-ai[local]`. On MacOS, you also need to `brew install portaudio`.")
raise Exception(f"Missing module: {e}")
try:
diff --git a/src/pipecat/transports/services/daily.py b/src/pipecat/transports/services/daily.py
index 84d690569..0343c53d8 100644
--- a/src/pipecat/transports/services/daily.py
+++ b/src/pipecat/transports/services/daily.py
@@ -44,7 +44,8 @@
from daily import (EventHandler, CallClient, Daily)
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
- logger.error("In order to use the Daily transport, you need to `pip install pipecat[daily]`.")
+ logger.error(
+ "In order to use the Daily transport, you need to `pip install pipecat-ai[daily]`.")
raise Exception(f"Missing module: {e}")
VAD_RESET_PERIOD_MS = 2000
diff --git a/src/pipecat/vad/silero.py b/src/pipecat/vad/silero.py
index a9e5aa0ed..f2438b085 100644
--- a/src/pipecat/vad/silero.py
+++ b/src/pipecat/vad/silero.py
@@ -22,7 +22,7 @@
except ModuleNotFoundError as e:
logger.error(f"Exception: {e}")
- logger.error("In order to use Silero VAD, you need to `pip install pipecat[silero]`.")
+ logger.error("In order to use Silero VAD, you need to `pip install pipecat-ai[silero]`.")
raise Exception(f"Missing module(s): {e}")