Merge pull request #68 from pipecat-ai/mb/system-messages-rework

Update FlowManager to support role and task messages
pipecat-ai · Dec 20, 2024 · a00b0b8
2 parents 81387b6 + b6e856e

Showing 30 changed files with 920 additions and 652 deletions.
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,16 +7,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
-### Added
+### Changed
 
-- New `initial_system_message` field in `FlowConfig`, which allows setting a
-  global system message for static flows.
+- Nodes now have two message types that separate the role or persona of the
+  bot from the task it needs to accomplish. The message types are:
 
-### Changed
+  - `role_messages`, which defines the personality or role of the bot
+  - `task_messages`, which defines the task to be completed for a given node
+
+- `role_messages` can be defined for the initial node and then inherited by
+  subsequent nodes. You can treat this as an LLM "system" message.
 
 - Simplified FlowManager initialization by removing the need for manual context
-  setup in static flows.
-- Updated static examples to use the updated API.
+  setup in both static and dynamic flows. Now you only need to create a
+  `FlowManager` and initialize it to start the flow.
+- All examples have been updated to align with the API changes.
 
 ## [0.0.9] - 2024-12-08
 

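As a rough illustration of the `role_messages` inheritance described in the changelog entry above, the sketch below models the LLM context as a running list: the initial node contributes its role and task messages, and later nodes append only task messages, so the role message stays in effect. The helper name `append_node_messages` and the second node's text are hypothetical, and this is an assumption about the behavior, not pipecat-flows' actual implementation.

```python
from typing import Dict, List

Message = Dict[str, str]
Node = Dict[str, List[Message]]


def append_node_messages(context: List[Message], node: Node) -> List[Message]:
    """Return a new context with the node's role (if any) and task messages appended.

    Hypothetical helper for illustration only; not part of pipecat-flows.
    """
    return [
        *context,
        *node.get("role_messages", []),
        *node.get("task_messages", []),
    ]


# The initial node defines both the role and the first task.
initial_node: Node = {
    "role_messages": [{"role": "system", "content": "You are a helpful assistant."}],
    "task_messages": [{"role": "system", "content": "Ask the user for their age."}],
}

# A later node defines only its task; the role message is already in the context.
next_node: Node = {
    "task_messages": [{"role": "system", "content": "Thank the user."}],
}

context = append_node_messages([], initial_node)
context = append_node_messages(context, next_node)
```

After both calls, `context` holds three messages: the role message first, then the two task messages in order.
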
diff --git a/README.md b/README.md
@@ -38,26 +38,18 @@ pip install "pipecat-ai[daily,google,deepgram,cartesia]"     # For Google
 
 ## Quick Start
 
-Here's a basic example of setting up a conversation flow:
+Here's a basic example of setting up a static conversation flow:
 
 ```python
 from pipecat_flows import FlowManager
 
 # Initialize flow manager with static configuration
 flow_manager = FlowManager(task, llm, tts, flow_config=flow_config)
 
-# Or with dynamic flow handling
-flow_manager = FlowManager(
-    task,
-    llm,
-    tts,
-    transition_callback=handle_transitions
-)
-
 @transport.event_handler("on_first_participant_joined")
 async def on_first_participant_joined(transport, participant):
     await transport.capture_participant_transcription(participant["id"])
-    await flow_manager.initialize(messages)
+    await flow_manager.initialize()
     await task.queue_frames([context_aggregator.user().get_context_frame()])
 ```
 
@@ -71,17 +63,32 @@ Each conversation flow consists of nodes that define the conversation structure.
 
 #### Messages
 
-Messages set the context for the LLM at each state:
+Nodes use two types of messages to control the conversation:
+
+1. **Role Messages**: Define the bot's personality or role (optional)
 
 ```python
-"messages": [
+"role_messages": [
     {
         "role": "system",
-        "content": "You are handling pizza orders. Ask for size selection."
+        "content": "You are a friendly pizza ordering assistant. Keep responses casual and upbeat."
     }
 ]
 ```
 
+2. **Task Messages**: Define what the bot should do in the current node
+
+```python
+"task_messages": [
+    {
+        "role": "system",
+        "content": "Ask the customer which pizza size they'd like: small, medium, or large."
+    }
+]
+```
+
+Role messages are typically defined in your initial node and inherited by subsequent nodes, while task messages are specific to each node's purpose.
+
 #### Functions
 
 Functions come in two types:
@@ -101,7 +108,6 @@ Functions come in two types:
                 "size": {"type": "string", "enum": ["small", "medium", "large"]}
             }
         },
-        "transition_to": "next_node"  # Optional: Specify next node
     }
 }
 ```
@@ -113,6 +119,7 @@ Functions come in two types:
     "type": "function",
     "function": {
         "name": "next_step",
+        "handler": select_size_handler, # Optional handler
         "description": "Move to next state",
         "parameters": {"type": "object", "properties": {}},
         "transition_to": "target_node"  # Required: Specify target node
@@ -129,17 +136,27 @@ Functions can:
 
 #### Actions
 
-Actions execute during state transitions:
+There are two types of actions available:
+
+- `pre_actions`: Run before the LLM inference. For long function calls, you can use a pre_action for the TTS to say something, like "Hold on a moment..."
+- `post_actions`: Run after the LLM inference. This is handy for actions like ending or transferring a call.
 
 ```python
 "pre_actions": [
     {
         "type": "tts_say",
         "text": "Processing your order..."
     }
+],
+"post_actions": [
+    {
+        "type": "end_conversation"
+    }
 ]
 ```
 
+Learn more about built-in actions and defining your own action in the docs.
+
 #### Provider-Specific Formats
 
 Pipecat Flows automatically handles format differences between LLM providers:
@@ -189,15 +206,15 @@ The FlowManager handles both static and dynamic flows through a unified interfac
 # Define flow configuration upfront
 flow_config = {
     "initial_node": "greeting",
-    "initial_system_message": [
-        {
-            "role": "system",
-            "content": "You are a helpful assistant. Your responses will be converted to audio."
-        }
-    ],
     "nodes": {
         "greeting": {
-            "messages": [
+            "role_messages": [
+                {
+                    "role": "system",
+                    "content": "You are a helpful assistant. Your responses will be converted to audio."
+                }
+            ],
+            "task_messages": [
                 {
                     "role": "system",
                     "content": "Start by greeting the user and asking for their name."
@@ -225,16 +242,49 @@ await flow_manager.initialize()
 #### Dynamic Flows
 
 ```python
-# Define transition handling
-async def handle_transitions(function_name: str, args: Dict, flow_manager):
-    if function_name == "collect_age":
-        await flow_manager.set_node("next_step", create_next_node())
-
-system_message = "You are an assistant."
+def create_initial_node() -> NodeConfig:
+    return {
+        "role_messages": [
+            {
+                "role": "system",
+                "content": "You are a helpful assistant."
+            }
+        ],
+        "task_messages": [
+            {
+                "role": "system",
+                "content": "Ask the user for their age."
+            }
+        ],
+        "functions": [
+            {
+                "type": "function",
+                "function": {
+                    "name": "collect_age",
+                    "handler": collect_age,
+                    "description": "Record user's age",
+                    "parameters": {
+                        "type": "object",
+                        "properties": {
+                            "age": {"type": "integer"}
+                        },
+                        "required": ["age"]
+                    }
+                }
+            }
+        ]
+    }
 
 # Initialize with transition callback
 flow_manager = FlowManager(task, llm, tts, transition_callback=handle_transitions)
-await flow_manager.initialize(system_message)
+await flow_manager.initialize()
+
+@transport.event_handler("on_first_participant_joined")
+async def on_first_participant_joined(transport, participant):
+    await transport.capture_participant_transcription(participant["id"])
+    await flow_manager.initialize()
+    await flow_manager.set_node("initial", create_initial_node())
+    await task.queue_frames([context_aggregator.user().get_context_frame()])
 ```
 
 ## Examples

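For anyone updating an existing static flow, the README changes above amount to two renames: the top-level `initial_system_message` moves into the initial node as `role_messages`, and each node's `messages` becomes `task_messages`. A minimal migration sketch follows, assuming the configuration is a plain dictionary shaped like the README examples. The function `migrate_flow_config` is hypothetical and is not provided by pipecat-flows.

```python
import copy
from typing import Any, Dict


def migrate_flow_config(old_config: Dict[str, Any]) -> Dict[str, Any]:
    """Convert a pre-change static flow config to the role/task message layout.

    Hypothetical helper for illustration only; not part of pipecat-flows.
    """
    new_config = copy.deepcopy(old_config)  # leave the caller's config untouched

    # The global system message becomes the initial node's role messages.
    role_messages = new_config.pop("initial_system_message", None)

    # Each node's "messages" list becomes its "task_messages".
    for node in new_config["nodes"].values():
        if "messages" in node:
            node["task_messages"] = node.pop("messages")

    if role_messages:
        initial = new_config["nodes"][new_config["initial_node"]]
        initial.setdefault("role_messages", role_messages)

    return new_config


old_config = {
    "initial_node": "greeting",
    "initial_system_message": [
        {"role": "system", "content": "You are a helpful assistant."}
    ],
    "nodes": {
        "greeting": {
            "messages": [
                {
                    "role": "system",
                    "content": "Start by greeting the user and asking for their name.",
                }
            ]
        }
    },
}

new_config = migrate_flow_config(old_config)
```

The sketch copies the input first so the original configuration can still be compared against the migrated one.
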
diff --git a/editor/examples/food_ordering.json b/editor/examples/food_ordering.json
@@ -2,7 +2,13 @@
   "initial_node": "start",
   "nodes": {
     "start": {
-      "messages": [
+      "role_messages": [
+        {
+          "role": "system",
+          "content": "You are an order-taking assistant. You must ALWAYS use the available functions to progress the conversation. This is a phone conversation and your responses will be converted to audio. Keep the conversation friendly, casual, and polite. Avoid outputting special characters and emojis."
+        }
+      ],
+      "task_messages": [
         {
           "role": "system",
           "content": "For this step, ask the user if they want pizza or sushi, and wait for them to use a function to choose. Start off by greeting them. Be friendly and casual; you're taking an order for food over the phone."
@@ -36,7 +42,7 @@
       ]
     },
     "choose_pizza": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "You are handling a pizza order. Use the available functions:\n\n- Use select_pizza_order when the user specifies both size AND type\n\n- Use confirm_order when the user confirms they are satisfied with their selection\n\nPricing:\n\n- Small: $10\n\n- Medium: $15\n\n- Large: $20\n\nAfter selection, confirm both the size and type, state the price, and ask if they want to confirm their order. Remember to be friendly and casual."
@@ -82,7 +88,7 @@
       ]
     },
     "choose_sushi": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "You are handling a sushi order. Use the available functions:\n\n- Use select_sushi_order when the user specifies both count AND type\n\n- Use confirm_order when the user confirms they are satisfied with their selection\n\nPricing:\n\n- $8 per roll\n\nAfter selection, confirm both the count and type, state the price, and ask if they want to confirm their order. Remember to be friendly and casual."
@@ -129,7 +135,7 @@
       ]
     },
     "confirm": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Read back the complete order details to the user and ask for final confirmation. Use the available functions:\n\n- Use complete_order when the user confirms\n\n- Use revise_order if they want to change something\n\nBe friendly and clear when reading back the order details."
@@ -151,7 +157,7 @@
       ]
     },
     "end": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Concisely end the conversation—1-3 words is appropriate. Just say 'Bye' or something similarly short."

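The JSON example diffs repeat the same `messages` to `task_messages` rename for every node, which is easy to miss in a large flow file. The sketch below is one way to scan a flow configuration for the two keys this commit removes. `find_legacy_keys` and the sample flow are hypothetical and serve only as illustration.

```python
import json
from typing import Any, Dict, List


def find_legacy_keys(flow_config: Dict[str, Any]) -> List[str]:
    """List the locations of keys that this commit replaces.

    Hypothetical helper for illustration only; not part of pipecat-flows.
    """
    problems: List[str] = []
    if "initial_system_message" in flow_config:
        problems.append("initial_system_message")
    for name, node in flow_config.get("nodes", {}).items():
        if "messages" in node:
            problems.append(f"nodes.{name}.messages")
    return problems


# A made-up flow in which one node was migrated and one was not.
sample = json.loads(
    """
    {
      "initial_node": "start",
      "nodes": {
        "start": {
          "role_messages": [{"role": "system", "content": "You are an order-taking assistant."}],
          "task_messages": [{"role": "system", "content": "Ask if they want pizza or sushi."}]
        },
        "end": {
          "messages": [{"role": "system", "content": "Concisely end the conversation."}]
        }
      }
    }
    """
)

legacy = find_legacy_keys(sample)
print(legacy)  # ['nodes.end.messages']
```
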
diff --git a/editor/examples/movie_explorer.json b/editor/examples/movie_explorer.json
@@ -2,10 +2,16 @@
   "initial_node": "greeting",
   "nodes": {
     "greeting": {
-      "messages": [
+      "role_messages": [
         {
           "role": "system",
-          "content": "You are a helpful movie expert. Start by greeting the user and asking if they'd like to know about movies currently in theaters or upcoming releases. Wait for their choice before using either get_current_movies or get_upcoming_movies."
+          "content": "You are a friendly movie expert. Your responses will be converted to audio, so avoid special characters. Always use the available functions to progress the conversation naturally."
+        }
+      ],
+      "task_messages": [
+        {
+          "role": "system",
+          "content": "Start by greeting the user and asking if they'd like to know about movies currently in theaters or upcoming releases. Wait for their choice before using either get_current_movies or get_upcoming_movies."
         }
       ],
       "functions": [
@@ -38,7 +44,7 @@
       ]
     },
     "explore_movie": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Help the user learn more about movies. You can:\n\n- Use get_movie_details when they express interest in a specific movie\n\n- Use get_similar_movies to show recommendations\n\n- Use get_current_movies to see what's playing now\n\n- Use get_upcoming_movies to see what's coming soon\n\n- Use end_conversation when they're done exploring\n\nAfter showing details or recommendations, ask if they'd like to explore another movie or end the conversation."
@@ -120,7 +126,7 @@
       ]
     },
     "end": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Thank the user warmly and mention they can return anytime to discover more movies."

diff --git a/editor/examples/patient_intake.json b/editor/examples/patient_intake.json
@@ -2,7 +2,13 @@
   "initial_node": "start",
   "nodes": {
     "start": {
-      "messages": [
+      "role_messages": [
+        {
+          "role": "system",
+          "content": "You are Jessica, an agent for Tri-County Health Services. You must ALWAYS use one of the available functions to progress the conversation. Be professional but friendly."
+        }
+      ],
+      "task_messages": [
         {
           "role": "system",
           "content": "Start by introducing yourself to Chad Bailey, then ask for their date of birth, including the year. Once they provide their birthday, use verify_birthday to check it. If verified (1983-01-01), proceed to prescriptions."
@@ -31,7 +37,7 @@
       ]
     },
     "get_prescriptions": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "This step is for collecting prescriptions. Ask them what prescriptions they're taking, including the dosage. After recording prescriptions (or confirming none), proceed to allergies."
@@ -73,7 +79,7 @@
       ]
     },
     "get_allergies": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Collect allergy information. Ask about any allergies they have. After recording allergies (or confirming none), proceed to medical conditions."
@@ -111,7 +117,7 @@
       ]
     },
     "get_conditions": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Collect medical condition information. Ask about any medical conditions they have. After recording conditions (or confirming none), proceed to visit reasons."
@@ -149,7 +155,7 @@
       ]
     },
     "get_visit_reasons": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Collect information about the reason for their visit. Ask what brings them to the doctor today. After recording their reasons, proceed to verification."
@@ -187,7 +193,7 @@
       ]
     },
     "verify": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Review all collected information with the patient. Follow these steps:\n\n1. Summarize their prescriptions, allergies, conditions, and visit reasons\n\n2. Ask if everything is correct\n\n3. Use the appropriate function based on their response\n\nBe thorough in reviewing all details and wait for explicit confirmation."
@@ -221,7 +227,7 @@
       ]
     },
     "confirm": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Once confirmed, thank them, then use the complete_intake function to end the conversation."
@@ -243,7 +249,7 @@
       ]
     },
     "end": {
-      "messages": [
+      "task_messages": [
         {
           "role": "system",
           "content": "Thank them for their time and end the conversation."