Skip to content

includes 2 full fastapi webapp files 1: simple voice chat template 2: out-of-band responses voice chat template

Notifications You must be signed in to change notification settings

unseen22/Out-of-band-responses-template-for-OpenAI-real-time-voice-chat

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Real-Time Voice Chat Experiments

This repository contains two implementations of real-time voice chat applications using OpenAI's Realtime API with WebRTC.

Prerequisites

  • Python 3.8+
  • OpenAI API key with Realtime API access
  • Modern web browser with WebRTC support

Installation

  1. Clone the repository
  2. Install the required packages:
pip install fastapi uvicorn requests termcolor
  1. Set your OpenAI API key as an environment variable:
# For Linux/Mac
export OPENAI_API_KEY=your_api_key_here

# For Windows
set OPENAI_API_KEY=your_api_key_here

Applications

1. Basic Voice & Text Chat (1_basic_voice_text_chat.py)

A straightforward implementation of two-way voice and text chat with the following features:

  • Real-time voice communication using WebRTC
  • Text chat capability
  • Beautiful dark mode UI with glass-morphism effects
  • Error handling and status updates
  • Animated UI elements

To run:

python 1_basic_voice_text_chat.py

2. Enhanced Chat with Real-time Classification (2_out_of_band_responses.py)

An enhanced version that adds real-time conversation classification:

  • All features from the basic version
  • Real-time conversation classification into categories:
    • General
    • Philosophical
    • Math
    • Technology
  • Classifications for both voice and text inputs
  • Side panel showing conversation classifications with timestamps
  • Color-coded classification display
  • Out-of-band processing to maintain smooth chat experience

To run:

python 2_out_of_band_responses.py

Technical Details

WebRTC Implementation

  • Uses OpenAI's Realtime API for WebRTC signaling
  • Handles audio streams for voice communication
  • Manages data channels for text and control messages
  • Implements proper connection lifecycle management

Out-of-Band Processing (Enhanced Version)

  • Uses separate processing for classifications without affecting main conversation
  • Analyzes entire conversation context for accurate classification
  • Implements metadata-based response handling
  • Maintains conversation state independently of classifications

UI Features

  • Built with Tailwind CSS and DaisyUI
  • Responsive design
  • Glass-morphism effects
  • Smooth animations using CSS transitions
  • Dark mode optimized

Usage

  1. Start either application using the commands above
  2. Click "Connect & Start Chat"
  3. Allow microphone access when prompted
  4. Start chatting using either:
    • Voice (just speak)
    • Text (type and press Enter or click Send)
  5. For the enhanced version, watch the classifications appear in real-time

Error Handling

Both implementations include comprehensive error handling for:

  • API key issues
  • Connection problems
  • Microphone access
  • WebRTC negotiation
  • Data channel communication

Errors are:

  • Logged to the console
  • Displayed in the UI
  • Color-coded in the terminal (using termcolor)

Security Notes

  • Uses ephemeral tokens for client-side API access
  • Handles API keys securely through environment variables
  • Implements proper WebRTC security practices

Limitations

  • Maximum session duration: 30 minutes
  • Requires modern browser with WebRTC support
  • Needs stable internet connection for voice chat
  • API key must have Realtime API access enabled

Contributing

Feel free to submit issues and enhancement requests!

About

includes 2 full fastapi webapp files 1: simple voice chat template 2: out-of-band responses voice chat template

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • JavaScript 69.0%
  • Python 16.7%
  • HTML 10.7%
  • CSS 3.6%