A high-performance real-time AI framework for audio and video processing
Overview
Realtime AI is a WebRTC-based framework for building low-latency AI applications with audio and video. It features a modular pipeline architecture inspired by GStreamer, enabling you to compose processing elements for speech recognition, LLM interactions, and text-to-speech.
git clone https://github.com/realtime-ai/realtime-ai.git
cd realtime-ai
go mod download
Run Example
# Set API key
export GOOGLE_API_KEY="your_api_key"
# Run Gemini assistant
go run examples/gemini-assis/main.go
# Open browser
open http://localhost:8080
Basic Usage
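// Create the pipeline (named p so the variable does not shadow the pipeline package)
p := pipeline.NewPipeline("assistant")
// Create the elements; apiKey is assumed to hold your API key (e.g. the GOOGLE_API_KEY value)
resample := elements.NewAudioResampleElement("resample")
gemini := elements.NewGeminiElement("gemini", apiKey)
audioPacer := elements.NewAudioPacerSinkElement("audioPacer")
// Link the elements in processing order: resample -> gemini -> audioPacer
p.Link(resample, gemini)
p.Link(gemini, audioPacer)
// Start processing; ctx is assumed to be a context.Context supplied by the caller
p.Start(ctx)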
Documentation
CLAUDE.md - Development guide and architecture details
Realtime AI
Architecture:
Features
Quick Start
Installation
macOS:
Ubuntu/Debian (install script recommended):
Ubuntu/Debian (manual installation):
Setup:
Project Structure
License
Apache License 2.0 - see LICENSE for details.
Status
⚠️ Active Development - APIs may change without notice.