AI Voice Bridge

A modular, low-latency voice gateway that relays real-time audio between clients and Google's Gemini Live API.

Features

🎙️ Real-time voice streaming over WebSocket
🔄 Two connection modes: On-Demand (walkie-talkie) or Always-On (persistent)
📝 Automatic subtitles for the AI's responses
🔌 Client-agnostic: works with any WebSocket client (Unreal, Unity, Web, etc.)
⚡ Low latency: audio is streamed directly, with no intermediate processing

Quick Start

1. Install

# Clone the repository and enter it
git clone https://github.com/your-username/ai-voice-bridge.git
cd ai-voice-bridge

# Create and activate the virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1  # Windows
# source .venv/bin/activate   # Linux/Mac

# Install the package in editable mode
pip install -e .

2. Configure

cp .env.example .env
# Edit .env and add your GOOGLE_API_KEY

3. Run

python -m ai_voice_bridge

Configuration

Variable	Default	Description
`GOOGLE_API_KEY`	required	Your Gemini API key
`CONNECTION_MODE`	`ON_DEMAND`	`ON_DEMAND` or `ALWAYS_ON`
`GEMINI_MODEL`	`gemini-2.0-flash-exp`	Gemini model to use
`GEMINI_VOICE`	`Aoede`	Voice used for the AI's responses
`WS_PORT`	`8765`	WebSocket server port
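
As an illustration of how these variables and defaults fit together, here is a minimal sketch of a loader. The `BridgeConfig` class and `load_config` function are hypothetical names used only for this example; the bridge's actual configuration code may be organized differently.

```python
import os
from dataclasses import dataclass

VALID_MODES = {"ON_DEMAND", "ALWAYS_ON"}


@dataclass(frozen=True)
class BridgeConfig:
    """Hypothetical container mirroring the configuration table."""
    google_api_key: str
    connection_mode: str = "ON_DEMAND"
    gemini_model: str = "gemini-2.0-flash-exp"
    gemini_voice: str = "Aoede"
    ws_port: int = 8765


def load_config(env=None) -> BridgeConfig:
    """Build a BridgeConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env

    # GOOGLE_API_KEY is the only variable without a default.
    api_key = env.get("GOOGLE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("GOOGLE_API_KEY is required")

    # Normalizing case is an assumption of this sketch, not documented behavior.
    mode = env.get("CONNECTION_MODE", "ON_DEMAND").strip().upper()
    if mode not in VALID_MODES:
        raise ValueError(f"CONNECTION_MODE must be one of {sorted(VALID_MODES)}")

    return BridgeConfig(
        google_api_key=api_key,
        connection_mode=mode,
        gemini_model=env.get("GEMINI_MODEL", "gemini-2.0-flash-exp"),
        gemini_voice=env.get("GEMINI_VOICE", "Aoede"),
        ws_port=int(env.get("WS_PORT", "8765")),
    )
```

Passing a plain dictionary instead of relying on `os.environ` keeps the loader easy to exercise, for example `load_config({"GOOGLE_API_KEY": "...", "WS_PORT": "9000"})`.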

WebSocket Protocol

Client → Bridge

{"type": "start_talking"}
{"type": "stop_talking"}
// Binary frames: 16-bit PCM, 16 kHz, mono
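
The sketch below shows one way a client could order its messages for a single push-to-talk turn: a `start_talking` text message, then raw PCM as binary frames, then `stop_talking`. The helper names and the 20 ms frame length are assumptions made for this example; the protocol above does not prescribe a chunk size. At 16 kHz, 16-bit mono, one second of audio is 32,000 bytes, so a 20 ms frame is 640 bytes.

```python
import json
from typing import Iterator, Union

SAMPLE_RATE_HZ = 16_000  # client -> bridge rate from the protocol
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


def frame_size_bytes(frame_ms: int) -> int:
    """Bytes of PCM needed to hold frame_ms milliseconds of audio."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * frame_ms // 1000


def talk_turn(pcm: bytes, frame_ms: int = 20) -> Iterator[Union[str, bytes]]:
    """Yield the messages for one turn, in the order they would be sent.

    str items stand for WebSocket text messages, bytes items for binary frames.
    """
    size = frame_size_bytes(frame_ms)
    yield json.dumps({"type": "start_talking"})
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]  # the last frame may be shorter
    yield json.dumps({"type": "stop_talking"})
```

Each yielded item can be handed to whatever send call the client's WebSocket library provides; the generator itself does no network I/O.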

Bridge → Client

{"type": "connected"}
{"type": "ready"}
{"type": "speaking", "value": true}
{"type": "subtitle", "text": "Hello!"}
{"type": "turn_complete"}
// Binary frames: 16-bit PCM, 24 kHz, mono
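
On the receiving side, a client has to tell JSON events apart from audio frames. The following is a rough sketch of such a dispatcher, assuming text messages arrive as `str` and binary frames as `bytes`. `ClientState` and `handle_message` are hypothetical names, and unknown event types are silently ignored here as a design choice of the example.

```python
import json
from dataclasses import dataclass, field

OUTPUT_RATE_HZ = 24_000  # bridge -> client rate from the protocol
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


@dataclass
class ClientState:
    """Hypothetical client-side view of the session."""
    connected: bool = False
    ready: bool = False
    speaking: bool = False
    turns_completed: int = 0
    subtitles: list = field(default_factory=list)
    audio: bytearray = field(default_factory=bytearray)

    def buffered_seconds(self) -> float:
        """Duration of the audio received so far."""
        return len(self.audio) / (OUTPUT_RATE_HZ * BYTES_PER_SAMPLE)


def handle_message(state: ClientState, message) -> None:
    """Apply one incoming message (text event or binary audio) to the state."""
    if isinstance(message, (bytes, bytearray)):
        state.audio.extend(message)  # raw PCM, ready for a playback buffer
        return

    event = json.loads(message)
    kind = event.get("type")
    if kind == "connected":
        state.connected = True
    elif kind == "ready":
        state.ready = True
    elif kind == "speaking":
        state.speaking = bool(event.get("value"))
    elif kind == "subtitle":
        state.subtitles.append(event.get("text", ""))
    elif kind == "turn_complete":
        state.turns_completed += 1
```

A real client would route the audio bytes to its playback device instead of accumulating them; the buffer here only makes the data flow visible.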

Architecture

┌─────────────┐     WebSocket     ┌─────────────┐     WebSocket     ┌─────────────┐
│   Client    │◄─────────────────►│ VoiceBridge │◄─────────────────►│ Gemini Live │
│  (Unreal)   │  audio + control  │  (Python)   │  audio + events   │     API     │
└─────────────┘                   └─────────────┘                   └─────────────┘
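
The diagram describes a relay with two directions running at once. The sketch below illustrates that shape only, using `asyncio.Queue` objects as in-memory stand-ins for the two WebSocket connections. The names `pump`, `relay`, and `demo`, and the use of `None` as an end-of-stream marker, are assumptions of this example and say nothing about how the bridge is actually implemented.

```python
import asyncio


async def pump(source: asyncio.Queue, sink: asyncio.Queue) -> None:
    """Forward items from source to sink until a None sentinel arrives."""
    while True:
        item = await source.get()
        await sink.put(item)
        if item is None:  # sentinel is forwarded, then this direction stops
            return


async def relay(client_in, upstream_out, upstream_in, client_out) -> None:
    """Run both directions concurrently so neither blocks the other."""
    await asyncio.gather(
        pump(client_in, upstream_out),   # client audio + control -> upstream
        pump(upstream_in, client_out),   # upstream audio + events -> client
    )


async def demo():
    """Push one item through each direction and return what came out."""
    client_in, upstream_out, upstream_in, client_out = (
        asyncio.Queue() for _ in range(4)
    )
    for item in (b"mic-audio", None):
        client_in.put_nowait(item)
    for item in (b"voice-audio", None):
        upstream_in.put_nowait(item)
    await relay(client_in, upstream_out, upstream_in, client_out)
    return upstream_out.get_nowait(), client_out.get_nowait()
```

Running `asyncio.run(demo())` pushes one item through each direction. Keeping the two directions as independent tasks means a slow consumer on one side does not stall audio flowing the other way.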

Credits

This project was inspired by and builds on ideas from:

Jordan Gibbs and the Hypercheap project, whose work on real-time AI voice streaming provided valuable insights for this implementation.

License

MIT
