feat: streaming TTS generation and persistent voice library by ice-product96 · Pull Request #166 · k2-fsa/OmniVoice

ice-product96 · 2026-05-23T09:40:48Z

Add VoiceLibrary class (omnivoice/utils/voice_library.py): pre-clone a voice once, save it by name to disk (~/.omnivoice/voices/), and reuse instantly without re-uploading reference audio.
Add OmniVoice.generate_streaming() method: generator that yields (audio_1d, status) tuples as each chunk is decoded, enabling progressive playback in the UI for long texts. Short texts still complete in a single pass.
Rewrite Gradio demo (omnivoice/cli/demo.py) with three tabs:
- Voice Clone — saved-voice dropdown, streaming output, quick-save panel.
- Voice Design — unchanged.
- Voice Library — clone & name any audio, manage (list/delete) saved voices; changes propagate live to all dropdowns.
Export VoiceClonePrompt and VoiceLibrary from omnivoice/__init__.py.
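The save-by-name, reuse-later idea behind the voice library can be sketched as below. This is not the PR's `VoiceLibrary` implementation: `TinyVoiceLibrary`, its method names, and the JSON-on-disk format are hypothetical stand-ins, and the real class presumably stores richer prompt data under ~/.omnivoice/voices/.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


class TinyVoiceLibrary:
    """Hypothetical stand-in: one JSON file per saved voice under `root`."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, name: str) -> Path:
        # Reject empty names and names containing path separators.
        if not name or Path(name).name != name:
            raise ValueError(f"invalid voice name: {name!r}")
        return self.root / f"{name}.json"

    def save(self, name: str, prompt: Dict[str, Any]) -> None:
        # Persist the (assumed JSON-serializable) prompt data by name.
        self._path(name).write_text(json.dumps(prompt))

    def load(self, name: str) -> Dict[str, Any]:
        # Reuse a saved voice without touching the reference audio again.
        return json.loads(self._path(name).read_text())

    def names(self) -> List[str]:
        # Sorted names of all saved voices, e.g. to fill a dropdown.
        return sorted(p.stem for p in self.root.glob("*.json"))

    def delete(self, name: str) -> None:
        self._path(name).unlink()
```

A UI would call `names()` after every save or delete to refresh its dropdowns, which is one simple way to get the "changes propagate live" behaviour described above.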

- Add `VoiceLibrary` class (omnivoice/utils/voice_library.py): pre-clone a voice once, save it by name to disk (~/.omnivoice/voices/), and reuse instantly without re-uploading reference audio.
- Add `OmniVoice.generate_streaming()` method: generator that yields (audio_1d, status) tuples as each chunk is decoded, enabling progressive playback in the UI for long texts. Short texts still complete in a single pass.
- Rewrite Gradio demo (omnivoice/cli/demo.py) with three tabs:
  * Voice Clone: saved-voice dropdown, streaming output, quick-save panel.
  * Voice Design: unchanged.
  * Voice Library: clone & name any audio, manage (list/delete) saved voices; changes propagate live to all dropdowns.
- Export `VoiceClonePrompt` and `VoiceLibrary` from omnivoice/__init__.py.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
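To illustrate how a caller might consume a generator that yields (audio_1d, status) tuples, here is a minimal sketch. `fake_generate_streaming` is a hypothetical stand-in that emits silence, not the real `OmniVoice.generate_streaming()`; its parameters and the status string format are assumptions made for the example.

```python
from typing import Callable, Iterable, Iterator, List, Optional, Tuple

Chunk = Tuple[List[float], str]


def fake_generate_streaming(
    text: str, chunk_chars: int = 40, samples_per_char: int = 10
) -> Iterator[Chunk]:
    """Hypothetical stand-in: yields (audio_1d, status) once per text chunk."""
    pieces = [text[i:i + chunk_chars] for i in range(0, len(text), chunk_chars)] or [""]
    for idx, piece in enumerate(pieces, start=1):
        # Placeholder audio (silence); a real model would decode speech here.
        audio_1d = [0.0] * (len(piece) * samples_per_char)
        yield audio_1d, f"chunk {idx}/{len(pieces)}"


def collect_stream(
    stream: Iterable[Chunk],
    on_chunk: Optional[Callable[[List[float], str], None]] = None,
) -> Tuple[List[float], List[str]]:
    """Consume the stream chunk by chunk and return the full audio and statuses."""
    audio: List[float] = []
    statuses: List[str] = []
    for audio_1d, status in stream:
        if on_chunk is not None:
            on_chunk(audio_1d, status)  # e.g. push the chunk to a player or UI
        audio.extend(audio_1d)
        statuses.append(status)
    return audio, statuses
```

Because the consumer only relies on the tuple shape, the same loop handles a short text that arrives as a single chunk and a long text that arrives as many.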

New package omnivoice/api/server.py:
- GET /health, /info: system status & model info
- GET /voices: list all saved voice clones
- GET /voices/{name}: get single voice metadata
- POST /voices/clone: clone voice from uploaded audio
- DELETE /voices/{name}: delete a voice
- PATCH /voices/{name}: rename a voice
- POST /generate: sync TTS → complete WAV file
- POST /generate/stream: chunked streaming (PCM frames)
- WS /ws/generate: WebSocket real-time streaming
- Interactive docs at /docs (Swagger UI)
- Optional X-Api-Key auth, CORS, single GPU lock

Add omnivoice-api CLI entry point and [api] optional deps group.
Add API_INTEGRATION.md with full usage guide.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
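As a rough sketch of how a client might address the voice endpoints listed above, the snippet below only builds requests with the standard library and does not send them. The base URL, port, and key value are assumptions, `build_request` is a hypothetical helper, and request or response bodies are left out because their formats are not described here.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_request(
    base_url: str, method: str, path: str, api_key: Optional[str] = None
) -> Request:
    """Hypothetical helper: build (but do not send) a request to the API server."""
    req = Request(base_url.rstrip("/") + path, method=method)
    if api_key:
        # Optional auth header named in the endpoint list above.
        req.add_header("X-Api-Key", api_key)
    return req


BASE_URL = "http://localhost:8000"  # assumed host and port

# GET /voices: list all saved voice clones.
list_req = build_request(BASE_URL, "GET", "/voices", api_key="secret")

# DELETE /voices/{name}: percent-encode the name so spaces are URL-safe.
delete_req = build_request(BASE_URL, "DELETE", "/voices/" + quote("my voice", safe=""))
```

Passing either object to `urllib.request.urlopen` would perform the call against a running server.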

ice-product96 and others added 3 commits May 23, 2026 14:39

docs: add API_ENDPOINTS.md with full endpoint reference

bdba8c4

Covers all endpoints with parameters, response formats, binary framing protocol, JS/Python client examples and Docker setup. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ice-product96 wants to merge 3 commits into k2-fsa:master from ice-product96:feature/streaming-voice-library
