Dub audiobooks into any language using AI. Transcribe → Translate → Synthesize with Whisper, LLMs, and ElevenLabs.

DreamTeamMobile/audiodub

audiodub

🎧 Dub audiobooks into any language using AI

Translate audiobooks end-to-end: Audio → Transcribe (Whisper) → Translate (LLM) → Synthesize (ElevenLabs) → New Audiobook

Features

  • Transcription: MLX-Whisper optimized for Apple Silicon
  • Translation: OpenAI API or Claude CLI with smart context
  • Synthesis: ElevenLabs v3 with any voice
  • Chapter detection: Automatic or size-based splitting
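Size-based splitting can be pictured with a minimal sketch like the one below. It is not audiodub's actual implementation: the function name `split_by_size`, the `max_chars` default, and the choice to split on blank-line paragraph boundaries are all assumptions made for illustration.

```python
def split_by_size(text: str, max_chars: int = 5000) -> list[str]:
    """Greedily pack paragraphs into chunks of at most max_chars characters.

    Hypothetical helper: audiodub's real splitting logic may differ.
    """
    chunks: list[str] = []
    current = ""
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if not paragraph:
            continue
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars or not current:
            # Either it fits, or it is a single oversized paragraph kept whole.
            current = candidate
        else:
            # Adding this paragraph would overflow: close the current chunk.
            chunks.append(current)
            current = paragraph
    if current:
        chunks.append(current)
    return chunks
```

Splitting on paragraph boundaries rather than at a fixed character offset keeps sentences intact, which tends to matter for both translation and speech synthesis.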

Requirements

  • Python 3.12+
  • macOS with Apple Silicon (for MLX-Whisper)
  • ffmpeg (brew install ffmpeg)
  • API keys: ElevenLabs, OpenAI/Anthropic
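Before installing, the requirements above can be checked with a short preflight script. This is a sketch, not part of audiodub: `check_requirements` is a hypothetical helper, and it assumes that Apple Silicon reports `arm64` from `platform.machine()` (which will not hold for a Python running under Rosetta).

```python
import platform
import shutil
import sys


def check_requirements() -> list[str]:
    """Return a list of human-readable problems; empty means all checks passed."""
    problems: list[str] = []
    if sys.version_info < (3, 12):
        problems.append("Python 3.12+ is required")
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH (try: brew install ffmpeg)")
    if not (platform.system() == "Darwin" and platform.machine() == "arm64"):
        problems.append("MLX-Whisper expects macOS on Apple Silicon")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print(f"missing: {problem}")
```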

Installation

# Clone and install
git clone <repo>
cd audiodub

# Using uv (recommended)
uv sync

# Or pip
pip install -e .

Quick Start

# Set up environment
cp .env.example .env
# Edit .env with your API keys

# Full pipeline
uv run audiodub full-pipeline audiobook.mp3 \
  --voice-id YOUR_ELEVENLABS_VOICE_ID \
  --target-lang Russian \
  -v

Step-by-Step Usage

# 1. Transcribe
uv run audiodub transcribe audiobook.mp3 -o output -v

# 2. Generate smart context
uv run audiodub generate-context output -p claude-cli -v

# 3. Translate a single chapter (as a test run)
uv run audiodub translate output -c 1 -p claude-cli -v

# 4. Translate all
uv run audiodub translate output -p claude-cli -v

# 5. Synthesize
uv run audiodub synthesize output --voice-id YOUR_VOICE_ID -v

# 6. Merge
uv run audiodub merge output -v
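The same steps can be driven from a script, for example to re-run the pipeline for several books. The sketch below only assembles the commands shown above, using the flags that appear in this README. `build_steps` and `run_steps` are hypothetical helper names, and the single-chapter test translation (step 3) is left out.

```python
import subprocess


def build_steps(audio: str, out_dir: str, voice_id: str,
                provider: str = "claude-cli") -> list[list[str]]:
    """Build the argument lists for each pipeline step, in order."""
    base = ["uv", "run", "audiodub"]
    return [
        base + ["transcribe", audio, "-o", out_dir, "-v"],
        base + ["generate-context", out_dir, "-p", provider, "-v"],
        base + ["translate", out_dir, "-p", provider, "-v"],
        base + ["synthesize", out_dir, "--voice-id", voice_id, "-v"],
        base + ["merge", out_dir, "-v"],
    ]


def run_steps(steps: list[list[str]]) -> None:
    """Run each step; check=True stops at the first failing command."""
    for cmd in steps:
        subprocess.run(cmd, check=True)
```

Calling `run_steps(build_steps("audiobook.mp3", "output", "YOUR_VOICE_ID"))` would then run the steps one after another.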

CLI Reference

See CLI.md for full command documentation with all options and examples.

# Quick help
uv run audiodub --help
uv run audiodub <command> --help

Output Structure

output/
├── full_transcript.txt        # Complete transcript
├── context.txt                # Smart context for translation
├── final.mp3                  # Merged audiobook
└── chapters/
    ├── 001_chapter.txt        # Source text
    ├── 001_chapter_ru.txt     # Translated text
    ├── 001_chapter_ru.mp3     # Synthesized audio
    ├── 001_chapter_proofread.txt  # Side-by-side comparison
    └── ...
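Given this layout, per-chapter progress can be inferred from which files exist. The following is a sketch under the assumption that file names follow exactly the pattern shown above, with the target language code (`ru` here) as a suffix. `chapter_status` is a hypothetical helper, not an audiodub command.

```python
from pathlib import Path


def chapter_status(out_dir: str, lang: str = "ru") -> dict[str, dict[str, bool]]:
    """Map each source chapter to whether its translation and audio exist."""
    chapters = Path(out_dir) / "chapters"
    status: dict[str, dict[str, bool]] = {}
    # "*_chapter.txt" matches source files only; translated and proofread
    # files end in "_<lang>.txt" and "_proofread.txt" respectively.
    for source in sorted(chapters.glob("*_chapter.txt")):
        stem = source.stem  # e.g. "001_chapter"
        status[stem] = {
            "translated": (chapters / f"{stem}_{lang}.txt").exists(),
            "synthesized": (chapters / f"{stem}_{lang}.mp3").exists(),
        }
    return status
```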

ElevenLabs Models

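| Model             | Quality | Speed  | Use Case                           |
|-------------------|---------|--------|------------------------------------|
| eleven_v3         | Best    | Slower | Final production audiobooks        |
| eleven_flash_v2_5 | Good    | Fast   | Testing, budget-conscious projects |
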
# Use flash model for faster/cheaper synthesis
uv run audiodub synthesize output --voice-id YOUR_ID --model eleven_flash_v2_5 -v

Note: The Flash model has slightly less natural intonation. It is recommended for testing or when optimizing costs.

Environment Variables

# .env file
ELEVENLABS_API_KEY=...
OPENAI_API_KEY=...
LLM_PROVIDER=openai-api  # or claude-cli
CLAUDE_CLI_PATH=/path/to/claude
ELEVENLABS_COST_PER_1K=0.18  # For cost tracking (Creator: 0.30, Pro: 0.18, Scale: 0.11)
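To see how `ELEVENLABS_COST_PER_1K` might feed into cost tracking, here is a rough sketch. It assumes the rate applies per 1,000 characters of synthesized text, which this README does not state explicitly. `estimate_cost` is a hypothetical helper, and audiodub's own accounting may differ.

```python
import os
from typing import Optional


def estimate_cost(text: str, cost_per_1k: Optional[float] = None) -> float:
    """Estimate synthesis cost, assuming the rate is per 1,000 characters."""
    if cost_per_1k is None:
        # Fall back to the environment variable, then to the Pro-tier example rate.
        cost_per_1k = float(os.environ.get("ELEVENLABS_COST_PER_1K", "0.18"))
    return len(text) / 1000 * cost_per_1k
```

Running this over a translated chapter file before synthesis gives a ballpark figure for what that chapter would cost at the configured rate.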

License

MIT
