This guide will help you install and set up VoiceTransor on your system.
中文安装指南 (Chinese Installation Guide)
- Download
- System Requirements
- Install VoiceTransor
- Install FFmpeg (Required)
- Install Ollama (Optional)
- Verify Installation
- Troubleshooting
Download the latest release for your platform:
Windows:
- VoiceTransor-v0.9.0-Windows-x64.zip (~4GB) - Universal build for all systems
- OR VoiceTransor-v0.9.0-Windows-x64-Setup.exe (~450MB) - Installer version
VoiceTransor uses a single build that works for everyone:
- ✅ Have NVIDIA GPU? Automatically uses CUDA acceleration
- ✅ No GPU? Automatically uses CPU (works perfectly, just slower)
- ✅ No need to choose between CPU/GPU versions
- ✅ Same installer works on all Windows 10+ systems
This is similar to how Ollama works - one download, automatic hardware detection.
Operating System:
- Windows 10 or later (64-bit)
- macOS 10.15 (Catalina) or later
- Linux (Ubuntu 20.04+, Debian 11+, or equivalent)
Minimum hardware:
- 8GB RAM
- 5GB free disk space (for application and models)
- Dual-core processor
Recommended hardware:
- 16GB RAM
- 10GB free disk space
- Quad-core processor or better
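The minimum figures above (8GB RAM, 5GB free disk, two cores) can be checked from a terminal. The following is a rough sketch for Linux only, not part of VoiceTransor: it assumes `/proc/meminfo`, `nproc`, and `df` are available, and the function names are made up for illustration. Note that `nproc` counts logical CPUs, which may be higher than the number of physical cores.

```shell
# Sketch: compare this machine against the minimum requirements (Linux only).
MIN_RAM_GB=8
MIN_DISK_GB=5
MIN_CORES=2

# meets_minimum LABEL ACTUAL REQUIRED
# Prints OK or LOW; returns 0 only when ACTUAL >= REQUIRED.
meets_minimum() {
  local label="$1" actual="$2" required="$3"
  if [ "$actual" -ge "$required" ]; then
    echo "OK   $label: $actual (minimum $required)"
  else
    echo "LOW  $label: $actual (minimum $required)"
    return 1
  fi
}

# Total RAM in GB, rounded up, because an "8GB" machine reports slightly less.
ram_gb() {
  awk '/^MemTotal:/ { printf "%d\n", ($2 + 1048575) / 1048576 }' /proc/meminfo
}

# Free space in whole GB on the filesystem that holds the given directory.
disk_free_gb() {
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1048576 }'
}

if [ -r /proc/meminfo ]; then
  meets_minimum "RAM (GB)" "$(ram_gb)" "$MIN_RAM_GB" || true
  meets_minimum "Free disk in HOME (GB)" "$(disk_free_gb "$HOME")" "$MIN_DISK_GB" || true
  meets_minimum "Logical CPUs" "$(nproc)" "$MIN_CORES" || true
fi
```

A `LOW` line does not necessarily mean the application will fail, only that the machine is below the documented minimum.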
NVIDIA GPU:
- GTX 900 series or newer (GTX 1050, RTX 20/30/40 series, etc.)
- 4GB+ VRAM
- Driver version >= 525.60.13
Apple Silicon:
- M1, M2, or M3 chip
- Automatically detected and used
Note: GPU acceleration is optional. The application works perfectly fine on CPU, just slower.
Windows:
Extract the archive:
- Right-click VoiceTransor-Windows.zip
- Select "Extract All..."
- Choose a destination folder (e.g., C:\Program Files\VoiceTransor)
Launch the application:
- Navigate to the extracted folder
- Double-click VoiceTransor.exe
Windows Security Warning:
- If you see "Windows protected your PC", click "More info"
- Then click "Run anyway"
- This is normal for unsigned applications
macOS:
Extract the archive:
- Double-click VoiceTransor-macOS.zip
- Move VoiceTransor.app to the Applications folder
First launch:
- Right-click VoiceTransor.app and select "Open"
- Click "Open" in the security dialog
Grant permissions:
- Allow access to files when prompted
Linux:
Extract the archive:
unzip VoiceTransor-Linux.zip -d ~/Applications/VoiceTransor
cd ~/Applications/VoiceTransor
Make executable:
chmod +x VoiceTransor
Launch:
./VoiceTransor
VoiceTransor requires FFmpeg to process audio files. You must install it separately.
Windows:
Option 1: Manual Download (Recommended)
- Download FFmpeg from: https://www.gyan.dev/ffmpeg/builds/
- Download "ffmpeg-release-essentials.zip"
- Extract to C:\ffmpeg
- Add to PATH:
  - Press Win + R, type sysdm.cpl, press Enter
  - Go to "Advanced" tab
  - Click "Environment Variables"
  - Under "System variables", find "Path" and click "Edit"
  - Click "New"
  - Add: C:\ffmpeg\bin
  - Click "OK" on all windows
- Restart your computer (or at least log out and back in)
Option 2: Using Package Manager
If you have Chocolatey:
choco install ffmpeg
If you have Scoop:
scoop install ffmpeg
Verify installation:
ffmpeg -version
macOS:
Using Homebrew (Recommended):
brew install ffmpeg
Verify installation:
ffmpeg -version
Linux:
Ubuntu/Debian:
sudo apt update
sudo apt install ffmpeg
Fedora:
sudo dnf install ffmpeg
Arch Linux:
sudo pacman -S ffmpeg
Verify installation:
ffmpeg -version
Ollama enables AI-powered text processing (summarize, translate, extract key points, etc.). This is optional but recommended.
- Local AI models (no cloud, your data stays private)
- Works on both CPU and GPU
- Required for text processing features in VoiceTransor
Windows:
- Download installer from: https://ollama.com/download
- Run the installer
- Open Command Prompt and verify:
ollama --version
macOS:
- Download from: https://ollama.com/download
- Install the .dmg file
- Verify in Terminal:
ollama --version
Linux:
curl -fsSL https://ollama.com/install.sh | sh
After installing Ollama:
Start Ollama service (if not auto-started):
ollama serve
Pull a model (in a new terminal):
# For English
ollama pull llama3.1:8b
# For Chinese/English
ollama pull qwen2.5:7b
Model sizes:
- llama3.1:8b: ~4.7GB
- qwen2.5:7b: ~4.4GB
Note: Models are downloaded to:
- Windows: %USERPROFILE%\.ollama\models
- macOS/Linux: ~/.ollama/models
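Because each model is a multi-gigabyte download, it can help to pull one only when it is missing. The sketch below is one possible way to do that, not something VoiceTransor ships: it parses the first column of `ollama list`, and the function names `model_in_list` and `ensure_model` are made up for illustration. It assumes the Ollama service is already running and that the model tag is written exactly as it appears in the list.

```shell
# Sketch: pull an Ollama model only if it is not already installed.

# Reads `ollama list` output on stdin (header line first) and succeeds
# if the NAME column contains exactly the requested model tag.
model_in_list() {
  local model="$1"
  awk 'NR > 1 { print $1 }' | grep -Fxq -- "$model"
}

ensure_model() {
  local model="$1"
  if ! command -v ollama >/dev/null 2>&1; then
    echo "ollama not found on PATH; install it first" >&2
    return 1
  fi
  if ollama list | model_in_list "$model"; then
    echo "$model is already installed"
  else
    ollama pull "$model"
  fi
}
```

For example, `ensure_model qwen2.5:7b` prints a message if the model is present and downloads it otherwise.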
- Launch VoiceTransor
- Try importing a small audio file
- If FFmpeg is working, you should see audio information
- Go to transcription settings
- Select Device: "auto" or "cuda"
- Start a transcription
- Check the logs - should mention using CUDA
If GPU is not detected, the app will automatically use CPU.
- In VoiceTransor, try "Run Text Operation"
- Select a preset (e.g., "Summarize")
- If Ollama is running and has a model, it should work
Cause: FFmpeg is not installed or not in PATH.
Solution:
- Verify FFmpeg is installed: ffmpeg -version
- If not found, reinstall FFmpeg
- Make sure FFmpeg is in your PATH
- Restart VoiceTransor (or your computer)
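When FFmpeg is installed but still reported as missing, the usual culprit is PATH. The steps above can be sketched as a small diagnostic for macOS, Linux, or a Bash shell on Windows. The helper names `find_tool` and `show_path` are illustrative only.

```shell
# Sketch: report whether a command is reachable through PATH, and where.
find_tool() {
  local tool="$1" location
  if location=$(command -v "$tool" 2>/dev/null); then
    echo "$tool found at $location"
  else
    echo "$tool not found on PATH"
    return 1
  fi
}

# Print PATH one directory per line, which makes a missing or
# mistyped entry easier to spot.
show_path() {
  printf '%s\n' "$PATH" | tr ':' '\n'
}

if ! find_tool ffmpeg; then
  echo "Directories currently searched:"
  show_path
fi
```

If the directory that contains the `ffmpeg` binary is not in the printed list, add it to PATH and open a new terminal before retrying.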
Cause: Ollama service is not started.
Solution:
- Open a terminal
- Run: ollama serve
- Keep this terminal open
- Try again in VoiceTransor
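To tell whether the Ollama service is actually up before retrying, you can probe its HTTP endpoint. This is a minimal sketch that assumes Ollama is listening on its default address, `http://localhost:11434`, and that `curl` is installed. `OLLAMA_URL` is a variable local to this sketch, not a setting read by Ollama or VoiceTransor, and whether VoiceTransor connects to the same address is an assumption.

```shell
# Sketch: check whether an Ollama server answers on its default address.
OLLAMA_URL=${OLLAMA_URL:-http://localhost:11434}

# ollama_is_up [URL] -> succeeds when the server answers within 3 seconds.
ollama_is_up() {
  local url="${1:-$OLLAMA_URL}"
  # /api/version is a lightweight endpoint; --fail turns HTTP errors
  # into a non-zero exit status.
  curl --silent --fail --max-time 3 "$url/api/version" >/dev/null 2>&1
}

if ollama_is_up; then
  echo "Ollama is reachable at $OLLAMA_URL"
else
  echo "Ollama is not reachable at $OLLAMA_URL; start it with: ollama serve"
fi
```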
Auto-start Ollama (Optional):
- Windows: Create a scheduled task
- macOS: Add to Login Items
- Linux: Enable systemd service
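For the Linux case, the following sketch only inspects the system and prints a suggestion; it changes nothing. It assumes the unit is named `ollama.service`, which is the name the official install script uses when systemd is present. If you installed Ollama another way, the unit may not exist. The function name `autostart_hint` is illustrative.

```shell
# Sketch (Linux with systemd): report whether Ollama starts at boot.
SERVICE=ollama.service   # assumed unit name, see the note above

autostart_hint() {
  if ! command -v systemctl >/dev/null 2>&1; then
    echo "systemctl not found; this system does not appear to use systemd"
  elif ! systemctl cat "$SERVICE" >/dev/null 2>&1; then
    echo "$SERVICE was not found; Ollama may have been installed without a service"
  elif systemctl is-enabled --quiet "$SERVICE"; then
    echo "$SERVICE already starts at boot"
  else
    echo "To enable it, run: sudo systemctl enable --now $SERVICE"
  fi
}

autostart_hint
```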
Windows:
- Try running as Administrator
- Check Windows Defender hasn't blocked it
macOS:
- Go to System Preferences → Security & Privacy (System Settings → Privacy & Security on macOS 13 and later)
- Click "Open Anyway"
Linux:
- Check file permissions: chmod +x VoiceTransor
- Install required libraries: sudo apt install libxcb-cursor0
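Before installing anything on Linux, you can check whether the library is already visible to the dynamic linker. This is a hedged sketch: it assumes the `libxcb-cursor0` package provides a file named `libxcb-cursor.so.*`, and the helper names are made up for illustration. It only reads the linker cache via `ldconfig -p`.

```shell
# Sketch (Linux): check whether the linker cache knows a shared library.

# Reads `ldconfig -p` output on stdin; succeeds if an entry is NAME.so*.
lib_in_cache() {
  local name="$1"
  awk '{ print $1 }' | grep -q -- "^${name}\.so"
}

check_lib() {
  local name="$1" ldconfig_bin
  # ldconfig often lives in /sbin, which may not be on a normal user's PATH.
  ldconfig_bin=$(command -v ldconfig 2>/dev/null || echo /sbin/ldconfig)
  if "$ldconfig_bin" -p 2>/dev/null | lib_in_cache "$name"; then
    echo "$name: found"
  else
    echo "$name: missing"
    return 1
  fi
}

check_lib libxcb-cursor || echo "On Ubuntu/Debian: sudo apt install libxcb-cursor0"
```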
Check your GPU:
# NVIDIA
nvidia-smi
Update drivers:
- NVIDIA: Download from https://www.nvidia.com/drivers
- Minimum version: 525.60.13 for CUDA 12.1
Don't worry if GPU doesn't work:
- The app will automatically use CPU
- Everything will still work, just slower
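To see whether the installed driver meets the minimum quoted above, the version reported by `nvidia-smi` can be compared against 525.60.13. The sketch below is illustrative and not part of VoiceTransor. It relies on `sort -V` for version ordering, and `version_ge` is a helper defined here, not an existing command.

```shell
# Sketch: compare the installed NVIDIA driver with the documented minimum.
MIN_DRIVER=525.60.13

# version_ge A B -> succeeds when version A >= version B.
# Sorts both versions and checks that B comes first (or they are equal).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if ! command -v nvidia-smi >/dev/null 2>&1; then
  echo "nvidia-smi not found: no NVIDIA driver detected, the app will use the CPU"
else
  driver=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader 2>/dev/null | head -n 1)
  if [ -z "$driver" ]; then
    echo "nvidia-smi is installed but could not query the driver"
  elif version_ge "$driver" "$MIN_DRIVER"; then
    echo "Driver $driver meets the minimum ($MIN_DRIVER)"
  else
    echo "Driver $driver is older than $MIN_DRIVER; update it"
  fi
fi
```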
Once installation is complete:
- Read the User Guide for detailed usage instructions
- Try your first transcription
- Explore AI text processing with Ollama
If you encounter issues not covered here:
- Check USER_GUIDE.md - Troubleshooting
- Email: voicetransor@gmail.com
Happy transcribing! 🎉