Benchmark: FunASR vs Whisper — Real-World Performance on Chinese Meeting Audio #2947
LauraGPT started this conversation in Show and tell
TL;DR
Test Setup
Key Findings
1. Speed: FunASR is 13x faster than Whisper on GPU
SenseVoice-Small processes 10 seconds of audio in just 70 ms. On CPU, FunASR models run at 17x real time, which is faster than Whisper runs on GPU.
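To put these figures on a common scale, here is a small sketch that converts them into a real-time speedup (seconds of audio per second of compute). The helper name `realtime_speedup` and the one-hour meeting length are illustrative assumptions; only the 10 s / 70 ms and 17x numbers come from the results above.

```python
def realtime_speedup(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# SenseVoice-Small on GPU: 10 s of audio in 70 ms (figure quoted above).
gpu_speedup = realtime_speedup(10.0, 0.070)

# FunASR on CPU: quoted directly as 17x real time.
cpu_speedup = 17.0

# Hypothetical one-hour meeting, used only to make the numbers concrete.
meeting_seconds = 60 * 60
gpu_time = meeting_seconds / gpu_speedup
cpu_time = meeting_seconds / cpu_speedup

print(f"GPU speedup: {gpu_speedup:.0f}x real time")
print(f"One-hour meeting: {gpu_time:.1f} s on GPU, {cpu_time:.1f} s on CPU")
```

This only rescales the quoted numbers. Actual throughput on long recordings also depends on batching, VAD segmentation, and I/O.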
2. Chinese Accuracy: FunASR wins on dialects and accents
FunASR supports 7 Chinese dialects (Wu, Cantonese, Min, Hakka, Gan, Xiang, Jin) and 26 regional accents. Whisper struggles with non-Mandarin Chinese.
3. All-in-one Pipeline: No assembly required
With Whisper, you need to combine 5 separate projects for a production pipeline:
FunASR does all this in one API call:
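A minimal sketch of what such a single call might look like, assuming the `AutoModel` interface and the model names (`paraformer-zh`, `fsmn-vad`, `ct-punc`, `cam++`) documented in the FunASR README. The stage selection, the `transcribe_meeting` helper, and the `meeting.wav` path are illustrative assumptions, not the configuration used in this benchmark.

```python
# Assumed stage-to-model mapping, taken from the FunASR README examples.
PIPELINE_MODELS = {
    "model": "paraformer-zh",  # speech recognition
    "vad_model": "fsmn-vad",   # voice activity detection
    "punc_model": "ct-punc",   # punctuation restoration
    "spk_model": "cam++",      # speaker diarization
}


def transcribe_meeting(audio_path: str):
    """Run VAD, ASR, punctuation, and speaker labeling in one generate() call."""
    # Imported lazily so this module loads even where funasr is not installed.
    from funasr import AutoModel

    model = AutoModel(**PIPELINE_MODELS)
    # batch_size_s is the dynamic batch size in seconds of audio.
    return model.generate(input=audio_path, batch_size_s=300)


if __name__ == "__main__":
    # "meeting.wav" is a placeholder path; models are downloaded on first use.
    for segment in transcribe_meeting("meeting.wav"):
        print(segment.get("text", ""))
```

Keeping the model names in one dictionary makes it easy to swap a single stage without touching the call site.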
4. OpenAI-Compatible API: Zero-code migration
If you are already using the OpenAI Whisper API or any compatible client, switching to FunASR requires no code changes beyond pointing the client at your own server:
pip install funasr vllm fastapi uvicorn python-multipart
funasr-server --device cuda
# Now POST to http://localhost:8000/v1/audio/transcriptions
5. Self-hosted & Free
How to Try
Links