| 1 | +<!DOCTYPE html> |
| 2 | +<html lang="en"><head><meta charset="UTF-8"><meta name="viewport" content="width=device-width,initial-scale=1.0"> |
| 3 | +<meta name="description" content="Choose the right FunASR path for private speech APIs, agent voice input, streaming ASR, vLLM acceleration, subtitles, batch transcription, and benchmarks."> |
| 4 | +<meta property="og:title" content="FunASR Use Cases"> |
| 5 | +<meta property="og:description" content="A practical route map for deploying FunASR in products, agents, streaming services, and benchmark-driven migrations."> |
| 6 | +<meta property="og:type" content="website"> |
| 7 | +<meta property="og:url" content="https://modelscope.github.io/FunASR/use-cases.html"> |
| 8 | +<title>FunASR Use Cases</title> |
| 9 | +<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600;700;800&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet"> |
| 10 | +<link rel="stylesheet" href="style.css"> |
| 11 | +</head><body> |
| 12 | +<nav class="nav"><div class="container"> |
| 13 | +<a href="index.html" class="nav-logo">FunASR</a> |
| 14 | +<div class="nav-links"><a href="index.html">Home</a><a href="tutorial.html">Tutorial</a><a href="training.html">Training</a><a href="model-registration.html">Develop</a><a href="api.html">API</a><a href="vllm.html">vLLM</a><a href="agent.html">Agent</a><a href="benchmark.html">Benchmark</a></div> |
| 15 | +<div class="lang-dropdown"><button class="lang-btn">English</button><div class="lang-menu"><a href="use-cases.html" class="current">English</a><a href="zh/use-cases.html">中文</a><a href="ja/index.html">日本語</a></div></div> |
| 16 | +<a href="https://github.com/modelscope/FunASR" class="nav-github">GitHub</a> |
| 17 | +</div></nav> |
| 18 | +<div class="content"><div class="container narrow"> |
| 19 | +<h1>Use Cases</h1> |
| 20 | +<p>Pick the shortest path from evaluation to production. FunASR covers local transcription, private OpenAI-compatible APIs, agent voice input, streaming services, vLLM acceleration, subtitles, and batch processing.</p> |
| 21 | +<div class="toc-grid"><a href="#paths">Choose a path</a><a href="#recipes">Production recipes</a><a href="#models">Model hints</a><a href="#share">Share results</a></div> |
| 22 | +<section id="paths"><h2>Choose the right path</h2> |
| 23 | +<table><tr><th>Goal</th><th>Start here</th><th>Why it matters</th></tr> |
| 24 | +<tr><td>Transcribe one file locally</td><td><a href="tutorial.html">Tutorial</a></td><td>Verify install and model download in minutes.</td></tr> |
| 25 | +<tr><td>Compare accuracy and speed</td><td><a href="benchmark.html">Benchmark report</a></td><td>Review long-audio speed and character error rate (CER) before choosing a model.</td></tr> |
| 26 | +<tr><td>Build a private speech API</td><td><a href="agent.html#server">OpenAI-compatible API</a></td><td>Reuse OpenAI-style clients without sending audio to a cloud ASR provider.</td></tr> |
| 27 | +<tr><td>Add speech input to agents</td><td><a href="agent.html#mcp">MCP server</a></td><td>Connect local ASR to Claude, Cursor, desktop tools, and internal assistants.</td></tr> |
| 28 | +<tr><td>Serve streaming ASR</td><td><a href="tutorial.html#real-time-speech-recognition">Realtime examples</a></td><td>Handle live captioning, meetings, and call-center-style workloads.</td></tr> |
| 29 | +<tr><td>Accelerate LLM-based ASR</td><td><a href="vllm.html">vLLM guide</a></td><td>Use tensor-parallel decoding and streaming service support for Fun-ASR-Nano.</td></tr> |
| 30 | +<tr><td>Generate subtitles</td><td><a href="agent.html#subtitle">Subtitle generator</a></td><td>Create SRT/VTT files from audio or video, with speaker labels when needed.</td></tr> |
| 31 | +<tr><td>Process many recordings</td><td><a href="https://github.com/modelscope/FunASR/blob/main/examples/batch_asr_improved.py">Batch ASR example</a></td><td>Build repeatable offline jobs for archives, meetings, and datasets.</td></tr></table> |
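For the subtitle path, the core step is turning timed segments into SRT cues. The sketch below is illustrative only: `Segment`, `srt_timestamp`, and `to_srt` are hypothetical names rather than FunASR's API, and millisecond timestamps are an assumption about how segments arrive from your pipeline.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Segment:
    """Hypothetical timed segment; not a FunASR type."""
    start_ms: int
    end_ms: int
    text: str
    speaker: Optional[str] = None


def srt_timestamp(ms: int) -> str:
    """Format milliseconds as the SRT timestamp HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments: Iterable[Segment]) -> str:
    """Render segments as numbered SRT cues, prefixing a speaker label if present."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        line = f"{seg.speaker}: {seg.text}" if seg.speaker else seg.text
        cues.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start_ms)} --> {srt_timestamp(seg.end_ms)}\n"
            f"{line}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(cues)
```

Mapping the recognizer's actual output into `Segment` objects depends on the model and options you run, so treat that step as something to verify against your own results.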
| 32 | +</section> |
| 33 | +<section id="recipes"><h2>Production recipes</h2> |
| 34 | +<div class="grid-2"><div class="card"><h3>Private transcription API</h3><p>Use this path when an application already speaks OpenAI-style APIs or when audio cannot leave your environment.</p><pre><code>pip install funasr fastapi uvicorn python-multipart |
| 35 | +funasr-server --model sensevoice --device cuda |
| 36 | + |
| 37 | +curl http://localhost:8000/v1/audio/transcriptions \ |
| 38 | + -F file=@sample.wav \ |
| 39 | + -F model=sensevoice \ |
| 40 | + -F response_format=verbose_json</code></pre></div> |
| 41 | +<div class="card"><h3>Agent speech input</h3><p>Start from the MCP server when you want to talk to coding agents, internal assistants, or workflow tools.</p><pre><code>pip install funasr |
| 42 | +python examples/mcp_server/funasr_mcp.py |
| 43 | + |
| 44 | +# Set FUNASR_DEVICE=cuda for GPU inference</code></pre></div></div> |
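The curl request in the private transcription API recipe can also be issued from Python with only the standard library. This is a minimal sketch, assuming the server from that recipe is listening on `localhost:8000` and replies with JSON; `build_multipart` and `transcribe` are hypothetical helper names, and the response fields are not assumed here.

```python
import json
import os
import urllib.request
import uuid
from typing import Dict, Optional, Tuple


def build_multipart(fields: Dict[str, str], filename: str, audio: bytes,
                    boundary: Optional[str] = None) -> Tuple[bytes, str]:
    """Encode text fields plus one `file` part as multipart/form-data."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        (f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n'
         "Content-Type: application/octet-stream\r\n\r\n").encode()
        + audio + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str,
               url: str = "http://localhost:8000/v1/audio/transcriptions",
               model: str = "sensevoice") -> dict:
    """POST an audio file to the local server and return the parsed JSON reply."""
    with open(path, "rb") as handle:
        audio = handle.read()
    body, content_type = build_multipart(
        {"model": model, "response_format": "verbose_json"},
        os.path.basename(path),
        audio,
    )
    # Supplying `data` makes urllib send a POST request.
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `transcribe("sample.wav")` mirrors the curl command. An existing OpenAI-style client pointed at the same base URL should work as well if your application already depends on one.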
| 45 | +<div class="grid-2"><div class="card"><h3>Streaming workloads</h3><p>Pair ASR with VAD, punctuation, and speaker diarization when partial transcripts need to be readable by humans.</p><p>Validate with real audio that includes background noise, long silences, overlapping speakers, and varying microphone quality.</p></div> |
| 46 | +<div class="card"><h3>Benchmark before migration</h3><p>Compare FunASR against Whisper or cloud ASR using your own sample set. Track throughput, CPU viability, download size, and deployment complexity together.</p><p><a href="benchmark.html">Open the public benchmark report</a></p></div></div> |
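When comparing engines on your own sample set, two numbers are easy to compute yourself: character error rate and real-time factor. The sketch below is one common way to calculate them, not the method used in the public benchmark report; the function names are hypothetical, and it assumes reference and hypothesis text have already been normalized (for example punctuation and whitespace handled consistently).

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(hyp) + 1))
    for i, ref_char in enumerate(ref, start=1):
        current = [i]
        for j, hyp_char in enumerate(hyp, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time per second of audio; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds
```

Running both functions over the same recordings for each candidate engine keeps the accuracy and speed comparison on equal footing.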
| 47 | +</section> |
| 48 | +<section id="models"><h2>Model selection hints</h2> |
| 49 | +<table><tr><th>Need</th><th>Good first choice</th><th>Notes</th></tr> |
| 50 | +<tr><td>Fast multilingual transcription</td><td>SenseVoice-Small</td><td>Strong default for local demos and private APIs.</td></tr> |
| 51 | +<tr><td>Mandarin production ASR</td><td>Paraformer-Large</td><td>Mature choice for Chinese speech recognition.</td></tr> |
| 52 | +<tr><td>LLM-based ASR experiments</td><td>Fun-ASR-Nano</td><td>Pair with <a href="vllm.html">vLLM</a> when throughput matters.</td></tr> |
| 53 | +<tr><td>Speaker-aware transcripts</td><td>SenseVoice or Paraformer with <code>spk_model="cam++"</code></td><td>Useful for meetings, interviews, and customer calls.</td></tr> |
| 54 | +<tr><td>Live audio</td><td>Runtime WebSocket service</td><td>Validate chunking, VAD, and endpointing with real traffic.</td></tr></table> |
| 55 | +</section> |
| 56 | +<section id="share"><h2>Share your result</h2> |
| 57 | +<p>If FunASR works well in your project, share the use case, model, device, processing speed, audio domain, and a public demo or benchmark summary when possible.</p> |
| 58 | +<p><a href="https://github.com/modelscope/FunASR/issues">Open an issue</a> or <a href="https://github.com/modelscope/FunASR/discussions">start a discussion</a>. Concrete usage reports help new users choose the right path and help maintainers prioritize docs and examples.</p> |
| 59 | +</section> |
| 60 | +</div></div> |
| 61 | +<footer><p>FunASR · Tongyi Lab, Alibaba Group</p><p><a href="index.html">Home</a> · <a href="use-cases.html">Use Cases</a> · <a href="agent.html">Agent</a> · <a href="benchmark.html">Benchmark</a> · <a href="vllm.html">vLLM</a> · <a href="https://github.com/modelscope/FunASR">GitHub</a></p></footer> |
| 62 | +</body></html> |