
Conversation

@rafallezanko (Contributor) commented Jan 19, 2026

https://learn.microsoft.com/en-us/javascript/api/microsoft-cognitiveservices-speech-sdk/propertyid?view=azure-node-latest

SpeechServiceResponse_PostProcessingOption = 39 — A string value specifying which post-processing option should be used by the service. Allowed values: "TrueText". Added in version 1.7.0.

Summary by CodeRabbit

  • New Features
    • Added an optional TrueText post-processing toggle for the Azure speech-to-text plugin. Disabled by default; when enabled per instance at initialization, transcriptions receive TrueText formatting/cleanup to improve readability and accuracy.


coderabbitai bot (Contributor) commented Jan 19, 2026

📝 Walkthrough

Walkthrough

Adds a boolean true_text_post_processing option to the Azure STT plugin: exposed on STT.__init__, stored in STTOptions, and applied in _create_speech_recognizer() by setting the SpeechConfig post-processing option to "TrueText" when enabled.

Changes

Cohort / File(s) Summary
Azure STT plugin
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
Added true_text_post_processing: bool = False to STTOptions; added true_text_post_processing parameter to STT.__init__ and propagated it into STTOptions; updated _create_speech_recognizer() to set SpeechConfig's post-processing option to "TrueText" when the flag is truthy.
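The plumbing described above can be sketched in stdlib-only Python. This is a hedged illustration, not the PR's actual code: `STTOptions`, `true_text_post_processing`, and the conditional `set_property` call mirror the change summary, while `FakeSpeechConfig` and `create_speech_config` are hypothetical stand-ins for `speechsdk.SpeechConfig` and `_create_speech_recognizer()`.

```python
from dataclasses import dataclass

# Numeric ID of SpeechServiceResponse_PostProcessingOption (4003 per the review
# below); used here in place of the real speechsdk.PropertyId enum.
POST_PROCESSING_OPTION = 4003


@dataclass
class STTOptions:
    # New flag from this PR; disabled by default for backward compatibility.
    true_text_post_processing: bool = False


class FakeSpeechConfig:
    """Hypothetical stand-in for speechsdk.SpeechConfig that records properties."""

    def __init__(self) -> None:
        self.properties: dict = {}

    def set_property(self, property_id: int, value: str) -> None:
        self.properties[property_id] = value


def create_speech_config(opts: STTOptions) -> FakeSpeechConfig:
    # Mirrors the conditional in _create_speech_recognizer(): the post-processing
    # option is only set when the flag is truthy.
    config = FakeSpeechConfig()
    if opts.true_text_post_processing:
        config.set_property(POST_PROCESSING_OPTION, "TrueText")
    return config


print(create_speech_config(STTOptions(true_text_post_processing=True)).properties)
# → {4003: 'TrueText'}
```

With the flag left at its default, no property is set, which is what makes the addition backward compatible.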

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Poem

🐇 I added a switch with a joyful hop,
TrueText whispers and cleans each drop.
A tiny flag to guide the stream,
Words made tidy, neat, and gleam. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 50.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The PR title accurately and specifically describes the main change: adding the TrueText post-processing option to Azure STTOptions.


📜 Recent review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f363d24 and c09cfdd.

📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: unit-tests



Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (1)

407-413: Use the correct PropertyId enum: SpeechServiceResponse_PostProcessingOption instead of PostProcessingOption.

The enum speechsdk.enums.PropertyId.PostProcessingOption does not exist in the Azure Speech SDK (1.43.0+). The correct enum name is SpeechServiceResponse_PostProcessingOption (ID: 4003). The current code will raise an AttributeError at runtime.

Fix
-        speech_config.set_property(speechsdk.enums.PropertyId.PostProcessingOption, "TrueText")
+        speech_config.set_property(
+            speechsdk.enums.PropertyId.SpeechServiceResponse_PostProcessingOption,
+            "TrueText",
+        )
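Because a wrong member name only fails at recognizer-creation time with an `AttributeError`, one defensive pattern (a sketch of my own, not from the PR) is to resolve the enum member by name and fail with a clear message. `PropertyId` below is a hypothetical stand-in for `speechsdk.PropertyId`, reduced to the one member relevant here:

```python
from enum import Enum


class PropertyId(Enum):
    # Hypothetical subset of speechsdk.PropertyId; 4003 is the ID cited in the review.
    SpeechServiceResponse_PostProcessingOption = 4003


def resolve_property(name: str) -> PropertyId:
    # Look the member up by name so a typo raises a descriptive error
    # instead of an AttributeError deep inside recognizer setup.
    member = getattr(PropertyId, name, None)
    if member is None:
        raise ValueError(f"unknown PropertyId member: {name!r}")
    return member


print(resolve_property("SpeechServiceResponse_PostProcessingOption").value)  # → 4003
```

With this, the wrong name from the original diff ("PostProcessingOption") would raise a `ValueError` naming the bad member.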
🧹 Nitpick comments (1)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (1)

83-102: Document the new option in the constructor docstring.

The new public parameter isn’t described yet, which makes the API harder to discover.

📌 Suggested docstring update
@@
         Args:
             phrase_list: List of words or phrases to boost recognition accuracy.
                         Azure will give higher priority to these phrases during recognition.
             explicit_punctuation: Controls punctuation behavior. If True, enables explicit punctuation mode
                         where punctuation marks are added explicitly. If False (default), uses Azure's
                         default punctuation behavior.
+            true_text_post_processing: Enables Azure "TrueText" post-processing in the recognition result.

As per coding guidelines, maintain Google-style docstrings for public APIs.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0722371 and bdf13ad.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py

📄 CodeRabbit inference engine (AGENTS.md)

**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings

Files:

  • livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: type-check (3.13)
  • GitHub Check: type-check (3.9)
  • GitHub Check: unit-tests
🔇 Additional comments (2)
livekit-plugins/livekit-plugins-azure/livekit/plugins/azure/stt.py (2)

41-62: LGTM: default-off option is sensible.

The new true_text_post_processing field is a safe, backward-compatible addition.


141-157: LGTM: option is correctly propagated into STTOptions.

Wiring looks consistent with the other options.


@rafallezanko force-pushed the main branch 2 times, most recently from fcd3953 to f4346aa (January 19, 2026 15:49)
@rafallezanko (Contributor, Author) commented:

Hi @chenghao-mou, can you take a look at the PR? Thank you in advance :)

@chenghao-mou chenghao-mou self-assigned this Jan 20, 2026
@chenghao-mou (Member) commented:

/test-stt

@github-actions (Contributor) commented:

STT Test Results

Status: ✗ Some tests failed

✓ Passed: 22
✗ Failed: 1
× Errors: 1
→ Skipped: 15
▣ Total: 39
⏱ Duration: 240.1s
Failed Tests
  • tests.test_stt::test_stream[livekit.agents.inference]
    stt_factory = <function parameter_factory.<locals>.<lambda> at 0x7fcd92109440>
    request = <FixtureRequest for <Coroutine test_stream[livekit.agents.inference]>>
    
        @pytest.mark.usefixtures("job_process")
        @pytest.mark.parametrize("stt_factory", STTs)
        async def test_stream(stt_factory: Callable[[], STT], request):
            sample_rate = SAMPLE_RATE
            plugin_id = request.node.callspec.id.split("-")[0]
            frames, transcript, _ = await make_test_speech(chunk_duration_ms=10, sample_rate=sample_rate)
      
            # TODO: differentiate missing key vs other errors
            try:
                stt_instance: STT = stt_factory()
            except ValueError as e:
                pytest.skip(f"{plugin_id}: {e}")
      
            async with stt_instance as stt:
                label = f"{stt.model}@{stt.provider}"
                if not stt.capabilities.streaming:
                    pytest.skip(f"{label} does not support streaming")
      
                for attempt in range(MAX_RETRIES):
                    try:
                        state = {"closing": False}
      
                        async def _stream_input(
                            frames: list[rtc.AudioFrame], stream: RecognizeStream, state: dict = state
                        ):
                            for frame in frames:
                                stream.push_frame(frame)
                                await asyncio.sleep(0.005)
      
                            stream.end_input()
                            state["closing"] = True
      
                        async def _stream_output(stream: RecognizeStream, state: dict = state):
                            text = ""
                            # make sure the events are sent in the right order
                            recv_start, recv_end = False, True
                            start_time = time.time()
                            got_final_transcript = False
      
                            async for event in stream:
                                if event.type == agents.stt.SpeechEventType.START_OF_SPEECH:
    
  • tests.test_stt::test_stream[livekit.plugins.aws]
    def finalizer() -> None:
            """Yield again, to finalize."""
      
            async def async_finalizer() -> None:
                try:
                    await gen_obj.__anext__()  # type: ignore[union-attr]
                except StopAsyncIteration:
                    pass
                else:
                    msg = "Async generator fixture didn't stop."
                    msg += "Yield only once."
                    raise ValueError(msg)
      
            task = _create_task_in_context(event_loop, async_finalizer(), context)
    >       event_loop.run_until_complete(task)
    
    .venv/lib/python3.12/site-packages/pytest_asyncio/plugin.py:347: 
    _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
    
    self = <_UnixSelectorEventLoop running=False closed=True debug=False>
    future = <Task finished name='Task-122' coro=<_wrap_asyncgen_fixture.<locals>._asyncgen_fixture_wrapper.<locals>.finalizer.<loc... File "/home/runner/work/agents/agents/.venv/lib/python3.12/site-packages/smithy_http/aio/crt.py", line 104, in chunks>
    
        def run_until_complete(self, future):
            """Run until the Future is done.
      
            If the argument is a coroutine, it is wrapped in a Task.
      
            WARNING: It would be disastrous to call run_until_complete()
            with the same coroutine twice -- it would wrap it in two
            different Tasks and that can't be good.
      
            Return the Future's result, or raise its exception.
            """
            self._check_closed()
            self._check_running()
      
            new_task = not futures.isfuture(future)
            future = tasks.ensure_future(future, loop=self)
            if new_task:
                # An exception is raised if the future didn't complete, so there
                # is no need to log the "destroy pending task" message
                future._log_destroy_pending = False
      
            future.add_done_callback(_run_until_complete_cb)
            try:
                self.run_forever()
            except:
                if new_task and future.done() and not future.canc
    
Skipped Tests (15)
tests.test_stt::test_recognize[livekit.plugins.assemblyai] universal-streaming-english@AssemblyAI does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.speechmatics] unknown@Speechmatics does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.fireworksai] unknown@FireworksAI does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.nvidia] unknown@unknown does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.aws] unknown@Amazon Transcribe does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.cartesia] ink-whisper@Cartesia does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.soniox] stt-rt-v3@Soniox does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.deepgram.STTv2] flux-general-en@Deepgram does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.gradium.STT] unknown@Gradium does not support batch recognition
tests.test_stt::test_recognize[livekit.agents.inference] unknown@livekit does not support batch recognition
tests.test_stt::test_recognize[livekit.plugins.azure] unknown@Azure STT does not support batch recognition
tests.test_stt::test_stream[livekit.plugins.elevenlabs] scribe_v1@ElevenLabs does not support streaming
tests.test_stt::test_stream[livekit.plugins.fal] Wizper@Fal does not support streaming
tests.test_stt::test_stream[livekit.plugins.mistralai] voxtral-mini-latest@MistralAI does not support streaming
tests.test_stt::test_stream[livekit.plugins.openai] [email protected] does not support streaming

Triggered by workflow run #378

@chenghao-mou (Member) left a comment:

LGTM. Confirmed it removed disfluencies when enabled.

But the uv.lock change seems unnecessary; we should drop it.

@rafallezanko
Copy link
Contributor Author

Good, the uv.lock change was already merged into the master branch; I've synced my branch and the change is gone.
