Add Intelligent Interruption Handling #4543
base: main
Added detailed documentation for the interruption filter feature, including usage examples, configuration options, and implementation details.
Added a comprehensive proof of functionality for the interruption filter, including test execution logs, results, scenarios, and evaluation criteria.
Update test script name in README.
This script tests the functionality of the InterruptionFilter class by simulating various user inputs and agent states. It includes multiple test cases to validate the filter's behavior in different scenarios.
Added interruption filter to manage interruptions during speech.
Implements an interruption filter to manage user interruptions based on agent speaking status and predefined ignore words.
Add unit and integration tests for InterruptionFilter class
Refactor tests for InterruptionFilter class to use pytest framework and remove direct execution of interruption_filter.py.
Refactor InterruptionFilter to use set instead of Set for ignore words and improve logging.
Refactor InterruptionFilter class and its tests to improve structure and logging.
Add check for normalized phrase in ignore words list
📝 Walkthrough
Adds an InterruptionFilter, integrates it into voice agent interruption logic, exposes session options to configure ignore words and enablement, updates public exports, and delivers tests, docs, and examples for backchannel filtering.
Sequence Diagram(s)
sequenceDiagram
actor User
participant AudioActivity
participant InterruptionFilter
participant AgentActivity
User->>AudioActivity: speak (audio captured)
AudioActivity->>AudioActivity: transcribe -> transcript
AudioActivity->>AgentActivity: query agent speaking state
AgentActivity-->>AudioActivity: agent_is_speaking
alt agent_is_speaking == true
AudioActivity->>InterruptionFilter: should_ignore_interruption(transcript, true)
alt Filter returns true
InterruptionFilter-->>AudioActivity: ignore (rgba(0,128,0,0.5))
AudioActivity->>AgentActivity: skip interruption (log ignored)
else Filter returns false
InterruptionFilter-->>AudioActivity: allow (rgba(255,165,0,0.5))
AudioActivity->>AgentActivity: start user activity / interrupt
end
else agent_is_speaking == false
AudioActivity->>InterruptionFilter: should_ignore_interruption(transcript, false)
InterruptionFilter-->>AudioActivity: allow (rgba(0,0,255,0.5))
AudioActivity->>AgentActivity: start user activity / interrupt
end
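The flow in the diagram can be sketched as a small decision function. This is a minimal sketch rather than the PR's implementation: `IGNORE_WORDS` is an abbreviated stand-in for the default set, and `handle_transcript` is a hypothetical caller standing in for the audio activity handler.

```python
from __future__ import annotations

# Abbreviated stand-in for the PR's default ignore set (assumption).
IGNORE_WORDS = frozenset({"yeah", "ok", "okay", "hmm", "right"})


def should_ignore_interruption(transcript: str, agent_is_speaking: bool) -> bool:
    """Return True only for backchanneling heard while the agent is speaking."""
    if not agent_is_speaking:
        return False  # the agent is silent, so the user turn is always allowed
    words = transcript.lower().split()
    # An empty transcript is never treated as backchanneling.
    return bool(words) and all(word in IGNORE_WORDS for word in words)


def handle_transcript(transcript: str, agent_is_speaking: bool) -> str:
    """Hypothetical caller mirroring the two outcomes shown in the diagram."""
    if should_ignore_interruption(transcript, agent_is_speaking):
        return "skip interruption"
    return "start user activity / interrupt"
```

The filter is only consulted for its "ignore" answer; every other path falls through to the existing interruption behavior.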
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 7
🤖 Fix all issues with AI agents
In `@FILTER_README.md`:
- Around line 99-101: The README incorrectly references interruption_filter.py;
update the documentation to point to the actual implementation file filter.py
and ensure the integration reference remains correct by keeping the mention of
agent_activity.py and its _interrupt_by_audio_activity() method; replace
"interruption_filter.py" with "filter.py" in the Implementation Details section
so readers can locate the correct module.
In `@livekit-agents/livekit/agents/voice/agent_activity.py`:
- Line 68: The import in agent_activity.py is incorrect: it tries to import
InterruptionFilter from livekit.agents.voice.interruption_filter but the module
is named filter.py; update the import statement in agent_activity.py to import
InterruptionFilter from the correct module (e.g., replace the current "from
.interruption_filter import InterruptionFilter" with an import from ".filter" so
that the InterruptionFilter symbol is resolved).
In `@livekit-agents/livekit/agents/voice/agent_session.py`:
- Around line 165-166: Update the docstring for the constructor/method in
livekit/agents/voice/agent_session.py (the function with parameters
interruption_filter_enabled and interruption_ignore_words) by adding Args
entries describing these two new parameters: explain interruption_filter_enabled
(bool) as whether the interruption filter that ignores backchanneling words is
enabled (default True) and interruption_ignore_words (list[str], optional) as a
custom list of backchanneling words to ignore (None uses the default set); use
the suggested wording provided in the review to match style and place them in
the existing Args section alongside the other parameter docs.
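As an illustration of the requested Google-style `Args` entries, here is a minimal sketch. `make_session_options` is a hypothetical helper invented for this example, not the real constructor; only the two parameter names, their types, and their defaults come from the review comment.

```python
from __future__ import annotations


def make_session_options(
    interruption_filter_enabled: bool = True,
    interruption_ignore_words: list[str] | None = None,
) -> dict[str, object]:
    """Collect the filter-related session options (hypothetical helper).

    Args:
        interruption_filter_enabled: Whether the interruption filter that ignores
            backchanneling words is enabled. Defaults to True.
        interruption_ignore_words: Custom list of backchanneling words to ignore.
            None uses the default set.

    Returns:
        A mapping holding the two option values.
    """
    return {
        "interruption_filter_enabled": interruption_filter_enabled,
        "interruption_ignore_words": interruption_ignore_words,
    }
```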
In `@livekit-agents/livekit/agents/voice/filter.py`:
- Around line 95-129: The _is_backchanneling function currently returns True
unconditionally after checking normalized_phrase, making the subsequent per-word
loop unreachable and causing all non-empty text to be treated as backchanneling;
fix it by changing the control flow so that you only return True immediately
when normalized_phrase is in self._ignore_words (keep the logger call),
otherwise continue to the for word in words loop to check each word against
self._ignore_words, returning False if any word is not in the ignore set and
returning True only after the loop if all words are ignored (also keep the final
logger call); reference symbols: _is_backchanneling, normalized_phrase, words,
self._ignore_words, logger.
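The corrected control flow described above might look like the following. This is a sketch under assumptions: it is written as a free function with an abbreviated ignore set, whereas the PR's version is a method on `InterruptionFilter`, and the log messages are illustrative.

```python
from __future__ import annotations

import logging

logger = logging.getLogger(__name__)

# Abbreviated stand-in for self._ignore_words (assumption).
IGNORE_WORDS = frozenset({"yeah", "ok", "okay", "got it"})


def is_backchanneling(text: str, ignore_words: frozenset[str] = IGNORE_WORDS) -> bool:
    """Return True if the whole phrase, or every word in it, is an ignore word."""
    normalized_phrase = " ".join(text.lower().split())
    if not normalized_phrase:
        return False
    # Return early only when the full phrase matches.
    if normalized_phrase in ignore_words:
        logger.debug("Backchanneling phrase detected: %r", text)
        return True
    # Otherwise fall through to the per-word check, which is now reachable.
    for word in normalized_phrase.split():
        if word not in ignore_words:
            return False
    logger.debug("All words are backchanneling: %r", text)
    return True
```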
In `@test.py`:
- Line 1: Remove the unused top-level import statement "import os" from the file
(the unused symbol 'os') so the module no longer imports an unused dependency;
after removal, run the project's linter/formatter to ensure no remaining import
warnings.
In `@tests/filter.py`:
- Around line 17-37: DEFAULT_IGNORE_WORDS contains multi-word entries (e.g.,
"got it") that _is_backchanneling currently won't match because it splits input
into individual tokens; update either the data or the matching logic so phrases
are detected. Two concise fixes: (1) Replace multi-word entries in
DEFAULT_IGNORE_WORDS (and the similar set referenced later) with their component
words (e.g., "got", "it") so single-token checks in _is_backchanneling work; or
(2) enhance _is_backchanneling to also check for multi-word phrases by
constructing n-grams from the tokenized input and comparing joined n-grams
against DEFAULT_IGNORE_WORDS (ensure this logic is used wherever
_is_backchanneling is called).
- Around line 1-6: The _is_backchanneling method currently returns True
prematurely after joining words, making the subsequent per-word validation
unreachable; update livekit-agents/livekit/agents/voice/filter.py's
_is_backchanneling to mirror the correct implementation in tests/filter.py by
iterating over split words (use the same backchannel_terms set/lookup) and only
return True if every word is a backchannel term, returning False immediately if
any word is not; after fixing, remove the duplicate implementation from
tests/filter.py unless it's intentionally used as a unit-test fixture—if it is,
keep it but ensure it matches the corrected logic in the production
_is_backchanneling function.
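The n-gram option mentioned above can be sketched as a greedy, longest-match scan over the tokens. This is one possible reading of the suggestion, not code from the PR: `IGNORE_PHRASES` and `all_tokens_ignorable` are hypothetical names, and the greedy strategy is an assumption.

```python
from __future__ import annotations

# Stand-in set mixing single words and multi-word phrases (assumption).
IGNORE_PHRASES = frozenset({"yeah", "ok", "got it", "uh huh"})


def all_tokens_ignorable(text: str, phrases: frozenset[str] = IGNORE_PHRASES) -> bool:
    """Return True if the input can be covered entirely by ignore phrases."""
    tokens = text.lower().split()
    if not tokens or not phrases:
        return False
    max_len = max(len(phrase.split()) for phrase in phrases)
    i = 0
    while i < len(tokens):
        # Try the longest n-gram starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i : i + n]) in phrases:
                i += n
                break
        else:
            return False  # no phrase starts at this token
    return True
```

Because the scan is greedy, an overlapping phrase set could in principle reject an input that a different segmentation would accept; for a short list of backchannel phrases that trade-off is probably acceptable.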
🧹 Nitpick comments (2)
tests/filter.py (2)
39-70: Consider adding Google-style docstring for `__init__`.
Per coding guidelines, Google-style docstrings should be used. The `__init__` method lacks documentation for its parameters.
📝 Suggested docstring
def __init__(
    self,
    ignore_words: list[str] | None = None,
    enabled: bool = True,
    case_sensitive: bool = False,
) -> None:
    """Initialize the InterruptionFilter.

    Args:
        ignore_words: Custom list of words to ignore. If None, loads from
            LIVEKIT_INTERRUPTION_IGNORE_WORDS env var or uses DEFAULT_IGNORE_WORDS.
        enabled: Whether the filter is active. Defaults to True.
        case_sensitive: Whether word matching is case-sensitive. Defaults to False.
    """
103-108: Consider using a more robust punctuation stripping approach.
The current approach only removes `.`, `,`, `!`, and `?`. Other punctuation like `;`, `:`, `'`, `"`, etc., won't be stripped, which could cause mismatches (e.g., `"yeah;"` wouldn't match `"yeah"`).
♻️ Use string.punctuation for comprehensive stripping
+import string
+
 def _is_backchanneling(self, text: str) -> bool:
     if not text or not text.strip():
         return False
     normalized_text = text.strip()
     if not self._case_sensitive:
         normalized_text = normalized_text.lower()
-    normalized_text = (
-        normalized_text.replace(".", "")
-        .replace(",", "")
-        .replace("!", "")
-        .replace("?", "")
-    )
+    # Remove all punctuation
+    normalized_text = normalized_text.translate(
+        str.maketrans("", "", string.punctuation)
+    )
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
- FILTER_README.md
- examples/voice_agents/basic_agent.py
- livekit-agents/livekit/agents/voice/__init__.py
- livekit-agents/livekit/agents/voice/agent_activity.py
- livekit-agents/livekit/agents/voice/agent_session.py
- livekit-agents/livekit/agents/voice/filter.py
- proof.md
- test.py
- tests/filter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- test.py
- livekit-agents/livekit/agents/voice/agent_session.py
- tests/filter.py
- livekit-agents/livekit/agents/voice/filter.py
- livekit-agents/livekit/agents/voice/agent_activity.py
- livekit-agents/livekit/agents/voice/__init__.py
- examples/voice_agents/basic_agent.py
🪛 GitHub Actions: tests
livekit-agents/livekit/agents/voice/agent_activity.py
[error] 68-68: ModuleNotFoundError: No module named 'livekit.agents.voice.interruption_filter'
🪛 GitHub Check: ruff
test.py
[failure] 1-1: Ruff (F401)
test.py:1:8: F401 os imported but unused
🪛 LanguageTool
FILTER_README.md
[grammar] ~111-~111: Ensure spelling is correct
Context: ... Performance The filter adds less than 1ms of latency. It just does simple string ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🪛 markdownlint-cli2 (0.18.1)
proof.md
14-14: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
15-15: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
16-16: Unordered list indentation
Expected: 0; Actual: 2
(MD007, ul-indent)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: type-check (3.13)
- GitHub Check: type-check (3.9)
🔇 Additional comments (14)
livekit-agents/livekit/agents/voice/filter.py (1)
39-70: LGTM! The initialization logic is well-structured with proper priority ordering (parameter → environment variable → default) and consistent case normalization.
test.py (1)
47-60: Tests will fail due to implementation bug in filter.py. These tests (`test_real_interruptions` and `test_mixed_input`) expect non-backchannel words to return `False`, but due to the bug on line 116 in `filter.py`, all inputs will return `True` when the agent is speaking. Once the bug in `filter.py` is fixed, these tests should pass.
examples/voice_agents/basic_agent.py (1)
105-105: LGTM! The new `interruption_filter_enabled=True` option is correctly added to the session configuration alongside related interruption handling options.
proof.md (1)
42-48: Test results may be inconsistent with current implementation. The documented test results claim Test 3 (real interruption "stop") returns `False`, but the current implementation bug on line 116 in `filter.py` would cause this to return `True` instead. Please re-run tests after fixing the implementation bug to ensure results are accurate.
livekit-agents/livekit/agents/voice/agent_activity.py (2)
138-141: LGTM! The InterruptionFilter is correctly instantiated using session options for configuration, allowing per-session customization of ignore words and enable state.
1188-1202: LGTM! The integration logic correctly determines `agent_is_speaking` state and uses the filter to decide whether to ignore backchanneling interruptions. The debug logging provides good observability.
livekit-agents/livekit/agents/voice/agent_session.py (2)
93-94: LGTM! The new fields `interruption_filter_enabled` and `interruption_ignore_words` are correctly added to the `AgentSessionOptions` dataclass, following the existing pattern and using proper Python 3.9+ type annotations.
299-300: LGTM! The new interruption filter options are correctly propagated to `AgentSessionOptions`, following the established pattern for other session options.
livekit-agents/livekit/agents/voice/__init__.py (3)
28-66: LGTM! The reorganization of `__all__` with categorical comments improves readability and maintainability. `InterruptionFilter` is correctly added to the public exports.
72-75: LGTM! Good improvements: adding the type annotation for `__pdoc__` and using a more descriptive loop variable name `name` instead of `n`.
19-19: Verify the module path for InterruptionFilter import. The import on line 19 references `.interruption_filter`, but ensure this module file actually exists in the voice directory. If the implementation is instead in `filter.py`, the import statement should be corrected to `from .filter import InterruptionFilter`.
tests/filter.py (3)
82-93: LGTM! The `should_ignore_interruption` method has clear, correct logic: it only filters when enabled AND the agent is speaking AND the text is backchanneling. The early returns make the flow easy to follow.
126-141: LGTM! The mutation methods are well-implemented:
- `add_ignore_word` correctly normalizes before adding
- `remove_ignore_word` uses `discard()` which safely handles missing words
- `set_enabled` includes appropriate logging
143-149: LGTM! The `__repr__` method provides a clear, informative representation useful for debugging and logging.
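The mutation behavior praised in the comments above (normalize before adding, `discard()` on removal) can be sketched in isolation. `IgnoreWordSet` is a hypothetical, trimmed-down class written for this example; the PR's actual class is `InterruptionFilter`, whose full body is not shown on this page.

```python
from __future__ import annotations


class IgnoreWordSet:
    """Hypothetical holder for the filter's ignore words."""

    def __init__(self, words: list[str]) -> None:
        self._words = {self._normalize(word) for word in words}

    @staticmethod
    def _normalize(word: str) -> str:
        # Assumed normalization: trim whitespace and lowercase.
        return word.strip().lower()

    @property
    def words(self) -> set[str]:
        # Return a copy so callers cannot mutate internal state.
        return set(self._words)

    def add_ignore_word(self, word: str) -> None:
        self._words.add(self._normalize(word))

    def remove_ignore_word(self, word: str) -> None:
        # discard() does not raise KeyError when the word is absent.
        self._words.discard(self._normalize(word))
```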
# Default words that should be ignored when agent is speaking
DEFAULT_IGNORE_WORDS: set[str] = {
    "yeah",
    "ok",
    "okay",
    "hmm",
    "mhm",
    "mm-hmm",
    "uh-huh",
    "right",
    "aha",
    "ah",
    "oh",
    "sure",
    "yep",
    "yup",
    "gotcha",
    "got it",
    "alright",
    "cool",
}
Multi-word phrases in DEFAULT_IGNORE_WORDS won't match correctly.
The phrase "got it" (line 34) won't be recognized because _is_backchanneling splits the input text on whitespace and checks each word individually. When a user says "got it", it becomes ["got", "it"], and neither word exists in the ignore set (only "got it" as a single entry does).
🐛 Consider one of these fixes
Option 1: Remove multi-word phrases and add individual words
-    "got it",
+    "got",
+    "it",
Option 2: Add logic to check for multi-word phrases
 def _is_backchanneling(self, text: str) -> bool:
     if not text or not text.strip():
         return False
     normalized_text = text.strip()
     if not self._case_sensitive:
         normalized_text = normalized_text.lower()
     normalized_text = (
         normalized_text.replace(".", "")
         .replace(",", "")
         .replace("!", "")
         .replace("?", "")
     )
+    # Check if entire phrase matches first
+    if normalized_text in self._ignore_words:
+        logger.debug("Detected backchanneling phrase: '%s' - ignoring interruption", text)
+        return True
+
     words = normalized_text.split()
     if not words:
         return False
Also applies to: 95-124
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/voice/agent_activity.py`:
- Line 68: The relative import order is wrong: move the "from .filter import
InterruptionFilter" line so it is alphabetically placed among the other relative
imports (i.e., after the ".events" import and before the ".generation" import)
to satisfy import-sorting CI checks; update the import block in
agent_activity.py accordingly so InterruptionFilter is between .events and
.generation.
♻️ Duplicate comments (1)
livekit-agents/livekit/agents/voice/filter.py (1)
1-6: Critical: Missing `import string` causes runtime error.
The `string` module is used at line 104 (`string.punctuation`) but is not imported. This will cause a `NameError` when `_is_backchanneling()` is called, breaking the filter functionality.
🐛 Proposed fix
 from __future__ import annotations

 import logging
 import os
+import string

 logger = logging.getLogger(__name__)
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- FILTER_README.md
- livekit-agents/livekit/agents/voice/agent_activity.py
- livekit-agents/livekit/agents/voice/filter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- livekit-agents/livekit/agents/voice/agent_activity.py
- livekit-agents/livekit/agents/voice/filter.py
🪛 GitHub Actions: CI
livekit-agents/livekit/agents/voice/agent_activity.py
[error] 1-1: Ruff check failed. Import block is un-sorted or un-formatted. Command: 'uv run ruff check --output-format=github .'
livekit-agents/livekit/agents/voice/filter.py
[error] 104-104: Name 'string' is not defined [name-defined]
🪛 GitHub Check: ruff
livekit-agents/livekit/agents/voice/filter.py
[failure] 104-104: Ruff (F821)
livekit-agents/livekit/agents/voice/filter.py:104:35: F821 Undefined name string
🪛 LanguageTool
FILTER_README.md
[grammar] ~111-~111: Ensure spelling is correct
Context: ... Performance The filter adds less than 1ms of latency. It just does simple string ...
(QB_NEW_EN_ORTHOGRAPHY_ERROR_IDS_1)
🔇 Additional comments (7)
FILTER_README.md (1)
1-111: Documentation looks good. The README clearly explains the problem, solution, usage patterns, and implementation details. The file reference on line 101 now correctly points to `filter.py`.
livekit-agents/livekit/agents/voice/agent_activity.py (2)
138-141: Filter initialization looks correct. The `InterruptionFilter` is properly initialized with configuration from session options, following the established pattern for other session-level configurations.
1176-1202: Interruption filter integration is well-implemented. The logic correctly:
- Extracts the transcript from audio recognition
- Determines agent speaking state from current speech handle
- Uses the filter to decide whether to ignore backchanneling
- Returns early with debug logging when ignoring
The integration preserves existing behavior when the filter doesn't apply.
livekit-agents/livekit/agents/voice/filter.py (4)
9-37: Well-designed default word set. The `DEFAULT_IGNORE_WORDS` set covers common English backchanneling expressions including multi-word phrases. The class docstring clearly explains the intended behavior.
39-70: Initialization logic is well-structured. The configuration loading follows a sensible priority order (parameter → environment variable → defaults), with proper whitespace handling for the environment variable parsing and consistent case normalization.
95-126: Backchanneling detection logic is correct. The method properly:
- Handles empty/whitespace input
- Normalizes case and removes punctuation
- Checks both full phrases and individual words against the ignore list
- Returns appropriate results with debug logging
The past issue about unreachable code has been addressed.
128-151: Helper methods are well-implemented. The `add_ignore_word`, `remove_ignore_word`, and `set_enabled` methods provide a clean API for runtime configuration, with consistent case normalization and appropriate logging.
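The priority order noted for the initialization logic (parameter → environment variable → defaults) could be resolved roughly as follows. This is a sketch under assumptions: the variable name `LIVEKIT_INTERRUPTION_IGNORE_WORDS` appears in the suggested docstring, but the comma-separated format and the helper name `resolve_ignore_words` are guesses made for illustration.

```python
from __future__ import annotations

import os

# Abbreviated stand-in for DEFAULT_IGNORE_WORDS (assumption).
DEFAULT_IGNORE_WORDS = frozenset({"yeah", "ok", "okay"})
ENV_VAR = "LIVEKIT_INTERRUPTION_IGNORE_WORDS"


def resolve_ignore_words(ignore_words: list[str] | None = None) -> set[str]:
    """Pick the ignore words from the parameter, then the env var, then defaults."""
    env_value = os.environ.get(ENV_VAR, "")
    if ignore_words is not None:
        raw = list(ignore_words)
    elif env_value.strip():
        # Assumed format: comma-separated words, whitespace tolerated.
        raw = env_value.split(",")
    else:
        raw = list(DEFAULT_IGNORE_WORDS)
    # Trim, lowercase, and drop empty entries such as a trailing comma.
    return {word.strip().lower() for word in raw if word.strip()}
```

An explicit parameter wins even when the environment variable is set, which keeps per-session configuration predictable.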
Added parameters for interruption filtering in agent session.
Removed unused import of the 'os' module.
Refactor InterruptionFilter to improve clarity and functionality.
Split 'got it' into separate entries 'got' and 'it'.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/voice/filter.py`:
- Around line 105-133: The code normalizes input text but not the stored
backchannel tokens, so hyphenated ignore words like "mm-hmm" don't match; update
the ignore-word handling by applying the same normalization pipeline (strip(),
lower() when !_case_sensitive, and translate with str.maketrans to remove
string.punctuation) to the entries in self._ignore_words (or build a new
internal set like self._normalized_ignore) and then compare
normalized_phrase/words against that normalized set in the function that
contains normalized_text, normalized_phrase and words; ensure this normalization
happens once (e.g., during initialization) instead of on every call.
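To make the hyphen problem concrete: stripping `string.punctuation` turns the input "mm-hmm" into "mmhmm", which no longer equals the stored entry "mm-hmm". A minimal sketch of the suggested fix follows, normalizing the ignore words once at construction; `NormalizedMatcher` and `normalize` are hypothetical names, not the PR's API.

```python
from __future__ import annotations

import string

# Translation table that deletes every ASCII punctuation character.
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text: str, case_sensitive: bool = False) -> str:
    """Apply the same pipeline to user input and to stored ignore words."""
    text = text.strip()
    if not case_sensitive:
        text = text.lower()
    return " ".join(text.translate(_PUNCT_TABLE).split())


class NormalizedMatcher:
    """Hypothetical matcher that normalizes its ignore words once, up front."""

    def __init__(self, ignore_words: list[str]) -> None:
        self._normalized_ignore = {normalize(word) for word in ignore_words}

    def matches(self, text: str) -> bool:
        phrase = normalize(text)
        if not phrase:
            return False
        if phrase in self._normalized_ignore:
            return True
        return all(word in self._normalized_ignore for word in phrase.split())
```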
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- livekit-agents/livekit/agents/voice/agent_activity.py
- livekit-agents/livekit/agents/voice/agent_session.py
- livekit-agents/livekit/agents/voice/filter.py
- test.py
🚧 Files skipped from review as they are similar to previous changes (1)
- test.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- livekit-agents/livekit/agents/voice/agent_session.py
- livekit-agents/livekit/agents/voice/filter.py
- livekit-agents/livekit/agents/voice/agent_activity.py
🧬 Code graph analysis (1)
livekit-agents/livekit/agents/voice/agent_activity.py (2)
livekit-agents/livekit/agents/voice/audio_recognition.py (1)
current_transcript(323-329)livekit-agents/livekit/agents/tokenize/_basic_word.py (1)
split_words(6-70)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: type-check (3.13)
- GitHub Check: type-check (3.9)
🔇 Additional comments (8)
livekit-agents/livekit/agents/voice/filter.py (3)
18-79: LGTM - defaults + env override are clear.
81-99: LGTM - properties and gating logic are straightforward.
135-158: LGTM - mutators and repr are clean.
livekit-agents/livekit/agents/voice/agent_activity.py (3)
55-55: LGTM - clear dependency wiring.
136-141: LGTM - filter initialized from session options as expected.
1176-1202: LGTM - interruption filtering is well integrated.
livekit-agents/livekit/agents/voice/agent_session.py (2)
76-94: LGTM - session options expose filter config cleanly.
137-309: LGTM - constructor docs and option propagation are consistent.
Refactor interruption filtering logic to normalize ignore words and improve handling of backchanneling phrases.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/voice/filter.py`:
- Around line 92-95: Replace the deprecated typing annotations for the
translator variable with modern built-ins: change the annotation from Dict[int,
Optional[int]] to dict[int, int | None] (e.g., translator: dict[int, int | None]
= str.maketrans(...)), and remove any now-unused imports like Dict and Optional
from typing; keep the use of str.maketrans and normalized.translate as-is.
- Line 6: Remove the deprecated typing imports: delete "from typing import Dict,
Optional" and update any type annotations in this module that use Dict[...] and
Optional[...] to use built-in generics and PEP 563/annotations-friendly syntax
(e.g., dict[...] and T | None). Search for uses of "Dict" and "Optional" in
filter.py and replace them accordingly (e.g., Dict[str, int] -> dict[str, int],
Optional[Foo] -> Foo | None) so ruff no longer flags the import as unused.
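For reference, the modernized annotation might read as below. This is a minimal sketch: `strip_punctuation` is a hypothetical wrapper added for illustration, and the `from __future__ import annotations` line is what keeps the `int | None` syntax from being evaluated at runtime on Python 3.9.

```python
from __future__ import annotations

import string

# Built-in generics and a PEP 604 union replace typing.Dict and typing.Optional.
translator: dict[int, int | None] = str.maketrans("", "", string.punctuation)


def strip_punctuation(text: str) -> str:
    """Remove ASCII punctuation using the precomputed translation table."""
    return text.translate(translator)
```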
🧹 Nitpick comments (1)
livekit-agents/livekit/agents/voice/filter.py (1)
107-118: Consider adding a docstring for public API method.
This is a primary public method of the class. Adding a Google-style docstring would improve API documentation.
📝 Suggested docstring
 def should_ignore_interruption(
     self,
     transcribed_text: str,
     agent_is_speaking: bool,
 ) -> bool:
+    """Determine if an interruption should be ignored.
+
+    Args:
+        transcribed_text: The text transcribed from user speech.
+        agent_is_speaking: Whether the agent is currently speaking.
+
+    Returns:
+        True if the interruption should be ignored (backchanneling while
+        agent speaks), False otherwise.
+    """
     if not self._enabled:
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
livekit-agents/livekit/agents/voice/filter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
livekit-agents/livekit/agents/voice/filter.py
🪛 GitHub Check: ruff
livekit-agents/livekit/agents/voice/filter.py
[failure] 92-92: Ruff (UP007)
livekit-agents/livekit/agents/voice/filter.py:92:31: UP007 Use X | Y for type annotations
[failure] 92-92: Ruff (UP006)
livekit-agents/livekit/agents/voice/filter.py:92:21: UP006 Use dict instead of Dict for type annotation
[failure] 6-6: Ruff (UP035)
livekit-agents/livekit/agents/voice/filter.py:6:1: UP035 typing.Dict is deprecated, use dict instead
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (7)
livekit-agents/livekit/agents/voice/filter.py (7)
11-40: LGTM! The class docstring follows Google style, and the default ignore words set is well-curated for common backchanneling utterances.
42-84: LGTM! The initialization logic properly handles the fallback chain (parameter → environment variable → defaults), and ignore words are normalized once at construction time. The fix for hyphenated word normalization from past reviews has been properly implemented.
97-105: LGTM! Properties are correctly implemented, and returning a copy of `_ignore_words` properly encapsulates internal state.
120-151: LGTM! The backchanneling detection logic is now correctly implemented with proper control flow. The phrase-level check (line 133) returns early only on match, and the word-by-word validation (lines 141-148) is now reachable. The critical bug from the previous review has been properly addressed.
153-159: LGTM! Both methods correctly normalize words before adding/removing, and using `discard()` for removal avoids `KeyError` if the word isn't present.
161-166: LGTM! Simple and correct implementation with appropriate INFO-level logging for state changes.
168-174: LGTM! Good `__repr__` implementation for debugging, showing the count rather than the full set to avoid verbose output.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@examples/voice_agents/session_close_callback.py`:
- Around line 80-83: Guard against item.text_content being None before calling
.replace(): change the assignment that builds content and text (the lines using
item.text_content.replace(...) and text = f"{item.role}: {content}") to use a
safe fallback (e.g., treat None as an empty string or skip non-text items) so
.replace is only called on a str; update the logic in the session close handler
where item.text_content, content, and text are constructed to first coalesce
item.text_content (or check for None) and then call .replace on the guaranteed
string.
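One way to apply this guard is to coalesce `None` to an empty string before calling `.replace()`. The sketch below uses `ItemSketch`, a hypothetical stand-in for the chat item type, since only the `role` and `text_content` attributes are visible in this review.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ItemSketch:
    """Hypothetical stand-in for a chat item whose text may be missing."""

    role: str
    text_content: str | None


def format_item(item: ItemSketch) -> str:
    """Format an item for logging, escaping newlines and tolerating None."""
    # Coalesce None to "" so .replace() is only ever called on a str.
    content = (item.text_content or "").replace("\n", "\\n")
    return f"{item.role}: {content}"
```

Skipping non-text items entirely is the other option the comment mentions; which one fits better depends on whether empty lines in the log are acceptable.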
🧹 Nitpick comments (2)
examples/voice_agents/llamaindex-rag/retrieval.py (1)
83-86: Improve the comment and fix extra blank line.The comment
# New codeprovides no useful context—consider removing it or replacing it with something descriptive like# Debug: log truncated instructions. Also, there's a double blank line here (line 86 followed by line 87) which violates PEP 8 (E303).Suggested fix
- # New code - debug_text = instructions[:100].replace('\n', '\\n') - print(f"update instructions: {debug_text}...") - + # Debug: log truncated instructions + debug_text = instructions[:100].replace('\n', '\\n') + print(f"update instructions: {debug_text}...")pyproject.toml (1)
145-148: Consider using[tool.ruff.lint.per-file-ignores]for consistency with other lint configuration sections.The project already structures lint-specific settings under
[tool.ruff.lint](isort, pydocstyle), so moving per-file-ignores to[tool.ruff.lint.per-file-ignores]would align the configuration style. Both locations are currently valid and supported.Suggested fix
-[tool.ruff.per-file-ignores] +[tool.ruff.lint.per-file-ignores] "examples/**" = ["UP045", "UP007", "F541"] "livekit-agents/livekit/agents/cli/**" = ["UP045", "UP007"]
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- examples/voice_agents/llamaindex-rag/retrieval.py
- examples/voice_agents/session_close_callback.py
- pyproject.toml
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- examples/voice_agents/session_close_callback.py
- examples/voice_agents/llamaindex-rag/retrieval.py
🧠 Learnings (3)
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Applies to **/*.py : Run ruff linter and auto-fix issues
Applied to files:
pyproject.toml
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Applies to **/*.py : Format code with ruff
Applied to files:
pyproject.toml
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Applies to **/*.py : Run mypy type checker in strict mode
Applied to files:
pyproject.toml
🧬 Code graph analysis (2)
examples/voice_agents/session_close_callback.py (1)
livekit-agents/livekit/agents/llm/chat_context.py (1)
text_content(164-173)
examples/voice_agents/llamaindex-rag/retrieval.py (1)
livekit-agents/livekit/agents/voice/agent.py (1)
instructions(99-104)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
- GitHub Check: unit-tests
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
# New code
content = item.text_content.replace('\n', '\\n')
text = f"{item.role}: {content}"
🧩 Analysis chain
🏁 Script executed:
# First, examine the file at the mentioned lines
cat -n examples/voice_agents/session_close_callback.py | sed -n '75,90p'Repository: livekit/agents
Length of output: 775
🏁 Script executed:
# Find the definition of text_content to verify if it can return None
rg -A 5 "def text_content|@property" --type py | grep -A 5 "text_content"Repository: livekit/agents
Length of output: 550
🏁 Script executed:
# Search for the type hints or documentation about text_content
rg "text_content" --type py -B 2 -A 2 | head -60Repository: livekit/agents
Length of output: 5653
🏁 Script executed:
# Get the full implementation of text_content
rg -A 15 "def text_content\(self\)" livekit-agents/livekit/agents/llm/chat_context.pyRepository: livekit/agents
Length of output: 506
Guard against None from item.text_content before calling .replace().
The text_content property returns str | None (returns None when the message has no text parts). Calling .replace() directly will raise AttributeError and crash the close handler. Use a safe fallback.
🔧 Proposed fix
- content = item.text_content.replace('\n', '\\n')
+ raw_content = item.text_content
+ content = (raw_content or "").replace('\n', '\\n')
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
- # New code
- content = item.text_content.replace('\n', '\\n')
- text = f"{item.role}: {content}"
+ # New code
+ raw_content = item.text_content
+ content = (raw_content or "").replace('\n', '\\n')
+ text = f"{item.role}: {content}"
🤖 Prompt for AI Agents
In `@examples/voice_agents/session_close_callback.py` around lines 80 - 83, Guard
against item.text_content being None before calling .replace(): change the
assignment that builds content and text (the lines using
item.text_content.replace(...) and text = f"{item.role}: {content}") to use a
safe fallback (e.g., treat None as an empty string or skip non-text items) so
.replace is only called on a str; update the logic in the session close handler
where item.text_content, content, and text are constructed to first coalesce
item.text_content (or check for None) and then call .replace on the guaranteed
string.
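The two fallbacks mentioned in the prompt above (treat `None` as an empty string, or skip non-text items) can be sketched as follows. This is a minimal illustration only: `FakeItem`, `format_items`, and the `skip_empty` flag are hypothetical stand-ins invented for this example and are not part of the LiveKit API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeItem:
    """Hypothetical stand-in for a chat item whose text may be missing."""

    role: str
    text_content: Optional[str]


def format_items(items, skip_empty=False):
    """Build "role: content" lines without calling .replace() on None."""
    lines = []
    for item in items:
        raw = item.text_content
        if raw is None and skip_empty:
            # Option 2: skip items that carry no text at all.
            continue
        # Option 1: coalesce None to "" so .replace() always sees a str.
        content = (raw or "").replace("\n", "\\n")
        lines.append(f"{item.role}: {content}")
    return lines


items = [FakeItem("user", "hi\nthere"), FakeItem("assistant", None)]
kept = format_items(items)
skipped = format_items(items, skip_empty=True)
```

Coalescing keeps one output line per item, which may be preferable when the transcript order matters; skipping produces a shorter log.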
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/cli/cli.py (1)
1491-1502: Use `Optional[str]`/`Optional[int]` instead of PEP 604 unions in Typer parameters for Python 3.9 compatibility.
With `from __future__ import annotations`, all annotations are stringified. When Typer evaluates these at runtime via `get_type_hints()`, it calls `eval()` on the strings. On Python 3.9, evaluating `"str | None"` raises `TypeError` (the union operator `|` is not supported at runtime until Python 3.10). Switch to `Optional[str]`/`Optional[int]`, which can be safely evaluated on Python 3.9 (keep `# noqa: UP007` if ruff nags).
Suggested fix (apply to all affected CLI params)
-    input_device: Annotated[
-        str | None,  # noqa: UP007, required for python 3.9
+    input_device: Annotated[
+        Optional[str],  # noqa: UP007
         typer.Option(
             help="Numeric input device ID or input device name substring(s)",
         ),
     ] = None,
     output_device: Annotated[
-        str | None,  # noqa: UP007
+        Optional[str],  # noqa: UP007
         typer.Option(
             help="Numeric output device ID or output device name substring(s)",
         ),
     ] = None,
Applies to lines 1491–1502, 1543–1566, 1595–1611, 1676–1702.
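The runtime behavior described in this comment can be checked with a small standalone sketch. It does not use Typer; it only calls `typing.get_type_hints()` on two hypothetical functions (`with_union`, `with_optional`) whose annotations are written as strings, which mimics what `from __future__ import annotations` does to every annotation in a module.

```python
import sys
import typing
from typing import Optional


# String annotations mimic the stringification performed by
# `from __future__ import annotations`.
def with_union(device: "str | None" = None) -> None:
    """Hypothetical function using a PEP 604 union annotation."""


def with_optional(device: "Optional[str]" = None) -> None:
    """Hypothetical function using typing.Optional."""


def try_resolve(func):
    """Return the evaluated hints, or None if evaluation raises TypeError."""
    try:
        return typing.get_type_hints(func)
    except TypeError:
        # On Python 3.9, `str | None` cannot be evaluated at runtime.
        return None


union_hints = try_resolve(with_union)
optional_hints = try_resolve(with_optional)
print(sys.version_info[:2], union_hints is not None, optional_hints is not None)
```

On Python 3.10 and later both calls resolve; on Python 3.9 only the `Optional[str]` variant does, which is why the suggestion matters for libraries that inspect annotations at runtime.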
🧹 Nitpick comments (1)
examples/bank-ivr/mock_bank_service.py (1)
191-195: Remove unnecessary `# noqa: UP007` directive.
The `UP007` rule flags usage of `Optional[X]` or `Union[X, Y]` and suggests using `X | Y` instead. Since the code already uses the modern `DepositAccount | None` syntax, this directive suppresses nothing. Removing it also brings the line within the 100-character limit per coding guidelines.
Proposed fix
- def find_deposit_account(self, customer_id: str, account_number: str) -> DepositAccount | None:  # noqa: UP007
+ def find_deposit_account(self, customer_id: str, account_number: str) -> DepositAccount | None:
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
- examples/bank-ivr/ivr_system_agent.py
- examples/bank-ivr/mock_bank_service.py
- livekit-agents/livekit/agents/cli/cli.py
- livekit-agents/livekit/agents/voice/ivr/ivr_activity.py
- tests/filter.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/filter.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
- examples/bank-ivr/ivr_system_agent.py
- livekit-agents/livekit/agents/voice/ivr/ivr_activity.py
- examples/bank-ivr/mock_bank_service.py
- livekit-agents/livekit/agents/cli/cli.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.13)
- GitHub Check: type-check (3.9)
- GitHub Check: unit-tests
🔇 Additional comments (3)
examples/bank-ivr/mock_bank_service.py (1)
10-10: LGTM!
Clean removal of the unused `Optional` import after migrating to PEP 604 union syntax.
livekit-agents/livekit/agents/voice/ivr/ivr_activity.py (1)
3-3: No changes needed; code is compatible with Python 3.9.
The file includes `from __future__ import annotations` at line 1, which stringifies all annotations per PEP 563. This allows PEP 604 syntax (`str | None`) to work correctly in Python 3.9+, as the union syntax is never evaluated at runtime. The `# noqa: UP007` comments indicate this is a deliberate choice using modern annotation syntax. The codebase consistently applies this pattern across multiple files, and no reversion to `Optional[str]` is required.
examples/bank-ivr/ivr_system_agent.py (1)
55-57: Examples directory is exempted from the Python 3.10 union syntax rule; verify intent.
The code uses `str | None` (PEP 604 syntax, evaluable at runtime only on Python 3.10+) in a project with `requires-python = ">=3.9,<3.14"`. On Python 3.9 this raises a `TypeError` (not a `SyntaxError`) wherever the annotation is evaluated at runtime, that is, without `from __future__ import annotations`. However, the `examples/**` directory is explicitly exempted from the UP007 rule in ruff's per-file-ignores, and examples are not run in tests. If this exemption is intentional to showcase newer syntax patterns, the code can remain as-is; otherwise, revert to `Optional[str]` for 3.9 compatibility or update the project's minimum Python version requirement to 3.10+.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
examples/voice_agents/llamaindex-rag/retrieval.py (1)
44-48: Typo: "unpronouncable" → "unpronounceable".
📝 Suggested fix
 instructions=(
     "You are a voice assistant created by LiveKit. Your interface "
     "with users will be voice. You should use short and concise "
-    "responses, and avoiding usage of unpronouncable punctuation."
+    "responses, and avoiding usage of unpronounceable punctuation."
 ),
🧹 Nitpick comments (1)
examples/voice_agents/llamaindex-rag/retrieval.py (1)
83-85: LGTM; newline escaping improves debug output readability.
The explicit `"Debug:"` comment and newline escaping make the output clearer. For production-quality examples, consider using `logging.debug()` instead of `print()`, but this is acceptable for demo code.
♻️ Optional: Use logging instead of print
+import logging
+
+logger = logging.getLogger(__name__)
+
 # Debug: log truncated instructions
 debug_text = instructions[:100].replace("\n", "\\n")
-print(f"update instructions: {debug_text}...")
+logger.debug(f"update instructions: {debug_text}...")
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
- examples/voice_agents/llamaindex-rag/retrieval.py
- examples/voice_agents/session_close_callback.py
- pyproject.toml
🚧 Files skipped from review as they are similar to previous changes (2)
- pyproject.toml
- examples/voice_agents/session_close_callback.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
examples/voice_agents/llamaindex-rag/retrieval.py
🧬 Code graph analysis (1)
examples/voice_agents/llamaindex-rag/retrieval.py (1)
livekit-agents/livekit/agents/voice/agent.py (1)
instructions(99-104)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: type-check (3.9)
- GitHub Check: unit-tests
- GitHub Check: type-check (3.13)
All CI checks are now passing. This PR adds intelligent interruption handling to prevent the agent from stopping when the user says simple listening words like “yeah” or “ok”. Happy to address any feedback. Thanks!
feat: Add intelligent interruption handling
Fixes an issue where the agent stops speaking when the user says simple listening words like “yeah” or “ok” during agent speech.
Changes
- Added an InterruptionFilter to ignore backchanneling words
- Integrated the filter into the agent’s interruption logic
- Added configuration options to:
  - Enable or disable the filter
  - Customize the list of ignored words
- All four test cases are passing
How it works
When the user speaks while the agent is talking, the filter checks:
- Whether the agent is currently speaking
- Whether the user’s input contains only backchanneling words
If both conditions are true, the interruption is ignored and the agent continues speaking.
Otherwise, the agent is interrupted as usual.
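The decision described above can be sketched roughly as follows. This is an illustrative approximation, not the code from this PR: the class name `SketchInterruptionFilter`, the `should_ignore` method, the default word list, and the normalization rule are all assumptions made for the example.

```python
import re
from typing import Optional

# Assumed default list; the PR's actual defaults may differ.
DEFAULT_IGNORE_WORDS = {"yeah", "ok", "okay", "hmm", "uh-huh", "right"}


class SketchInterruptionFilter:
    """Hypothetical filter that ignores backchannel-only input during agent speech."""

    def __init__(self, ignore_words: Optional[set] = None, enabled: bool = True):
        self.enabled = enabled
        words = DEFAULT_IGNORE_WORDS if ignore_words is None else ignore_words
        self.ignore_words = {w.lower() for w in words}

    def should_ignore(self, transcript: str, agent_is_speaking: bool) -> bool:
        """Return True when the interruption should be ignored."""
        # Condition 1: only filter while the agent is speaking (and enabled).
        if not self.enabled or not agent_is_speaking:
            return False
        # Lowercase and drop punctuation, keeping apostrophes and hyphens.
        normalized = re.sub(r"[^\w\s'-]", "", transcript.lower()).strip()
        if not normalized:
            # Nothing recognizable was said; fall back to normal handling.
            return False
        # Condition 2: the whole phrase, or every word in it, is a backchannel.
        if normalized in self.ignore_words:
            return True
        return all(word in self.ignore_words for word in normalized.split())


flt = SketchInterruptionFilter()
print(flt.should_ignore("Yeah, ok.", agent_is_speaking=True))
print(flt.should_ignore("yeah wait stop", agent_is_speaking=True))
```

Requiring every word to be in the ignore list means a mixed utterance such as "yeah wait stop" still interrupts the agent, which matches the "contains only backchanneling words" condition.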
Summary by CodeRabbit
New Features
Configuration
Documentation
Tests
Public API
✏️ Tip: You can customize this high-level summary in your review settings.