AGT-2415: Collect internal events for testing #4505
67280ea to
08ffd59
Compare
📝 Walkthrough
Adds discriminated internal-event types and instrumentation across voice pipeline components. AgentSession and RunResult are wired to optionally record internal events (via include_internal_events), and SessionReport is extended to include, timestamp, and serialize collected internal events.
Sequence Diagram(s)
sequenceDiagram
participant Pipeline as Pipeline (STT / LLM / TTS)
participant Activity as AgentActivity
participant Session as AgentSession
participant Run as RunResult
participant Report as SessionReport
Pipeline->>Activity: emit events (chunks, frames, transcripts)
Activity->>Session: maybe_collect(event)
alt include_internal_events == true
Session->>Session: append to _recorded_internal_events
end
Pipeline->>Session: playback_started / playback_finished (listener)
Session->>Session: attach/detach listeners, maybe_collect(playback_event)
Run->>Session: record RunEvent (via RunResult._record_event)
Session->>Report: make_session_report(include_internal_events, internal_events, timestamps)
Report->>Report: serialize events (AudioFrame -> base64, TimedString -> dict, LLM payloads)
Report-->>Caller: serialized SessionReport
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/voice/agent_session.py (1)
539-541: Reset _recorded_internal_events on session start.
_recorded_events is cleared, but internal events aren't, so a restarted session can leak prior events into the report.
🔧 Proposed fix
  self._recorded_events = []
+ self._recorded_internal_events = []
  self._room_io = None
  self._recorder_io = None
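The leak described above can be reproduced in isolation. This is a minimal sketch, assuming a hypothetical `Session` class that models only the two buffers; it is not the real `AgentSession`, and the `reset_internal` switch exists only to compare the behavior with and without the proposed fix.

```python
class Session:
    """Hypothetical stand-in for AgentSession; only the two event buffers are modeled."""

    def __init__(self) -> None:
        self._recorded_events = []
        self._recorded_internal_events = []

    def start(self, *, reset_internal: bool) -> None:
        # The public buffer is always cleared on start, as in the reviewed code.
        self._recorded_events = []
        if reset_internal:
            # The proposed fix: clear the internal buffer as well.
            self._recorded_internal_events = []

    def record(self, event: str) -> None:
        self._recorded_events.append(event)
        self._recorded_internal_events.append(event)


def run_twice(*, reset_internal: bool) -> list:
    """Start, record, restart, record again, and return the internal buffer."""
    session = Session()
    session.start(reset_internal=reset_internal)
    session.record("first-run")
    session.start(reset_internal=reset_internal)  # restart the same session object
    session.record("second-run")
    return session._recorded_internal_events
```

Without the reset, the event from the first run is still in the buffer after the restart; with it, only the second run's event remains.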
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/__init__.py`:
- Around line 100-103: Export list includes RealtimeSessionRestoredEvent in
__all__ but the class is not imported or defined; either remove
"RealtimeSessionRestoredEvent" from the __all__ list or add/implement and import
the RealtimeSessionRestoredEvent class into this module (ensure the symbol name
matches exactly) so that the exported symbol resolves when accessed from the
package; update the __all__ array accordingly and ensure any new class is
imported alongside the other events (e.g., with RealtimeSessionReconnectedEvent,
LLMError, LLMOutputEvent) so consumers can import it without AttributeError.
In `@livekit-agents/livekit/agents/types.py`:
- Around line 119-126: The to_dict method returns sentinel NotGiven values which
break JSON serialization; update the to_dict implementation on the class with
the to_dict method to convert any NotGiven sentinel to None for each field
("text"/self, start_time, end_time, confidence, start_time_offset) before
returning the dict—import the NotGiven sentinel (or compare against it) and map
values equal to NotGiven to None so report.to_dict() yields JSON-serializable
primitives.
In `@livekit-agents/livekit/agents/voice/report.py`:
- Around line 71-78: The code scrubs audio by converting a VADEvent to a dict
via asdict(e) and then replaces the frames with an empty dict which changes the
type; change the scrub to set data["frames"] = [] so VADEvent.frames remains a
list (locate the block handling isinstance(e, VADEvent) where asdict(e) is used
and append to internal_events_dict).
🧹 Nitpick comments (4)
livekit-agents/livekit/agents/llm/llm.py (1)
75-83: Discriminant-data type correspondence is not enforced.
The type field indicates the expected data type, but nothing prevents constructing an LLMOutputEvent with mismatched type and data values (e.g., type="llm_chunk_output" with data=str). This pattern relies on caller discipline.
Consider using a factory method or overloaded constructors to ensure correctness, or document the expected correspondence clearly.
Example: Factory methods for type safety
@dataclass
class LLMOutputEvent:
    type: Literal[
        "llm_chunk_output",
        "llm_str_output",
        "llm_timed_string_output",
        "realtime_audio_output",
    ]
    data: ChatChunk | str | TimedString | rtc.AudioFrame

    @classmethod
    def from_chunk(cls, chunk: ChatChunk) -> "LLMOutputEvent":
        return cls(type="llm_chunk_output", data=chunk)

    @classmethod
    def from_str(cls, s: str) -> "LLMOutputEvent":
        return cls(type="llm_str_output", data=s)

    @classmethod
    def from_timed_string(cls, ts: TimedString) -> "LLMOutputEvent":
        return cls(type="llm_timed_string_output", data=ts)

    @classmethod
    def from_audio_frame(cls, frame: rtc.AudioFrame) -> "LLMOutputEvent":
        return cls(type="realtime_audio_output", data=frame)
livekit-agents/livekit/agents/voice/agent.py (1)
392-395: Collect-before-yield to avoid missing events on early cancellation.
Currently, maybe_collect(...) runs after yield, so if the consumer cancels/short-circuits, the last item can be dropped from internal events. Consider collecting before yield to make capture resilient to early exits.
♻️ Suggested pattern (apply to each generator)
- yield event
- activity.session.maybe_collect(event)
+ activity.session.maybe_collect(event)
+ yield event
Also applies to: 419-423, 452-455, 463-473, 485-489
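The ordering issue can be shown with plain synchronous generators. This is a sketch under stated assumptions: `collect_after_yield`, `collect_before_yield`, and `first_item` are hypothetical names, and the `collect` callback stands in for `maybe_collect`. The reviewed code uses async generators, but the suspension point behaves the same way.

```python
from typing import Callable, Iterable, Iterator


def collect_after_yield(items: Iterable[int], collect: Callable[[int], None]) -> Iterator[int]:
    for item in items:
        yield item
        # Only reached if the consumer asks for another item.
        collect(item)


def collect_before_yield(items: Iterable[int], collect: Callable[[int], None]) -> Iterator[int]:
    for item in items:
        # Recorded before control is handed to the consumer.
        collect(item)
        yield item


def first_item(gen: Iterator[int]) -> int:
    """Consume one item, then stop early, as a cancelled consumer would."""
    item = next(gen)
    gen.close()  # raises GeneratorExit at the paused yield inside the generator
    return item
```

When the consumer stops after the first item, the collect-after-yield variant records nothing for it, while the collect-before-yield variant has already recorded it.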
livekit-agents/livekit/agents/voice/report.py (1)
108-115: Add a Google-style docstring for _serialize_audio_frame.
As per coding guidelines, add Google-style docstrings for new helpers.
✍️ Docstring example
  @staticmethod
  def _serialize_audio_frame(frame: AudioFrame) -> dict:
+     """Serialize an AudioFrame to a JSON-friendly dict.
+
+     Args:
+         frame: The audio frame to serialize.
+
+     Returns:
+         A JSON-serializable dict with audio metadata and base64 data.
+     """
      return {
          "sample_rate": frame.sample_rate,
          "num_channels": frame.num_channels,
          "samples_per_channel": frame.samples_per_channel,
          "data": base64.b64encode(frame.data).decode("utf-8"),
      }
livekit-agents/livekit/agents/voice/agent_session.py (1)
370-373: Avoid storing internal events when the flag is disabled.
Right now every emitted AgentEvent is appended to _recorded_internal_events, even when include_internal_events is off, which can double memory usage on long sessions.
🔧 Suggested guard
  def emit(self, event: EventTypes, arg: AgentEvent) -> None:  # type: ignore
      self._recorded_events.append(arg)
-     self._recorded_internal_events.append(arg)
+     if self._include_internal_events:
+         self._recorded_internal_events.append(arg)
      super().emit(event, arg)
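One way to keep the guard in a single place is to route `emit` through the same gate the other collection hooks use. This is a minimal sketch, assuming a hypothetical `EventRecorder` class; the body of the real `maybe_collect` is not shown in this review, so the version here is an assumption.

```python
class EventRecorder:
    """Hypothetical stand-in for the two buffers kept by AgentSession."""

    def __init__(self, *, include_internal_events: bool) -> None:
        self._include_internal_events = include_internal_events
        self.recorded_events = []
        self.recorded_internal_events = []

    def maybe_collect(self, event: str) -> None:
        # Assumed behavior: record only when the flag is enabled.
        if self._include_internal_events:
            self.recorded_internal_events.append(event)

    def emit(self, event: str) -> None:
        # Public events are always recorded; internal recording goes through the gate.
        self.recorded_events.append(event)
        self.maybe_collect(event)
```

With the flag off, the internal buffer stays empty no matter how many events are emitted, so a long session does not hold a second copy of every event.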
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (15)
livekit-agents/livekit/agents/job.py
livekit-agents/livekit/agents/llm/__init__.py
livekit-agents/livekit/agents/llm/llm.py
livekit-agents/livekit/agents/llm/realtime.py
livekit-agents/livekit/agents/tts/tts.py
livekit-agents/livekit/agents/types.py
livekit-agents/livekit/agents/voice/agent.py
livekit-agents/livekit/agents/voice/agent_activity.py
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/events.py
livekit-agents/livekit/agents/voice/io.py
livekit-agents/livekit/agents/voice/report.py
livekit-agents/livekit/agents/voice/room_io/room_io.py
livekit-agents/livekit/agents/voice/room_io/types.py
livekit-agents/livekit/agents/voice/run_result.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
livekit-agents/livekit/agents/tts/tts.py
livekit-agents/livekit/agents/voice/run_result.py
livekit-agents/livekit/agents/voice/io.py
livekit-agents/livekit/agents/job.py
livekit-agents/livekit/agents/llm/llm.py
livekit-agents/livekit/agents/voice/agent_activity.py
livekit-agents/livekit/agents/llm/__init__.py
livekit-agents/livekit/agents/voice/room_io/room_io.py
livekit-agents/livekit/agents/llm/realtime.py
livekit-agents/livekit/agents/voice/agent.py
livekit-agents/livekit/agents/voice/report.py
livekit-agents/livekit/agents/voice/room_io/types.py
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/types.py
livekit-agents/livekit/agents/voice/events.py
🧠 Learnings (2)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.
Applied to files:
livekit-agents/livekit/agents/tts/tts.py
livekit-agents/livekit/agents/voice/run_result.py
livekit-agents/livekit/agents/voice/io.py
livekit-agents/livekit/agents/job.py
livekit-agents/livekit/agents/llm/llm.py
livekit-agents/livekit/agents/voice/agent_activity.py
livekit-agents/livekit/agents/llm/__init__.py
livekit-agents/livekit/agents/voice/room_io/room_io.py
livekit-agents/livekit/agents/llm/realtime.py
livekit-agents/livekit/agents/voice/agent.py
livekit-agents/livekit/agents/voice/report.py
livekit-agents/livekit/agents/voice/room_io/types.py
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/types.py
livekit-agents/livekit/agents/voice/events.py
📚 Learning: 2026-01-16T07:44:56.353Z
Learnt from: CR
Repo: livekit/agents PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-01-16T07:44:56.353Z
Learning: Implement Model Interface Pattern for STT, TTS, LLM, and Realtime models with provider-agnostic interfaces, fallback adapters for resilience, and stream adapters for different streaming patterns
Applied to files:
livekit-agents/livekit/agents/voice/agent.py
🧬 Code graph analysis (9)
livekit-agents/livekit/agents/voice/run_result.py (2)
livekit-agents/livekit/agents/voice/agent_session.py (1)
maybe_collect(375-378)
livekit-agents/livekit/agents/llm/chat_context.py (1)
insert(269-275)
livekit-agents/livekit/agents/llm/llm.py (1)
livekit-agents/livekit/agents/types.py (1)
TimedString(95-126)
livekit-agents/livekit/agents/voice/agent_activity.py (2)
livekit-agents/livekit/agents/voice/agent_session.py (2)
llm(1284-1285), maybe_collect(375-378)
livekit-agents/livekit/agents/llm/realtime.py (1)
InputSpeechStartedEvent(20-21)
livekit-agents/livekit/agents/voice/room_io/room_io.py (3)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
TextInputEvent(32-36)
livekit-agents/livekit/agents/llm/tool_context.py (1)
info(142-143)
livekit-agents/livekit/agents/voice/agent_session.py (1)
maybe_collect(375-378)
livekit-agents/livekit/agents/voice/agent.py (2)
livekit-agents/livekit/agents/voice/agent_activity.py (3)
llm(2794-2798), session(237-238), agent(241-242)
livekit-agents/livekit/agents/types.py (1)
TimedString(95-126)
livekit-agents/livekit/agents/voice/report.py (7)
livekit-agents/livekit/agents/voice/agent.py (3)
llm(538-548), tts(551-561), vad(577-587)
livekit-agents/livekit/agents/llm/llm.py (2)
ChatChunk(69-72), LLMOutputEvent(76-83)
livekit-agents/livekit/agents/tts/tts.py (2)
SynthesizedAudio(33-44), sample_rate(118-119)
livekit-agents/livekit/agents/types.py (2)
TimedString(95-126), to_dict(119-126)
livekit-agents/livekit/agents/vad.py (2)
VADEvent(26-68), VADEventType(19-22)
livekit-agents/livekit/agents/voice/room_io/room_io.py (1)
room(197-198)
livekit-agents/livekit/agents/voice/io.py (1)
sample_rate(243-245)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
AudioOutput(142-286), audio(431-432), audio(435-449), audio(555-556), audio(559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
event(584-585), event(864-865), event(1001-1002), event(1011-1012), event(1021-1022), RunResult(71-229), done(141-143)
livekit-agents/livekit/agents/types.py (3)
livekit-agents/livekit/agents/voice/report.py (1)
to_dict(40-106)
livekit-agents/livekit/agents/llm/chat_context.py (1)
to_dict(402-441)
livekit-agents/livekit/agents/stt/stt.py (2)
start_time_offset(295-296), start_time_offset(299-302)
livekit-agents/livekit/agents/voice/events.py (5)
livekit-agents/livekit/agents/llm/realtime.py (4)
GenerationCreatedEvent(40-47), InputSpeechStartedEvent(20-21), InputSpeechStoppedEvent(25-27), InputTranscriptionCompleted(127-133)
livekit-agents/livekit/agents/llm/llm.py (1)
LLMOutputEvent(76-83)
livekit-agents/livekit/agents/types.py (1)
FlushSentinel(38-39)
livekit-agents/livekit/agents/voice/io.py (2)
PlaybackFinishedEvent(119-127), PlaybackStartedEvent(131-134)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
TextInputEvent(32-36)
🔇 Additional comments (21)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
31-36: LGTM!
The discriminant type field with a default value follows the pattern established for the InternalEvent discriminated union across the PR. Placing it as the last field avoids breaking positional argument compatibility.
livekit-agents/livekit/agents/tts/tts.py (1)
32-44: LGTM!
The discriminant field follows the established pattern and is correctly positioned as the last field with a default value, preserving backward compatibility for existing callers.
livekit-agents/livekit/agents/llm/__init__.py (1)
23-24: LGTM!
LLMOutputEvent is properly imported and will be correctly exported through __all__.
livekit-agents/livekit/agents/types.py (1)
37-39: LGTM!
Simple sentinel dataclass with discriminant field, consistent with the PR's pattern.
livekit-agents/livekit/agents/llm/realtime.py (5)
19-21: LGTM!
Discriminant field correctly added to InputSpeechStartedEvent.
24-28: LGTM!
Discriminant field correctly positioned after the required field.
30-36: LGTM!
Discriminant field correctly positioned as the last field in MessageGeneration.
39-47: LGTM!
Discriminant field correctly positioned after the optional response_id field in GenerationCreatedEvent.
126-133: LGTM!
Discriminant field correctly positioned as the last field in InputTranscriptionCompleted.
livekit-agents/livekit/agents/voice/room_io/room_io.py (1)
410-415: LGTM for text input event capture.
The event reuse plus collection hook reads cleanly and matches the internal-event flow.
livekit-agents/livekit/agents/voice/agent_activity.py (1)
1108-1110: Internal-event collection hooks look solid.
Consistent placement across handlers; no functional side effects detected.
Also applies to: 1122-1124, 1132-1134, 1145-1147, 1219-1223, 1233-1236, 1250-1252, 1269-1271, 1299-1301
livekit-agents/livekit/agents/voice/io.py (1)
118-128: LGTM for discriminant type fields.
This supports internal event typing without impacting runtime behavior.
Also applies to: 130-135
livekit-agents/livekit/agents/job.py (1)
266-278: LGTM for SessionReport internal-events wiring.
Looks consistent with the new collection path.
livekit-agents/livekit/agents/voice/run_result.py (1)
71-101: All RunResult instantiations already pass the required agent_session parameter.
Verified instantiations in agent_session.py (lines 446 and 652) both correctly provide agent_session=self. No runtime errors will occur from missing this parameter.
livekit-agents/livekit/agents/voice/events.py (2)
15-34: No review notes for the added event imports.
257-275: InternalEvent union looks consistent with the new event surface.
livekit-agents/livekit/agents/voice/agent_session.py (4)
52-52: No review notes for the new InternalEvent import.
362-365: Internal-event flag plumbing looks consistent.
Also applies to: 375-379, 463-506, 521-521
446-446: RunResult now wired with AgentSession for event collection.
Also applies to: 652-653
328-329: Playback listener attach/detach wiring is consistent.
Also applies to: 642-646, 804-807, 1322-1342
livekit-agents/livekit/agents/voice/report.py (1)
20-30: No action needed: all required SessionReport fields are already being passed at the single instantiation site.
The only call site in livekit/agents/job.py already provides include_internal_events and internal_events, so no defaults are necessary and no compatibility issues exist.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/voice/report.py (1)
34-36: duration and started_at fields are not included in to_dict() output.
The SessionReport dataclass defines duration and started_at fields (lines 34-36), but these are not serialized in the to_dict() method's return value. If these fields are intended for consumers, they should be added to the output dictionary.
🔧 Proposed fix (if intended for serialization)
  return {
      "job_id": self.job_id,
      "room_id": self.room_id,
      "room": self.room,
      "events": events_dict,
      "internal_events": internal_events_dict,
+     "duration": self.duration,
+     "started_at": self.started_at,
      "audio_recording_path": (
Also applies to: 82-106
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/llm/realtime.py`:
- Around line 40-48: GenerationCreatedEvent is defined as a plain dataclass but
uses pydantic Field(..., exclude=True) for message_stream and function_stream
which doesn't prevent dataclasses.asdict() from attempting to deepcopy
AsyncIterable fields and will raise a TypeError; fix by either converting
GenerationCreatedEvent into a pydantic BaseModel (remove dataclass usage and use
BaseModel with Field(..., exclude=True) for message_stream and function_stream)
or keep it as a dataclass and update the serializer in report.py to special-case
GenerationCreatedEvent (do not call dataclasses.asdict() on that instance;
instead build a dict that omits message_stream and function_stream or replace
them with serializable placeholders) so as to avoid deepcopying unpicklable
AsyncIterable fields.
In `@livekit-agents/livekit/agents/types.py`:
- Around line 119-126: The to_dict method currently uses "or None" which turns
valid falsy numeric values (0, 0.0) into None; update to_dict to explicitly
check the sentinel (e.g., NotGiven) for each field (start_time, end_time,
confidence, start_time_offset) and only map to None when the attribute equals
the NotGiven sentinel, otherwise return the actual attribute value; reference
the to_dict method and the attribute names (start_time, end_time, confidence,
start_time_offset) when making the change.
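The difference between the two mappings is easiest to see on a timestamp of 0.0. This is a sketch under stated assumptions: `NotGiven` and `NOT_GIVEN` here are a hypothetical falsy sentinel written for the example, not the one imported from livekit-agents, and the two helper functions are illustrative names.

```python
class NotGiven:
    """Hypothetical sentinel; assumed to be falsy, which is what `or None` relies on."""

    def __bool__(self) -> bool:
        return False

    def __repr__(self) -> str:
        return "NOT_GIVEN"


NOT_GIVEN = NotGiven()


def with_or_none(value):
    # Collapses every falsy value, including a legitimate 0 or 0.0, to None.
    return value or None


def with_sentinel_check(value):
    # Maps only the sentinel to None and passes real values through unchanged.
    return None if value is NOT_GIVEN else value
```

Both helpers turn the sentinel into `None`, but only the explicit check preserves a start time or confidence of zero.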
In `@livekit-agents/livekit/agents/voice/agent_session.py`:
- Around line 370-378: The emit method appends to _recorded_internal_events
regardless of _include_internal_events, causing memory growth; update emit(self,
event: EventTypes, arg: AgentEvent) to only append arg to
_recorded_internal_events when self._include_internal_events is True (leave
_recorded_events append and super().emit intact), so internal events are
collected consistently with maybe_collect and the _include_internal_events flag.
♻️ Duplicate comments (1)
livekit-agents/livekit/agents/voice/report.py (1)
50-80: GenerationCreatedEvent will cause serialization failure.
The fallback asdict(e) on line 80 will be invoked for GenerationCreatedEvent, but this event contains AsyncIterable fields (message_stream, function_stream) that aren't serializable. This causes a TypeError when the report is generated.
🔧 Proposed fix
+from ..llm import GenerationCreatedEvent
 ...
 for e in self.internal_events:
     if isinstance(e, BaseModel):
         internal_events_dict.append(e.model_dump())
+    elif isinstance(e, GenerationCreatedEvent):
+        internal_events_dict.append({
+            "type": e.type,
+            "user_initiated": e.user_initiated,
+            "response_id": e.response_id,
+        })
     elif isinstance(e, SynthesizedAudio):
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
livekit-agents/livekit/agents/llm/__init__.py
livekit-agents/livekit/agents/llm/realtime.py
livekit-agents/livekit/agents/types.py
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/events.py
livekit-agents/livekit/agents/voice/report.py
🧬 Code graph analysis (4)
livekit-agents/livekit/agents/voice/report.py (5)
livekit-agents/livekit/agents/voice/agent_session.py (5)
llm(1285-1286), tts(1289-1290), vad(1293-1294), AgentSessionOptions(78-93), options(408-409)
livekit-agents/livekit/agents/llm/llm.py (2)
ChatChunk(69-72), LLMOutputEvent(76-83)
livekit-agents/livekit/agents/llm/chat_context.py (2)
ChatContext(218-656), to_dict(402-441)
livekit-agents/livekit/agents/tts/tts.py (1)
SynthesizedAudio(33-44)
livekit-agents/livekit/agents/types.py (2)
TimedString(95-126), to_dict(119-126)
livekit-agents/livekit/agents/llm/realtime.py (1)
livekit-agents/livekit/agents/llm/chat_context.py (1)
FunctionCall(179-192)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
AudioOutput(142-286), audio(431-432), audio(435-449), audio(555-556), audio(559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
event(584-585), event(864-865), event(1001-1002), event(1011-1012), event(1021-1022), RunResult(71-229), done(141-143)
livekit-agents/livekit/agents/voice/events.py (7)
livekit-agents/livekit/agents/llm/llm.py (2)
LLMOutputEvent(76-83), LLM(98-162)
livekit-agents/livekit/agents/stt/stt.py (1)
SpeechEvent(70-74)
livekit-agents/livekit/agents/tts/tts.py (1)
SynthesizedAudio(33-44)
livekit-agents/livekit/agents/types.py (1)
FlushSentinel(38-39)
livekit-agents/livekit/agents/vad.py (1)
VADEvent(26-68)
livekit-agents/livekit/agents/voice/io.py (2)
PlaybackFinishedEvent(119-127), PlaybackStartedEvent(131-134)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
TextInputEvent(32-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
- GitHub Check: livekit-plugins-cartesia
- GitHub Check: livekit-plugins-openai
- GitHub Check: livekit-plugins-deepgram
- GitHub Check: unit-tests
- GitHub Check: type-check (3.9)
- GitHub Check: type-check (3.13)
🔇 Additional comments (7)
livekit-agents/livekit/agents/llm/__init__.py (1)
23-23: LGTM!
The LLMOutputEvent is properly imported from the .llm module and correctly added to the __all__ exports list, making it publicly accessible from the llm package.
Also applies to: 102-102
livekit-agents/livekit/agents/types.py (1)
37-39: LGTM!
The FlushSentinel dataclass with the type discriminator field enables proper participation in the InternalEvent discriminated union.
livekit-agents/livekit/agents/llm/realtime.py (1)
19-36: LGTM!
The type discriminator fields added to InputSpeechStartedEvent, InputSpeechStoppedEvent, MessageGeneration, and InputTranscriptionCompleted correctly enable these events to participate in the discriminated InternalEvent union.
Also applies to: 126-134
livekit-agents/livekit/agents/voice/agent_session.py (1)
643-646: LGTM!
The audio output event listener lifecycle management is well-implemented:
- Listeners are attached when audio output is set during start
- Properly detached during close and when audio output changes
- Re-attached when a new audio output is configured
Also applies to: 805-808, 1324-1342
livekit-agents/livekit/agents/voice/report.py (1)
108-115: LGTM!
The _serialize_audio_frame helper correctly serializes AudioFrame to a dictionary with base64-encoded audio data.
livekit-agents/livekit/agents/voice/events.py (2)
257-275: LGTM!
The InternalEvent discriminated union is well-structured. All member types have the required type discriminator field. Pydantic handles the nested AgentEvent union by flattening it during discrimination.
216-221: LGTM!
Adding Field(..., exclude=True) to ErrorEvent.source is appropriate since the field contains non-serializable model instances (LLM, STT, TTS, RealtimeModel).
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
livekit-agents/livekit/agents/llm/realtime.py (1)
126-134: Align InputTranscriptionCompleted.type with EventTypes.
EventTypes includes "input_audio_transcription_completed", but the new discriminator is "input_transcription_completed". This mismatch can break consumers relying on the discriminator string.
🛠️ Proposed fix
- type: Literal["input_transcription_completed"] = "input_transcription_completed"
+ type: Literal["input_audio_transcription_completed"] = "input_audio_transcription_completed"
♻️ Duplicate comments (1)
livekit-agents/livekit/agents/voice/report.py (1)
40-89: Avoid asdict() on GenerationCreatedEvent AsyncIterables.
asdict() deep-copies fields before you clear message_stream/function_stream, so it can still raise on AsyncIterable values. Build the dict directly (or replace streams before conversion) to avoid runtime failures when internal events are enabled.
🔧 Proposed fix
- elif isinstance(e, GenerationCreatedEvent):
-     data = asdict(e)
-     data["message_stream"] = []
-     data["function_stream"] = []
-     internal_events_dict.append(data)
-     continue
+ elif isinstance(e, GenerationCreatedEvent):
+     internal_events_dict.append(
+         {
+             "type": e.type,
+             "user_initiated": e.user_initiated,
+             "response_id": e.response_id,
+             "message_stream": [],
+             "function_stream": [],
+         }
+     )
+     continue
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
livekit-agents/livekit/agents/llm/realtime.py
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/report.py
🧬 Code graph analysis (2)
livekit-agents/livekit/agents/voice/agent_session.py (2)
livekit-agents/livekit/agents/voice/io.py (5)
AudioOutput(142-286), audio(431-432), audio(435-449), audio(555-556), audio(559-573)
livekit-agents/livekit/agents/voice/run_result.py (7)
event(584-585), event(864-865), event(1001-1002), event(1011-1012), event(1021-1022), RunResult(71-229), done(141-143)
livekit-agents/livekit/agents/voice/report.py (6)
livekit-agents/livekit/agents/voice/agent_session.py (4)
llm(1286-1287), tts(1290-1291), vad(1294-1295), AgentSessionOptions(78-93)
livekit-agents/livekit/agents/llm/llm.py (2)
ChatChunk(69-72), LLMOutputEvent(76-83)
livekit-agents/livekit/agents/llm/chat_context.py (2)
ChatContext(218-656), to_dict(402-441)
livekit-agents/livekit/agents/llm/realtime.py (1)
GenerationCreatedEvent(40-47)
livekit-agents/livekit/agents/tts/tts.py (1)
SynthesizedAudio(33-44)
livekit-agents/livekit/agents/types.py (2)
TimedString(95-128), to_dict(119-128)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: livekit-plugins-deepgram
- GitHub Check: livekit-plugins-cartesia
- GitHub Check: unit-tests
- GitHub Check: type-check (3.13)
🔇 Additional comments (12)
livekit-agents/livekit/agents/llm/realtime.py (4)
19-22: LGTM: discriminator added for InputSpeechStartedEvent.
24-28: LGTM: discriminator added for InputSpeechStoppedEvent.
30-36: LGTM: discriminator added for MessageGeneration.
39-47: LGTM: discriminator added for GenerationCreatedEvent.
livekit-agents/livekit/agents/voice/report.py (3)
3-18: LGTM: imports updated for internal event serialization.
20-38: LGTM: SessionReport fields extended for internal events and timestamps.
90-123: LGTM: internal_events included in output and audio frames serialized.
livekit-agents/livekit/agents/voice/agent_session.py (5)
44-56: LGTM: internal event tracking state wired into session.
Also applies to: 328-365
370-379: LGTM: emit() and maybe_collect() respect include_internal_events.
443-449: LGTM: RunResult now carries agent_session for event collection.
Also applies to: 654-655
452-542: LGTM: include_internal_events surfaced in start() and stored.
644-648: LGTM: playback listeners are attached/detached cleanly.
Also applies to: 806-809, 1325-1344
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@livekit-agents/livekit/agents/voice/events.py`:
- Around line 257-275: The Field(discriminator="type") annotation on the
InternalEvent Annotated union is misleading because discriminated-union
validation via Pydantic is not used (events are serialized manually in report.py
with isinstance()/asdict()); remove the discriminator annotation from
InternalEvent (i.e., delete the Annotated wrapper using
Field(discriminator="type")) or, if you prefer to keep it, add a clear code
comment next to InternalEvent stating that Pydantic discrimination is not relied
upon and that VADEvent and SpeechEvent use Enum vs. Literals—either remove the
ineffective Field(discriminator="type") or document its non-usage so future
maintainers aren’t misled.
🧹 Nitpick comments (4)
livekit-agents/livekit/agents/voice/report.py (2)
60-70: Add fallback handling for unexpected `LLMOutputEvent.data` types.
The current logic handles `AudioFrame`, `str`, `TimedString`, and `ChatChunk`, but if `e.data` is an unexpected type, the original `asdict(e)` result is kept, which may contain non-serializable data.
♻️ Proposed fix
             elif isinstance(e, LLMOutputEvent):
                 data = asdict(e)
                 if isinstance(e.data, AudioFrame):
                     data["data"] = self._serialize_audio_frame(e.data)
                 elif isinstance(e.data, str):
                     data["data"] = e.data
                 elif isinstance(e.data, TimedString):
                     data["data"] = e.data.to_dict()
                 elif isinstance(e.data, ChatChunk):
                     data["data"] = e.data.model_dump(mode="json")
+                else:
+                    logger.warning(f"Unknown LLMOutputEvent.data type: {type(e.data)}")
+                    data["data"] = str(e.data)
                 internal_events_dict.append({**data, "__created_at": created_at})
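The fallback idea above can be sketched in isolation. This is a minimal, stdlib-only illustration and not the code in report.py: `FakeAudioFrame`, `FakeOutputEvent`, `serialize_payload`, and `serialize_event` are hypothetical names invented for the example, and the base64 layout is an assumption about what a JSON-safe audio payload could look like.

```python
import base64
import json
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeAudioFrame:
    """Hypothetical stand-in for an audio frame (not the real AudioFrame)."""

    data: bytes
    sample_rate: int
    num_channels: int


@dataclass
class FakeOutputEvent:
    """Hypothetical stand-in for an event whose payload is loosely typed."""

    type: str
    data: Any


def serialize_payload(payload: Any) -> Any:
    """Convert a payload into something json.dumps can handle."""
    if isinstance(payload, FakeAudioFrame):
        # Raw bytes are not JSON-serializable, so encode them as base64 text.
        return {
            "sample_rate": payload.sample_rate,
            "num_channels": payload.num_channels,
            "data": base64.b64encode(payload.data).decode("utf-8"),
        }
    if isinstance(payload, str):
        return payload
    # Fallback branch: an unexpected type is stringified instead of being
    # passed through, so serialization never fails on it.
    return str(payload)


def serialize_event(event: FakeOutputEvent) -> Dict[str, Any]:
    return {"type": event.type, "data": serialize_payload(event.data)}


if __name__ == "__main__":
    frame = FakeAudioFrame(data=b"\x00\x01", sample_rate=16000, num_channels=1)
    print(json.dumps(serialize_event(FakeOutputEvent("llm_output", frame))))
    print(json.dumps(serialize_event(FakeOutputEvent("llm_output", object()))))
```

Without the final `return str(payload)`, the second call would hand a bare `object()` to `json.dumps` and raise `TypeError`, which is the failure mode the comment is warning about.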
94-99: Consider conditionally including `internal_events` in output.
The `internal_events` key is always included in the output dict, even when `include_internal_events` is `False` (resulting in an empty list). This is fine for API consistency, but if you want to reduce payload size when internal events are disabled:
♻️ Optional change to conditionally include internal_events
+        result = {
             "job_id": self.job_id,
             "room_id": self.room_id,
             "room": self.room,
             "events": events_dict,
-            "internal_events": internal_events_dict,
             # ... rest of fields
+        }
+        if self.include_internal_events:
+            result["internal_events"] = internal_events_dict
+        return result
livekit-agents/livekit/agents/voice/agent_session.py (2)
645-648: Minor optimization opportunity: attach playback listeners only when internal events are enabled.
Playback listeners are attached unconditionally, but `maybe_collect()` only records events when `_include_internal_events` is `True`. Consider guarding listener attachment:
♻️ Optional optimization
-        if self._output.audio:
+        if self._output.audio and self._include_internal_events:
             self._prev_audio_output = self._output.audio
             self._prev_audio_output.on("playback_started", self.maybe_collect)
             self._prev_audio_output.on("playback_finished", self.maybe_collect)
Note: You'd also need to adjust `_on_audio_output_changed()` accordingly.
1326-1344: Clear `_prev_audio_output` when audio output is set to `None`.
If `output.audio` becomes `None`, the current code detaches listeners but doesn't clear `_prev_audio_output`. While calling `.off()` on an already-detached emitter is safe, explicitly clearing the reference improves clarity and prevents holding stale references.
♻️ Proposed fix
     def _on_audio_output_changed(self) -> None:
         if self._prev_audio_output is not None:
             self._prev_audio_output.off("playback_started", self.maybe_collect)
             self._prev_audio_output.off("playback_finished", self.maybe_collect)
+            self._prev_audio_output = None
         if (
             self._started
             and self._opts.resume_false_interruption
             and (audio_output := self.output.audio)
             and not audio_output.can_pause
         ):
             logger.warning(
                 "resume_false_interruption is enabled, but the audio output does not support pause, ignored",
                 extra={"audio_output": audio_output.label},
             )
         if self.output.audio is not None:
             self._prev_audio_output = self.output.audio
             self._prev_audio_output.on("playback_started", self.maybe_collect)
             self._prev_audio_output.on("playback_finished", self.maybe_collect)
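The detach, clear, then re-attach pattern in the proposed fix can be illustrated with a small stdlib-only sketch. `TinyEmitter` and `Collector` are hypothetical classes made up for this example; they only mimic the `on()`/`off()` shape that the review text mentions and are not the actual audio output or session types.

```python
from typing import Any, Callable, Dict, List, Optional

Handler = Callable[[Any], None]


class TinyEmitter:
    """Hypothetical minimal event emitter standing in for an audio output."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def on(self, name: str, handler: Handler) -> None:
        self._handlers.setdefault(name, []).append(handler)

    def off(self, name: str, handler: Handler) -> None:
        handlers = self._handlers.get(name, [])
        if handler in handlers:  # removing an absent handler is a no-op
            handlers.remove(handler)

    def emit(self, name: str, payload: Any) -> None:
        for handler in list(self._handlers.get(name, [])):
            handler(payload)


class Collector:
    """Hypothetical owner that tracks which output its listener is attached to."""

    def __init__(self) -> None:
        self.collected: List[Any] = []
        self._prev_output: Optional[TinyEmitter] = None

    def maybe_collect(self, event: Any) -> None:
        self.collected.append(event)

    def set_output(self, output: Optional[TinyEmitter]) -> None:
        # Detach from the previous output and drop the reference first, so a
        # None output never leaves a stale emitter behind.
        if self._prev_output is not None:
            self._prev_output.off("playback_started", self.maybe_collect)
            self._prev_output = None
        if output is not None:
            self._prev_output = output
            output.on("playback_started", self.maybe_collect)


if __name__ == "__main__":
    old, new = TinyEmitter(), TinyEmitter()
    collector = Collector()
    collector.set_output(old)
    collector.set_output(new)
    old.emit("playback_started", "from old")  # ignored: listener was detached
    new.emit("playback_started", "from new")
    print(collector.collected)
```

Clearing the reference inside the detach branch keeps the invariant simple: `_prev_output` is either `None` or the emitter that currently holds the listener.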
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/events.py
livekit-agents/livekit/agents/voice/report.py
🧰 Additional context used
📓 Path-based instructions (1)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Format code with ruff
Run ruff linter and auto-fix issues
Run mypy type checker in strict mode
Maintain line length of 100 characters maximum
Ensure Python 3.9+ compatibility
Use Google-style docstrings
Files:
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/report.py
livekit-agents/livekit/agents/voice/events.py
🧠 Learnings (1)
📚 Learning: 2026-01-22T03:28:16.289Z
Learnt from: longcw
Repo: livekit/agents PR: 4563
File: livekit-agents/livekit/agents/beta/tools/end_call.py:65-65
Timestamp: 2026-01-22T03:28:16.289Z
Learning: In code paths that check capabilities or behavior of the LLM processing the current interaction, prefer using the activity's LLM obtained via ctx.session.current_agent._get_activity_or_raise().llm instead of ctx.session.llm. The session-level LLM may be a fallback and not reflect the actual agent handling the interaction. Use the activity LLM to determine capabilities and to make capability checks or feature toggles relevant to the current processing agent.
Applied to files:
livekit-agents/livekit/agents/voice/agent_session.py
livekit-agents/livekit/agents/voice/report.py
livekit-agents/livekit/agents/voice/events.py
🧬 Code graph analysis (3)
livekit-agents/livekit/agents/voice/agent_session.py (1)
livekit-agents/livekit/agents/voice/run_result.py (6)
event (584-585), event (864-865), event (1001-1002), event (1011-1012), event (1021-1022), RunResult (71-229)
livekit-agents/livekit/agents/voice/report.py (4)
livekit-agents/livekit/agents/llm/chat_context.py (1)
to_dict (402-441)
livekit-agents/livekit/agents/llm/realtime.py (1)
GenerationCreatedEvent (40-47)
livekit-agents/livekit/agents/types.py (2)
TimedString (95-128), to_dict (119-128)
livekit-agents/livekit/agents/vad.py (2)
VADEvent (26-68), VADEventType (19-22)
livekit-agents/livekit/agents/voice/events.py (3)
livekit-agents/livekit/agents/stt/stt.py (1)
SpeechEvent (70-74)
livekit-agents/livekit/agents/vad.py (1)
VADEvent (26-68)
livekit-agents/livekit/agents/voice/room_io/types.py (1)
TextInputEvent (32-36)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unit-tests
- GitHub Check: livekit-plugins-inworld
- GitHub Check: type-check (3.9)
- GitHub Check: livekit-plugins-deepgram
🔇 Additional comments (16)
livekit-agents/livekit/agents/voice/events.py (4)
5-5: LGTM! The addition of `Annotated`, `TypeAlias`, and `Union` imports is appropriate for defining the discriminated union `InternalEvent` type.
10-34: LGTM! The imports are well-organized and all are used in constructing the `InternalEvent` union type.
220-220: LGTM! Adding `exclude=True` to the `source` field is appropriate since it contains non-serializable model instances.
277-277: LGTM! The `TimedInternalEvent` type alias cleanly represents a timestamped event tuple, aligning with how events are collected in `AgentSession.emit()` and `maybe_collect()`.
livekit-agents/livekit/agents/voice/report.py (5)
3-18: LGTM! The imports are well-organized and necessary for the internal event serialization functionality.
22-39: LGTM! The new fields `include_internal_events` and `internal_events` are properly typed and documented. The timestamp fields provide good observability for session lifecycle tracking.
71-78: LGTM! VADEvent handling correctly skips noisy `INFERENCE_DONE` events and properly clears frames with an empty list `[]` (preserving the expected list type).
79-88: LGTM! `GenerationCreatedEvent` handling correctly strips the non-serializable `AsyncIterable` streams while preserving essential metadata.
120-127: LGTM! The `_serialize_audio_frame` helper cleanly encodes audio frame data with base64, preserving all necessary metadata for reconstruction.
livekit-agents/livekit/agents/voice/agent_session.py (7)
52-53: LGTM! The imports for `InternalEvent` and `TimedInternalEvent` are properly added for internal event collection.
329-329: LGTM! The `_prev_audio_output` field is properly added for tracking the audio output to manage playback event listeners.
363-365: LGTM! The internal event recording state is properly initialized with correct types.
371-380: LGTM! The `emit()` method now correctly guards internal event collection with `_include_internal_events`, addressing the previous memory growth concern. The `maybe_collect()` method provides a clean interface for collecting non-`AgentEvent` internal events.
448-448: LGTM! `RunResult` now receives `agent_session` reference for internal event collection delegation.
453-495: LGTM! The `include_internal_events` parameter is properly added to all overload signatures and the main implementation with appropriate documentation.
807-810: LGTM! Playback listeners are properly detached during session close, preventing memory leaks and stale callbacks.
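The guarded collection described for `emit()` and `maybe_collect()` above can be pictured with a short sketch. This is an assumption-laden illustration, not the session's actual code: `Recorder` is a hypothetical class, and the `(created_at, event)` ordering of `TimedEvent` is a guess at what a timestamped event tuple might look like.

```python
import time
from typing import Any, List, Tuple

# Hypothetical alias: a timestamp paired with the event it belongs to.
TimedEvent = Tuple[float, Any]


class Recorder:
    """Hypothetical recorder that stores events only when opted in."""

    def __init__(self, include_internal_events: bool = False) -> None:
        self._include_internal_events = include_internal_events
        self.recorded: List[TimedEvent] = []

    def maybe_collect(self, event: Any) -> None:
        # The guard keeps the list empty when collection is disabled, so a
        # long-running session does not accumulate events it never reports.
        if self._include_internal_events:
            self.recorded.append((time.time(), event))


if __name__ == "__main__":
    disabled = Recorder()
    disabled.maybe_collect("playback_started")
    enabled = Recorder(include_internal_events=True)
    enabled.maybe_collect("playback_started")
    print(len(disabled.recorded), len(enabled.recorded))
```

Putting the check inside the collector, rather than at every call site, means callers can invoke it unconditionally and the opt-in flag is enforced in one place.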
InternalEvent = Annotated[
    Union[
        AgentEvent,
        VADEvent,
        RunEvent,
        SpeechEvent,
        InputSpeechStartedEvent,
        InputSpeechStoppedEvent,
        InputTranscriptionCompleted,
        GenerationCreatedEvent,
        PlaybackFinishedEvent,
        PlaybackStartedEvent,
        TextInputEvent,
        SynthesizedAudio,
        FlushSentinel,
        LLMOutputEvent,
    ],
    Field(discriminator="type"),
]
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find the file and get context
echo "=== Locating events.py ==="
find . -name "events.py" -path "*/voice/*" | head -5
# Check VADEvent definition
echo -e "\n=== VADEvent definition ==="
rg -n "class VADEvent" -A 15 --type py
# Check SpeechEvent definition
echo -e "\n=== SpeechEvent definition ==="
rg -n "class SpeechEvent" -A 15 --type py
# Check FlushSentinel definition
echo -e "\n=== FlushSentinel definition ==="
rg -n "class FlushSentinel" -A 10 --type py
# Check TextInputEvent for comparison
echo -e "\n=== TextInputEvent definition ==="
rg -n "class TextInputEvent" -A 10 --type py
# Check if these classes use `@dataclass` or inherit from BaseModel
echo -e "\n=== Checking class decorators/inheritance ==="
rg -n "@dataclass|@pydantic|BaseModel" -B 1 --type py | grep -A 1 "VADEvent\|SpeechEvent\|TextInputEvent\|FlushSentinel"
Repository: livekit/agents
Length of output: 6299
🏁 Script executed:
#!/bin/bash
# Check the complete InternalEvent definition
echo "=== Complete InternalEvent definition ==="
rg -n "InternalEvent = " -A 20 --type py
# Check if InternalEvent is used with Pydantic validation
echo -e "\n=== Usage of InternalEvent with Pydantic ==="
rg "InternalEvent" --type py | grep -i "parse\|validate\|model_" | head -10
# Check report.py for serialization handling
echo -e "\n=== Check report.py ==="
find . -name "report.py" -path "*/voice/*" -exec head -50 {} \;
# Check all `@dataclass` decorators on union members
echo -e "\n=== Checking all event classes for `@dataclass` ==="
for event in "AgentEvent" "RunEvent" "InputSpeechStartedEvent" "InputSpeechStoppedEvent" "InputTranscriptionCompleted" "GenerationCreatedEvent" "PlaybackFinishedEvent" "PlaybackStartedEvent" "SynthesizedAudio" "LLMOutputEvent"; do
  echo "--- $event ---"
  rg -n "class $event" -B 2 -A 8 --type py | head -12
done
Repository: livekit/agents
Length of output: 10348
🏁 Script executed:
#!/bin/bash
# Check AgentEvent and RunEvent
echo "=== AgentEvent definition ==="
rg -n "class AgentEvent" -A 15 --type py
echo -e "\n=== RunEvent definition ==="
rg -n "class RunEvent" -A 15 --type py
# Check SynthesizedAudio for type field
echo -e "\n=== SynthesizedAudio complete definition ==="
rg -n "class SynthesizedAudio" -A 20 --type py
# Check how internal_events are serialized in report.py
echo -e "\n=== report.py serialization of internal_events ==="
find . -name "report.py" -path "*/voice/*" -exec cat {} \; | grep -A 30 "internal_events"
Repository: livekit/agents
Length of output: 7029
Discriminator annotation is ineffective and misleading for this union.
The InternalEvent union includes Field(discriminator="type"), but this annotation is never used for Pydantic validation. InternalEvent is serialized manually in report.py with isinstance() checks and asdict()—Pydantic's discriminated union validation is not applied.
Additionally, only VADEvent and SpeechEvent use enum-typed type fields; the remaining 11+ members correctly use Literal types. Since discriminator validation isn't actually used, this annotation serves no purpose and may mislead future maintainers into thinking Pydantic validation is active.
Consider removing the discriminator annotation if runtime validation isn't needed, or document its intention.
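To make the point concrete, here is a stdlib-only sketch of why metadata attached through `Annotated` has no effect on `isinstance()`-based serialization. Everything in it is hypothetical: `StartedEvent`, `StoppedEvent`, and `serialize` are invented for the example, and the plain string `"discriminator=type"` merely stands in for the `Field(discriminator="type")` marker so the snippet runs without Pydantic.

```python
from dataclasses import asdict, dataclass
from typing import Annotated, Any, Dict, Literal, Union, get_args


@dataclass
class StartedEvent:
    type: Literal["started"] = "started"


@dataclass
class StoppedEvent:
    duration: float = 0.0
    type: Literal["stopped"] = "stopped"


# The second argument is inert metadata. It only matters to a tool that
# inspects it, such as a validation library; isinstance() never sees it.
Event = Annotated[Union[StartedEvent, StoppedEvent], "discriminator=type"]


def serialize(event: Any) -> Dict[str, Any]:
    """Manual dispatch: the concrete class decides the output, not the tag."""
    if isinstance(event, (StartedEvent, StoppedEvent)):
        return asdict(event)
    raise TypeError(f"unsupported event type: {type(event).__name__}")


if __name__ == "__main__":
    union_type, metadata = get_args(Event)
    print(metadata)  # the marker is retrievable, but nothing above reads it
    print(serialize(StoppedEvent(duration=1.5)))
```

In this sketch, deleting the metadata from `Event` changes nothing about what `serialize` returns, which mirrors the concern raised about the annotation on `InternalEvent`.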
🤖 Prompt for AI Agents
In `@livekit-agents/livekit/agents/voice/events.py` around lines 257 - 275, The
Field(discriminator="type") annotation on the InternalEvent Annotated union is
misleading because discriminated-union validation via Pydantic is not used
(events are serialized manually in report.py with isinstance()/asdict()); remove
the discriminator annotation from InternalEvent (i.e., delete the Annotated
wrapper using Field(discriminator="type")) or, if you prefer to keep it, add a
clear code comment next to InternalEvent stating that Pydantic discrimination is
not relied upon and that VADEvent and SpeechEvent use Enum vs. Literals—either
remove the ineffective Field(discriminator="type") or document its non-usage so
future maintainers aren’t misled.
Summary by CodeRabbit
New Features
Refactor
✏️ Tip: You can customize this high-level summary in your review settings.