Skip to content

background-task lifecycle v2: terminal detection for Bash-bg/Agent-bg/ScheduleWakeup/RemoteTrigger + child-PID cleanup (#374 follow-up) #573

Description

Summary

Follow-up to #374. v0.35.4 Batch 1 shipped the Monitor-only subset of the
background-task lifecycle fix: _clear_background_handle gained an is_terminal gate +
_is_terminal_tool_result, so a streaming Monitor no longer drops the
stall_monitor_active_suppressed branch on its first interim tool_result (terminal =
is_error OR the Monitor's timeout_ms deadline; the result text is arbitrary command
stdout so it is deliberately NOT scanned for "completed"/"done" markers).

This issue tracks the general (v2) lifecycle work that was deliberately deferred.

Why it was deferred (the inverse-hang risk)

Making tool_result events non-terminal for Bash(run_in_background), Agent(run_in_background),
ScheduleWakeup, and RemoteTrigger — without a reliable alternative clear path — would
re-introduce the opposite failure: a handle that never clears keeps has_live_background_work()
True forever, so the post-result idle watchdog (#333/#507) never closes stdin → permanent wedge.
A Monitor is safe to defer because its live_monitors deadline ages the handle out; the others
are sets/no-deadline (or fire-later), so they need explicit terminal detection first.

Scope

  • KillShell observer → clear the matching live_bg_bashes entry (explicit terminal for bg-Bash).
  • Subprocess-exit detection → clear bg-Bash / bg-Agent handles when the child actually exits
    (the only true terminal signal for a backgrounded process — there is no "done" tool_result).
  • Deadline-expiry sweep so ScheduleWakeup / RemoteTrigger / unknown-deadline handles age out
    without relying on a tool_result.
  • Orphaned child-PID cleanup on session end — SIGTERM Bash-spawned children (or cgroup-walk
    non-MCP descendants) when stdin closes. Evidence: the 2026-05-03 comment on background-task lifecycle: clear handle on terminal signal, not first tool_result #374 (a bulk-reembed
    poll loop left 2 zsh shells alive 1h49m after the job finished). OWASP ASI08 (excessive
    autonomy) / ASI10 (rogue-agent cleanup).
  • Regression test: a bg-Bash that finishes naturally (not killed) clears its handle and does
    NOT wedge the post-result idle watchdog.

References

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions