
verify and document /monitor end-to-end via Untether Loop mode (no new code expected) #529

Description

Context

While planning multi-host fleet monitoring (see docs/plans/2026-05-13-fleet-monitoring-and-upgrades.md), the question came up: can the /monitor Claude-Code slash command — which uses /loop to fire recurring audit passes — actually run end-to-end via Untether (Telegram → stream-json subprocess), or is it CLI-only?

Initial research suggested CLI-only because the subprocess exits between passes, and ScheduleWakeup is documented as a no-op outside /loop dynamic mode. That conclusion was wrong — it ignored the fact that v0.35.3 already shipped a comprehensive solution for exactly this case via #289, #507, #481, and #470. The current best understanding is that /monitor probably already works via Untether provided Loop mode is enabled for the chat, but it has never been tested end-to-end and isn't called out as a worked example anywhere in the docs.

This issue is to verify the behaviour, document it, and fix any gaps.

What v0.35.3 already provides (foundation)

Scope of this issue

Verify + document. No new code unless a bug surfaces.

Hypothesis: with Loop mode enabled for the chat, typing `/monitor untether-staging 30m 5m` in Telegram should trigger the following sequence:

  1. Pass 1 runs inline in the first Claude Code subprocess.
  2. The monitor command's `Skill(skill="loop", args="5m Read .../loop-prompt.md ...")` invocation is detected by the loop-scheduler observer (the observer parses upstream `CronCreate` / `ScheduleWakeup` tool events from stream-json — the `/loop` skill itself ultimately fires one of those primitives under the hood).
  3. Untether persists the loop to `active_loops.json`, restart-resilient.
  4. Untether re-fires `/monitor untether-staging` with the existing run-id every 5 minutes for 30 minutes total.
  5. Each re-fire reads the existing state dir, runs the next pass, writes findings + audit-log + GitHub issue updates.
  6. Window-close behaviour (the loop-prompt.tmpl's `if [ "$NOW" -ge "$END_TS" ]` guard) triggers the synthesis pass exactly once, then short-circuits subsequent fires via the `.synthesis-done` marker.
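
The expected timeline for steps 4 to 6 can be sketched as a small simulation. This is an illustration only, not the actual loop-prompt.tmpl logic: `parse_duration`, `decide` and `simulate` are hypothetical names, and the assumption is that the guard compares the fire time to the window end and uses a `.synthesis-done` file in the state dir as the once-only marker.

```python
import tempfile
from pathlib import Path


def parse_duration(text: str) -> int:
    """Convert a duration such as '30m', '5m', '2h' or '45s' into seconds."""
    units = {"s": 1, "m": 60, "h": 3600}
    return int(text[:-1]) * units[text[-1]]


def decide(now: int, end_ts: int, state_dir: Path) -> str:
    """Mirror the assumed guard: a normal pass before END_TS, one synthesis after."""
    marker = state_dir / ".synthesis-done"
    if now < end_ts:
        return "pass"
    if marker.exists():
        return "skip"  # synthesis already ran, so short-circuit
    marker.touch()
    return "synthesis"


def simulate(window: str, interval: str, extra_fires: int = 2) -> list[str]:
    """Return the outcome of each fire, including a few fires after window close."""
    step = parse_duration(interval)
    end_ts = parse_duration(window)  # the window starts at t=0 in this sketch
    fires = end_ts // step + 1 + extra_fires
    with tempfile.TemporaryDirectory() as tmp:
        return [decide(i * step, end_ts, Path(tmp)) for i in range(fires)]


if __name__ == "__main__":
    print(simulate("30m", "5m"))
```

Under these assumptions a `30m 5m` run gives six ordinary passes (t=0 through t=25 min), one synthesis pass at t=30 min, and silent skips for any later fire.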

The big unknown: whether the monitor command's specific invocation shape gets observed correctly. The observer in #289 looks for `CronCreate` / `ScheduleWakeup` tool calls; `/monitor` calls the loop skill, which presumably ends up firing one of those. If the wiring is right, this Just Works. If not, the loop-scheduler observer needs a small extension to recognise the skill-driven path.
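
A rough way to check this against a captured stream-json transcript is sketched below. It is not the #289 observer: `tool_calls` and `classify` are hypothetical helpers, and the event shape (an `assistant` event whose `message.content` holds `tool_use` blocks with `name` and `input`) is an assumption to confirm against real subprocess output.

```python
import json

# Tool names the observer is described as watching for.
SCHEDULING_TOOLS = {"CronCreate", "ScheduleWakeup"}


def tool_calls(stream_lines):
    """Yield (tool_name, tool_input) for each tool_use block in the stream."""
    for line in stream_lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore blank or non-JSON lines
        if not isinstance(event, dict) or event.get("type") != "assistant":
            continue
        message = event.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                yield block.get("name"), block.get("input") or {}


def classify(stream_lines):
    """Report whether a scheduling primitive, or only the loop skill, was seen."""
    calls = list(tool_calls(stream_lines))
    if any(name in SCHEDULING_TOOLS for name, _ in calls):
        return "observed"
    if any(name == "Skill" and args.get("skill") == "loop" for name, args in calls):
        return "skill-only"
    return "none"
```

A result of `"skill-only"` for the pass 1 transcript would point to the case where the observer needs the small extension; `"observed"` would suggest the existing wiring is enough.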

Tasks

  • Enable Loop mode for a test chat: `/config → 🔁 Loop mode → on`. Verify the warning message about cost+quota appears.
  • In the same chat, fire `/monitor untether-staging 30m 5m` (short window, short interval — easier to observe). Watch logs.
  • Confirm pass 1 runs and writes to `~/.local/state/monitor/untether-staging//audit-log.md`.
  • Confirm `~/.untether/active_loops.json` gains an entry for this loop after pass 1 completes.
  • Wait ~5 min, confirm pass 2 fires automatically via Untether (no Telegram input needed). Verify the structlog event chain: `claude.loop.observed` (or whatever event name #289 emits) → `claude.loop.fired` → `handle.incoming` with `Loop iteration 2: …` prefix → pass 2 audit-log entry.
  • Confirm subsequent passes (3, 4, 5, 6) fire on schedule.
  • At window-close (30 min in), confirm synthesis pass fires exactly once. Subsequent loop fires should short-circuit silently.
  • Drain-on-restart test: mid-window, run `systemctl --user restart untether-dev`. Verify the loop survives restart (`active_loops.json` reloads, fires resume from where they left off).
  • Drain-on-cancel test: `/cancel` from Telegram. Verify the loop is removed from `active_loops.json` and no further fires happen.
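
The event-chain check in the tasks above could be scripted instead of eyeballed. The sketch below assumes the logs are available as one JSON object per line with the event name under an `event` key (structlog's default key when rendering JSON). The event names are the tentative ones from the task list, and `chain_in_order` is a hypothetical helper.

```python
import json

# Tentative names from the task list; adjust to whatever #289 actually emits.
EXPECTED_CHAIN = ["claude.loop.observed", "claude.loop.fired", "handle.incoming"]


def events(log_lines):
    """Yield the 'event' field of every JSON log line that has one."""
    for line in log_lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip non-JSON lines such as tracebacks
        if isinstance(record, dict) and "event" in record:
            yield record["event"]


def chain_in_order(log_lines, expected=EXPECTED_CHAIN):
    """True if the expected events occur in order (other events may interleave)."""
    remaining = events(log_lines)
    # Each 'in' test consumes the generator up to the match, which enforces order.
    return all(want in remaining for want in expected)
```

Running this over the log lines captured between pass 1 and pass 2 gives a yes/no answer for that task. It checks ordering only, not timing or the `Loop iteration 2: …` prefix.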

Documentation acceptance

Out of scope

  • Fleet meta-target (`/monitor untether-fleet`) — that's tracked separately in the fleet rollout plan (docs/plans/2026-05-13-fleet-monitoring-and-upgrades.md). Once this issue confirms single-host `/monitor` works via Untether, the fleet variant is a thin extension.
  • Loop mode default = ON — keep it opt-in; cost surface argues for the deliberate user gesture.
  • Auto-enable Loop mode when `/monitor` is invoked: too magical. Better to surface a clear error message, e.g. "Loop mode is OFF for this chat; `/monitor` needs it on to continue past pass 1. Toggle via `/config → 🔁 Loop mode`.", and let the user opt in deliberately.

Cross-links

Risk

Low. The infrastructure is already shipped and well-tested (58 tests for #289 alone). The most likely failure mode is a small wiring mismatch between how `/monitor` invokes `/loop` and what the observer expects — fixable in a follow-up PR if it surfaces. Worst case, the issue gets closed as "works as expected" with just doc updates.
