docs: link Untether blog posts in README footer #534

Open
Nathan Schram (nathanschram) wants to merge 39 commits into master from feature/readme-blog-links

@nathanschram

Summary

  • Add "From the blog" section between Acknowledgements and Licence
  • Links two littlebearapps.com posts directly relevant to Untether:
    • Coding from the park — why Untether exists
    • Dogfooding bugs tests can't find — integration testing rationale
  • Uses <a target="_blank" rel="noopener noreferrer"> so links open in a new tab on GitHub (PyPI's readme_renderer may strip target — acceptable)

Minor SEO/referral win for littlebearapps.com — README is mirrored on PyPI (dofollow there), drives referral traffic, and increases AI-citation surface.

Test plan

  • Confirm both URLs return 200 (verified at commit time)
  • Confirm rendered README on GitHub opens links in a new tab
  • Confirm PyPI package page still renders the section (target may be stripped, links should still resolve)

🤖 Generated with Claude Code

Enables `[claude] extra_args = ["--chrome"]` so Untether-spawned Claude
Code sessions can opt into the Claude-in-Chrome extension — previously
the `mcp__claude-in-chrome__*` tool namespace was absent from Untether
sessions because Claude Code 2.1.x gates it behind `--chrome` /
`CLAUDE_CODE_ENABLE_CFC=1`, and Untether never passed the flag.

Mirrors `codex.extra_args` and `pi.extra_args`. Flags Untether manages
internally (`-p`, `--print`, `--output-format`, `--input-format`,
`--resume`/`-r`, `--continue`/`-c`, `--permission-mode`,
`--permission-prompt-tool`) are rejected at config-load with a
`ConfigError` so duplicate-argv surprises fail fast. User args land on
argv after the managed stream-json prelude and before resume / model /
effort / allowed-tools / permission flags, preserving the trailing
`-p <prompt>` (or stdin prompt under permission-mode) position.
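
A minimal sketch of the reserved-flag check described above, assuming a standalone helper. `validate_extra_args`, `RESERVED_FLAGS` and the `ConfigError` stand-in are hypothetical names for illustration; the real logic lives in `build_runner` and may differ.

```python
# Flags the commit message lists as managed internally by Untether.
RESERVED_FLAGS = frozenset({
    "-p", "--print", "--output-format", "--input-format",
    "--resume", "-r", "--continue", "-c",
    "--permission-mode", "--permission-prompt-tool",
})


class ConfigError(ValueError):
    """Stand-in for Untether's ConfigError (hypothetical definition)."""


def validate_extra_args(extra_args):
    """Return a copy of extra_args, or raise if any entry is a managed flag."""
    for arg in extra_args:
        # Compare only the flag name so `--resume=abc` is rejected too.
        flag = arg.split("=", 1)[0]
        if flag in RESERVED_FLAGS:
            raise ConfigError(f"claude.extra_args: {flag!r} is managed by Untether")
    return list(extra_args)
```

Under these assumptions `validate_extra_args(["--chrome"])` passes through unchanged, while `["--resume=abc"]` fails at config-load time.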

- src/untether/runners/claude.py: add `extra_args` field, thread
  through `build_args`, parse + validate in `build_runner`
- tests/test_build_args.py: +8 tests (argv ordering, permission-mode
  argv, multi-flag order, build_runner parsing, reserved-flag rejection
  for individual flags and `key=value` prefixes)
- docs/reference/config.md, docs/reference/runners/claude/runner.md:
  document the new key, including reserved-flag list
- CHANGELOG.md: v0.35.3 (unreleased) entry

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore: staging 0.35.3rc1

Stage Claude extra_args (#407) for TestPyPI. This rc1 is the wheel the Mac
Untether instance will install to validate Claude-in-Chrome end-to-end per
docs/audits/2026-04-21-claude-in-chrome-test-plan.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* deps: bump lxml 6.0.2→6.1.0 and python-dotenv 1.2.1→1.2.2

pip-audit flagged two new transitive CVEs after PR #408 merged:
- lxml 6.0.2: CVE-2026-41066 (fix 6.1.0) — pulled via sulguk
- python-dotenv 1.2.1: CVE-2026-28684 (fix 1.2.2) — pulled via
  pydantic-settings

Both have clean fixes. Lockfile-only change; pyproject.toml constraints
unchanged. Local pip-audit clean after bump.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(security): Group 1A hygiene — 8 issues

Bundles eight low-risk security hygiene fixes for v0.35.3:

- #205 — split runner.start log so prompt content stays at DEBUG
- #206 — flip AMP dangerously_allow_all default to False (opt-in only)
- #207 — Pi session dir created with mode 0o700 + chmod existing
- #208 — extend stderr sanitisation to /Users, /private/var, /tmp,
        /var, /opt, /srv, /etc, /usr/local, /app, /workspace, /root
- #211 — replace stat()+read_bytes() with capped streaming read in
        anyio worker thread; closes TOCTOU window on /file get
- #213 — add OPENAI_PROJECT_KEY_RE for sk-proj-... redaction (the
        underscore/hyphen char set is not covered by the generic
        sk- pattern)
- #402 — bump Pygments 2.19.2 → 2.20.0 via uv lock (CVE-2026-4539
        ReDoS, transitive)
- #403 — replace 123456789:ABCdef… placeholder bot tokens with
        <BOT_ID>:<BOT_TOKEN> in non-test paths (onboarding.py,
        install.md, llms-full.txt); test fixtures kept as-is for
        GitHub-UI dismissal
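
For #211, the capped streaming read can be sketched as below. This is a synchronous illustration only (the commit runs the read in an anyio worker thread); `read_capped` and `FileTooLarge` are hypothetical names, and the chunk size is an assumption.

```python
import io


class FileTooLarge(Exception):
    """Raised when the stream holds more than max_bytes (hypothetical name)."""


def read_capped(stream, max_bytes, chunk_size=64 * 1024):
    """Read a binary stream fully, refusing anything larger than max_bytes.

    The size is enforced while reading, so there is no separate stat()
    call whose answer could go stale before the read happens.
    """
    chunks = []
    total = 0
    while True:
        # Never request more than one byte past the cap.
        chunk = stream.read(min(chunk_size, max_bytes + 1 - total))
        if not chunk:
            break
        total += len(chunk)
        if total > max_bytes:
            raise FileTooLarge(f"file exceeds {max_bytes} bytes")
        chunks.append(chunk)
    return b"".join(chunks)
```

Because the cap is checked on the bytes actually read, a file that grows between a size check and the read cannot slip past the limit.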

All 2410 tests pass; ruff check + format clean; uv lock --check ok.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: silence bandit B108 false positive + ignore CVE-2026-3219

- bandit B108 fires on the new /tmp/ regex pattern in
  _PATH_PATTERNS at runner.py — regex for stderr redaction, not
  a hardcoded temp-file write. Suppressed with `# nosec B108`
  matching the existing render.py:111 pattern.

- pip-audit now flags pip 26.0.1 → CVE-2026-3219 (advisory
  published recently; no fix available upstream). Added to the
  --ignore-vuln list alongside CVE-2026-4539 (pygments — kept
  for posterity even though #402 lockfile bump fixed it).

No source/test code changes. CI-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)

`_daily_cost` is a module-level tuple updated via read-modify-write
in record_run_cost(). Concurrent finalize_run callers could both
read (today, X), both write (today, X + cost), and lose one run's
cost — letting a malicious or runaway concurrent workload defeat
the per-day budget gate.

Fix: wrap the RMW block in a `threading.Lock`. Critical section is
a single tuple assignment (sub-microsecond), so the lock is fine
under both async (cooperative) and threaded callers without an
async-signature ripple. get_daily_cost() also acquires the lock for
snapshot consistency.

Trade-off note: kept the function sync rather than pivoting to
`anyio.Lock` because that would require updating the 6 sync test
call sites and the 1 sync caller in runner_bridge.py — needless
churn for a sub-microsecond critical section.

Test: new ThreadPoolExecutor-driven fuzz test (16 workers, 200
calls) asserts the observed total equals n * unit_cost — would
fail under racing RMW.
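
A sketch of the locked read-modify-write and the fuzz check, assuming the tuple shape `(day, total)` described above. The day-rollover handling and the `run_fuzz` helper are illustrative assumptions, not the shipped code.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from datetime import date

_daily_cost = (date.today(), 0.0)
_daily_cost_lock = threading.Lock()


def record_run_cost(cost):
    """Add one run's cost to today's total under the lock."""
    global _daily_cost
    today = date.today()
    with _daily_cost_lock:
        day, total = _daily_cost
        if day != today:  # assumed rollover behaviour: new day starts at zero
            total = 0.0
        _daily_cost = (today, total + cost)


def get_daily_cost():
    """Return a consistent snapshot of today's total."""
    with _daily_cost_lock:
        day, total = _daily_cost
    return total if day == date.today() else 0.0


def run_fuzz(workers=16, calls=200, unit_cost=0.25):
    """Hammer record_run_cost from many threads and return the observed total."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(record_run_cost, [unit_cost] * calls))
    return get_daily_cost()
```

With the lock in place the total after `run_fuzz()` should equal `calls * unit_cost`; without it, two threads can read the same tuple and one increment is lost.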

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Brings the voice transcription API key into parity with `bot_token`
(closed #196): SecretStr masks the value in repr()/str()/tracebacks
and any accidental structlog serialisation. Access the raw value
via `.get_secret_value()` at the transport boundary.

Changes:
- `settings.py`: field type `NonEmptyStr | None` → `SecretStr | None`;
  new `_validate_voice_key_not_empty` validator preserves the prior
  no-empty-string contract by round-tripping `""`/whitespace to None
- `telegram/bridge.py`: `TelegramBridgeConfig.voice_transcription_api_key`
  annotation → `SecretStr | None`; `update_from()` unchanged (assigns
  SecretStr to SecretStr)
- `telegram/loop.py:2208`: sole unwrap point — call
  `.get_secret_value()` only when non-None before passing to
  `transcribe_voice` (OpenAI SDK still wants raw `str | None`)
- `telegram/voice.py`: unchanged; boundary stays at the loop caller

Tests:
- `test_settings.py`: new `test_voice_transcription_api_key_is_secret_str`
  (round-trip + repr/str masking), `_empty_string_normalised_to_none`
  (whitespace → None), `_default_none` (omitted → None)
- `test_bridge_config_reload.py`: hot-reload tests updated to use
  `.get_secret_value()` for value comparison
- `test_telegram_backend.py`: updated build_and_run assertion

All 2413 tests pass; ruff check + format clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bump rc1 → rc2 to publish a fresh staging wheel that includes:

- #431 — Group 1A security hygiene (8 issues: #205, #206, #207, #208,
        #211, #213, #402, #403)
- #432 (#379): daily cost tracker race (threading.Lock guard)
- #433 (#378): voice_transcription_api_key SecretStr

rc1 (b6c6ad6) only carried #407 (Claude extra_args). rc2 supersedes
it on TestPyPI.

No CHANGELOG entry — per release-discipline.md §"Staging / rc
versions", entries batch into the stable bump.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ult (#409) (#435)

Self-installed Untether users in heterogeneous environments need to
thread credential-manager tokens (1Password, Doppler, Vault, Infisical,
…) into engine subprocesses. Today the env allowlist is hard-coded in
`utils/env_policy.py` so adding a single var requires a fork + release.

Changes:
- `utils/env_policy.py`:
  - new `is_allowed_with_extras(name, extra_exact=, extra_prefix=)`
  - `filtered_env()` extended with `extra_prefix=` parameter
  - new `log_user_extensions_once()` — module-level latch emits one
    `env_policy.user_extension` INFO per process when user extras are
    active, so the operator sees the addition in journalctl
- `settings.py` `SecuritySettings`:
  - `env_extra_allow: list[str]` (default `[]`)
  - `env_extra_prefix_allow: list[str]` (default `[]`)
  - field validators reject empty/whitespace and enforce `[A-Z_][A-Z0-9_]*`
- `runners/claude.py`, `runners/pi.py`:
  - new `_load_env_extras()` helper (best-effort settings load — never
    blocks a run on a config error, mirrors the env_audit pattern)
  - threads extras through `filtered_env()` + `log_user_extensions_once()`
- `utils/env_audit.py` `audit_proc_env()`:
  - new `user_extra_exact=`/`user_extra_prefix=` params so user-allowed
    names aren't false-flagged as `claude.env_audit.leaked_var`
- Built-in defaults: `BWS_ACCESS_TOKEN` promoted into `_EXACT_ALLOW`
  (Bitwarden Secrets Manager — common enough to ship as a default).
- Docs: `docs/reference/config.md` `[security]` table, CLAUDE.md
  features list.
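
The allowlist extension above can be sketched roughly as follows. Only the function names, the `extra_exact=`/`extra_prefix=` keywords, the name pattern and the `BWS_ACCESS_TOKEN` default come from this commit; the other built-in entries and `validate_env_name` are illustrative assumptions.

```python
import re

_ENV_NAME_RE = re.compile(r"[A-Z_][A-Z0-9_]*")

# Illustrative subset only; the real built-in lists live in utils/env_policy.py.
_EXACT_ALLOW = frozenset({"PATH", "HOME", "BWS_ACCESS_TOKEN"})
_PREFIX_ALLOW = ("LC_",)


def validate_env_name(name):
    """Reject empty/whitespace names and anything outside [A-Z_][A-Z0-9_]*."""
    if not name or not name.strip():
        raise ValueError("env var name must not be empty")
    if not _ENV_NAME_RE.fullmatch(name):
        raise ValueError(f"invalid env var name: {name!r}")
    return name


def is_allowed_with_extras(name, extra_exact=(), extra_prefix=()):
    """True if name passes the built-in policy or a user-supplied extension."""
    if name in _EXACT_ALLOW or name in extra_exact:
        return True
    return name.startswith(_PREFIX_ALLOW + tuple(extra_prefix))


def filtered_env(env, extra_exact=(), extra_prefix=()):
    """Copy env, keeping only variables the policy (plus extras) allows."""
    return {
        key: value
        for key, value in env.items()
        if is_allowed_with_extras(key, extra_exact=extra_exact, extra_prefix=extra_prefix)
    }
```

With `env_extra_prefix_allow = ["DOPPLER_"]`, for example, a `DOPPLER_TOKEN` variable would reach the engine subprocess while unrelated variables stay filtered.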

Tests: +19 across `tests/test_env_policy.py` (8 user-extension cases +
log latch), `tests/test_env_audit.py` (4 user-extras cases), and
`tests/test_settings.py` (7 round-trip + validator cases).

`uv run pytest` → 2432 passed, 2 skipped; ruff clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bump rc2 → rc3 to publish a fresh staging wheel that includes #435.

Cumulative since rc1:
- #431 — Group 1A security hygiene (8 issues: #205, #206, #207, #208,
        #211, #213, #402, #403)
- #432 (#379): daily cost tracker race (threading.Lock guard)
- #433 (#378): voice_transcription_api_key SecretStr
- #435 (#409): user-extensible env allowlist + BWS_ACCESS_TOKEN default

No CHANGELOG entry — per release-discipline.md §"Staging / rc versions",
entries batch into the stable bump.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) (#437)

#377 fix:
- `TelegramTransportSettings` gains `allow_any_user: bool = False` (opt-in
  escape hatch) and `_validate_allowed_user_ids_or_optin` model_validator
  raising ValueError when `allowed_user_ids == []` and `allow_any_user is
  False`. Pre-v0.35.3 the empty default silently shipped open bots —
  this is the v0.35.3 promotion of the warning to a hard ConfigError.
- `TelegramBridgeConfig` and `update_from()` carry the new field through
  hot-reload; backend constructs with the value.
- `telegram/loop.py` drops the per-update `security.no_allowed_users`
  warning (validator now blocks startup) and emits
  `security.allow_any_user` INFO every boot when the opt-out is in
  effect.
- `config_migrations.py` `_migrate_legacy_telegram` relocates a top-level
  `allow_any_user` key into `[transports.telegram]` alongside `bot_token`
  / `chat_id` so legacy configs migrate cleanly.
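
The startup check can be illustrated with a stdlib stand-in. The real validator is a pydantic `model_validator` on `TelegramTransportSettings`; the dataclass below and its error text are assumptions used only to show the rule.

```python
from dataclasses import dataclass, field


@dataclass
class TelegramTransportSketch:
    """Stdlib stand-in for the two fields involved in the #377 check."""

    allowed_user_ids: list = field(default_factory=list)
    allow_any_user: bool = False

    def __post_init__(self):
        # An empty allowlist is only acceptable with the explicit opt-in.
        if not self.allowed_user_ids and not self.allow_any_user:
            raise ValueError(
                "allowed_user_ids is empty; list user IDs or set allow_any_user = true"
            )
```

The four cases the new tests cover map onto this directly: empty list blocks, opt-in passes, a populated list passes, and both set passes.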

CHANGELOG: backfilled `## v0.35.3 (unreleased)` with `### breaking`,
`### changes`, `### fixes` subsections covering all 13 issues that
shipped in rc1-rc4 (#205, #206, #207, #208, #211, #213, #377, #378,
#379, #402, #403, #407, #409). Per release-discipline.md the section
heading stays `(unreleased)` until the dev → master stable bump
populates the date.

Docs sweep:
- `docs/how-to/security.md` — required-allowlist wording, dev/demo
  opt-out callout, env_extra_allow / env_extra_prefix_allow extension
  guide, sk-proj redaction note, voice-key SecretStr note.
- `docs/how-to/troubleshooting.md` — new top-of-page section for
  `allowed_user_ids is empty` startup error.
- `docs/how-to/group-chat.md` — required wording.
- `docs/how-to/operations.md` — `env_extra_allow` + `allow_any_user`
  added to hot-reloadable list.
- `docs/tutorials/install.md` — `allowed_user_ids` added to all three
  example configs (assistant / workspace / handoff).
- `docs/reference/config.md` — `allow_any_user` row added,
  `allowed_user_ids` flipped to required, AMP `dangerously_allow_all`
  default note flipped to `false`.
- `docs/reference/runners/amp/runner.md` — flag is now optional;
  `dangerously_allow_all = false` example.
- `docs/reference/env-vars.md` — `BWS_ACCESS_TOKEN` default mention,
  `[security] env_extra_*` extension subsection.

Test fixtures:
- ~30 test fixtures across `test_settings`, `test_cli_*`,
  `test_projects_config`, `test_telegram_backend`,
  `test_bridge_config_reload`, `test_config_watch`,
  `test_config_path_env`, `test_onboarding*`, `test_runtime_loader`,
  `test_settings_contract`, `test_exec_bridge` patched to add
  `allow_any_user = true` (or `"allow_any_user": True`) where the
  fixture exercises non-allowlist behaviour. Tests that specifically
  cover #377 use `populated allowlist` cases.

#377 tests: 4 new in `test_settings.py` covering block + opt-out +
populated + both-set.

GitHub housekeeping (parallel to this commit, not in the diff):
- Closed #205, #206, #207, #208, #211, #213, #378, #379, #402, #403,
  #409 with implementation references. #377 closes via this PR's body.

Version: 0.35.3rc3 → 0.35.3rc4 (`pyproject.toml`, `uv.lock`).

Verification: 2436 tests pass / 2 skipped (~68s). Ruff check + format
clean. uv lock --check in sync.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the literal "Basic dXNlcjpwYXNz" string in test_malformed_bearer_header
with a runtime-constructed header so GitHub's secret-scanner stops flagging it.
The test still asserts verify_auth rejects Basic auth — Untether webhooks only
accept Bearer + HMAC.
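
One way to build the header at runtime, as a sketch. `make_basic_header` is a hypothetical helper name; the test's actual construction may differ.

```python
import base64


def make_basic_header(user="user", password="pass"):
    """Build a Basic auth header value without a base64 literal in the source."""
    raw = f"{user}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")
```

The resulting value is identical to the old literal, so the assertion that `verify_auth` rejects Basic auth is unchanged; only the scanner-visible string disappears from the file.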

The corresponding GitHub secret-scanning alert is a genuine false positive (test
fixture, not a real credential) and will be dismissed in the GitHub UI as
"Used in tests / false positive".

Closes #404
…-approve safety (#380) (#442)

The 2026-04-20 audit (§ASI02) flagged
``ControlRewindFilesRequest`` and ``ControlMcpMessageRequest`` as worth
a deeper look because rewind could in principle undo state that drove a
prior denial decision and MCP messages could carry tainted payloads
from a compromised MCP server.

Audit verdict: both are safe to auto-approve under the current upstream
Claude Code 2.1.x trust model.

- mcp_message: Untether is a transport pass-through; the message
  payload is opaque storage and is never inspected, executed, or
  rendered. A compromised MCP server is the inherent threat model of
  any MCP server, not specific to auto-approve. Routing this through
  Telegram approval would not block the payload.
- rewind_files: rewind is user-initiated upstream (the model cannot
  trigger it autonomously). Untether's per-session approval state
  (_PLAN_EXIT_APPROVED, _DISCUSS_APPROVED, _HANDLED_REQUESTS) is NOT
  mutated by rewind. Subsequent writes still pass through the standard
  ControlCanUseToolRequest gate.

No code change beyond:

1. Multi-paragraph safety-invariant comment in
   src/untether/runners/claude.py near _AUTO_APPROVE_TYPES, including
   the re-audit trigger (upstream semantic change to either subtype).
2. 3 regression-lock tests in
   tests/test_claude_control.py::TestAutoApproveSafetyInvariant
   that fail loudly if the auto-approve path starts inspecting payloads
   or coupling to per-session approval state.
3. Audit memo at docs/audits/2026-04-27-380-auto-approve-scope-review.md.

Closes #380

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (#440)

The chat-level message-routing command (`all` / `mentions` / `clear`)
shared a name with the unrelated webhook/cron triggers system, which
became increasingly confusing as `/config` grew separate trigger pages.

User-visible changes:
- New `/listen` command (`all`/`mentions`/`clear`) replaces `/trigger`
- `/trigger` continues to work as a deprecated alias for one release
  cycle and prepends a one-line deprecation notice
- `/config → 📡 Listen` page replaces `📡 Trigger`
- Home page summary renders `Listen: all` instead of `Trigger: all`
- Bot command menu lists `listen` instead of `trigger`

Internal renames:
- `telegram/trigger_mode.py` → `telegram/listen_mode.py`
- `commands/trigger.py` → `commands/listen.py`
- Type `TriggerMode` → `ListenMode`
- Function `resolve_trigger_mode` → `resolve_listen_mode`
- ChatPrefsStore / TopicStateStore: new `*_listen_mode` methods;
  legacy `*_trigger_mode` methods preserved as one-release aliases

Storage: msgspec field is still named `trigger_mode` for backward
compat with existing `telegram_chat_prefs_state.json` /
`telegram_topics_state.json` files. No migration is needed.

Tests: full suite passes (2438 passed, 2 skipped). Two new tests in
test_telegram_agent_trigger_commands.py cover the deprecation prefix
and clean `/listen` output. test_config_command toast expectations
updated to "Listen: ...".

Closes #297

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a global pause control for the trigger system (crons + webhooks)
accessible via /config in Telegram. During pause:
- Cron scheduler skips its tick — run_once crons are NOT consumed and
  fire on the next matching tick after resume
- Webhook server returns 503 (with Retry-After: 60) instead of
  dispatching, so external monitors can distinguish paused-but-up from
  healthy. Returns 404 for unknown paths as before
- /health endpoint surfaces {"status":"paused","paused":true}

Pause is in-memory only — restart auto-resumes. This is the safe
default per the issue's recommendation, and mirrors /at scheduler
behaviour.
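
The paused-state responses can be sketched as a pure function. The 503, `Retry-After: 60`, 404 and paused `/health` payload come from the description above; the `200` success status, the non-paused health payload and all names here are assumptions.

```python
from dataclasses import dataclass


@dataclass
class TriggerState:
    """In-memory pause flag; a fresh process starts unpaused."""

    paused: bool = False


def webhook_response(state, path, known_paths):
    """Return (status, headers) for an incoming webhook request."""
    if path not in known_paths:
        return 404, {}  # unknown paths behave the same whether paused or not
    if state.paused:
        return 503, {"Retry-After": "60"}
    return 200, {}  # assumed success status; dispatch would happen here


def health_payload(state):
    """Return the /health body; the non-paused shape is an assumption."""
    if state.paused:
        return {"status": "paused", "paused": True}
    return {"status": "ok", "paused": False}
```

Keeping the flag in a plain in-memory object is what makes a restart resume automatically: nothing is persisted, so there is nothing to restore.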

UI:
- New /config home-page row "⏸ Pause triggers" / "▶️ Resume triggers"
  appears only when triggers are configured
- New dedicated "📡 Triggers" page (config:tg) showing state + counts
  with Pause/Resume button; gracefully handles no-trigger-manager
  and zero-config cases
- /ping shows "⏸ triggers paused: … (suspended)" indicator while paused

Tests: 15 new tests across test_trigger_manager.py (8 pause toggle
behaviours including 503 webhook check), test_ping_command.py
(2 paused/resumed indicators), and test_config_command.py
(5 TestTriggersPage covering unavailable/empty/pause/resume/toast).
Full suite: 2445 passed, 2 skipped.

Closes #294

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…fication (#438) (#443)

Adds [watchdog] claude_stream_idle_timeout_ms (default 300_000 ms,
range 30 s – 30 min) so deployments hitting upstream Anthropic API
stalls on long opus 4.7 1M plan-mode generations can raise the
watchdog without forking the codebase. Untether's Claude runner reads
the value via setdefault — shell-set CLAUDE_STREAM_IDLE_TIMEOUT_MS
still wins. Settings load failure falls back to the hardcoded 300_000
default with a debug log entry.
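
The precedence rule ("shell-set value still wins") follows from `setdefault`, sketched here on a plain dict. `apply_idle_timeout` is a hypothetical helper name.

```python
def apply_idle_timeout(env, configured_ms=300_000):
    """Set the watchdog timeout in the subprocess env unless already present."""
    # setdefault leaves an existing (shell-provided) value untouched.
    env.setdefault("CLAUDE_STREAM_IDLE_TIMEOUT_MS", str(configured_ms))
    return env
```

An operator who exports the variable in the service environment therefore overrides the `[watchdog]` setting without touching the config file.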

Type-A vs Type-B classification on the failure message:

- Type A — mid-generation stall (num_turns >= 1 && duration_api_ms > 0).
  Often legitimate long opus reasoning that exceeded the watchdog.
  Inline hint suggests raising the new config knob.
- Type B — cold-start zero-byte stall (num_turns <= 1 && duration_api_ms
  == 0). Upstream API outage — raising the timeout will NOT help.
  Inline message says so explicitly.
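
The two conditions can be written as a small classifier. `classify_stall` and the `"unknown"` fallback are illustrative assumptions; the thresholds are the ones stated above.

```python
def classify_stall(num_turns, duration_api_ms):
    """Classify a stream-idle failure as Type 'A', Type 'B', or 'unknown'."""
    if num_turns <= 1 and duration_api_ms == 0:
        return "B"  # cold-start zero-byte stall: raising the timeout will not help
    if num_turns >= 1 and duration_api_ms > 0:
        return "A"  # mid-generation stall: raising the timeout may help
    return "unknown"  # combinations the description does not cover
```

Checking Type B first keeps the overlapping `num_turns == 1` case unambiguous: it is decided by whether any API time was recorded.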

Auto-retry on Stream idle timeout deferred to v0.35.4 pending upstream
Anthropic stabilisation (8 duplicate api:anthropic issues filed
2026-04-17→26 across macOS/Windows/web/WSL).

Tests: 5 new tests in test_claude_runner.py. Full suite 2460 passed,
2 skipped. Lint clean.

Closes #438

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…410) (#444)

Promotes claude_usage.schema_mismatch from one-shot per-process to
per-call counter so the issue-watcher catches ongoing API-shape drift
instead of just the first hit. Structured event carries a cumulative
`count` field; new runner_bridge.get_usage_schema_mismatch_count()
exposes the counter for the debug page.

UsageCacheStats added to utils/usage_cache.py tracking last successful
fetch wall time, cache age, last-error class+message; populated on
every fetch path including stale-while-error fallbacks.

_read_token_expiry_ms() added to telegram/commands/usage.py so the
OAuth token expiry can be surfaced without raising on missing
credentials (best-effort: returns None on any read failure).

/usage debug appends a 🔧 debug block (HTML) showing:
- last successful fetch (UTC ISO + age + fresh/stale label)
- last error (class + message, 120-char truncated)
- OAuth token expiry (with hh/mm remaining)
- cumulative schema-mismatch counter

Operator-facing signal so the next time the subscription footer goes
silent, the root cause is visible without grepping journalctl.

Tests: 5 new in test_usage_cache.py::TestCacheStatsObservability;
1 in test_command_engine_gates.py::TestUsageDebugMode; existing
test_schema_mismatch_warning_fires_once repurposed to assert per-call
firing with cumulative counts. Full suite: 2465 passed, 2 skipped.

Closes #410

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n + last-fired history + /stats breakdown (#271) (#445)

Tier 2: `/config → ⏰ Triggers` now lists every cron and webhook configured
for the current chat. Crons render as `id · describe_cron(...) · proj · eng ·
last X` and webhooks as `id · path · auth · proj · eng · last X`. Lists are
scoped via `crons_for_chat`/`webhooks_for_chat` with the bridge default_chat_id
fallback, capped at 10 entries with an overflow marker, and omitted when the
chat has no triggers (pause/resume controls remain regardless).

Tier 3: new `triggers/history.py` JSON store at
`<config_path>.with_name("triggers_history.json")`. Records `time.time()`
after every successful cron dispatch (cron.py:130) and webhook dispatch
(dispatcher.py:dispatch_webhook + dispatch_action). Recording is best-effort:
an OSError on write logs `triggers.history.write_failed` and is swallowed.
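
A minimal sketch of the best-effort recorder, assuming a flat `{trigger_id: timestamp}` JSON layout. The real store in `triggers/history.py` is versioned and uses structlog; `record_fired` and the stdlib logger here are stand-ins.

```python
import json
import logging
import time
from pathlib import Path

log = logging.getLogger("triggers.history")


def history_path(config_path):
    """Sibling file next to the config, as described above."""
    return Path(config_path).with_name("triggers_history.json")


def record_fired(config_path, trigger_id, now=None):
    """Record a last-fired timestamp; never raise on I/O failure."""
    path = history_path(config_path)
    try:
        try:
            data = json.loads(path.read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            data = {}
        if not isinstance(data, dict):
            data = {}  # treat unexpected shapes like a corrupt file
        data[trigger_id] = time.time() if now is None else now
        path.write_text(json.dumps(data))
        return True
    except OSError:
        # Best-effort: a failed write must not break the dispatch itself.
        log.warning("triggers.history.write_failed")
        return False
```

The return value is only for illustration; the point is that a read-only or missing directory degrades the "last fired" display rather than the trigger.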

`/stats` appends `(N triggered, M manual)` per engine line and on the totals
row when at least one count > 0. `DayBucket`/`AggregatedStats` carry additive
`triggered_count`/`manual_count` with `.get(..., 0)` fallbacks so existing
stats.json files load cleanly. `runner_bridge.handle_message` resolves the
split via `triggered=bool(context and context.trigger_source)`.
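
The back-compat loading and the conditional suffix might look like this. `DayBucketSketch`, its `run_count` field and `breakdown_suffix` are hypothetical; only the two new counters and the `.get(..., 0)` fallback come from the description.

```python
from dataclasses import dataclass


@dataclass
class DayBucketSketch:
    run_count: int = 0  # hypothetical pre-existing field
    triggered_count: int = 0
    manual_count: int = 0

    @classmethod
    def from_dict(cls, raw):
        """Load a bucket; older stats.json entries simply lack the new keys."""
        return cls(
            run_count=raw.get("run_count", 0),
            triggered_count=raw.get("triggered_count", 0),
            manual_count=raw.get("manual_count", 0),
        )


def breakdown_suffix(bucket):
    """Render '(N triggered, M manual)' only when at least one count is non-zero."""
    if bucket.triggered_count > 0 or bucket.manual_count > 0:
        return f" ({bucket.triggered_count} triggered, {bucket.manual_count} manual)"
    return ""
```

Because the new fields are additive with zero defaults, a bucket written by an older version renders exactly as before.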

28 new tests: 10 in test_triggers_history.py (round-trip, corrupt JSON,
version mismatch, persistence), 7 in test_session_stats.py (triggered/manual
split, back-compat with old format), 3 in test_stats_command.py (breakdown
present/omitted/totals), 7 in test_config_command.py::TestTriggersPagePerChat
(crons listed, webhooks listed, chat filtering, default_chat_id fallback,
last-fired rendering, overflow cap), 2 in test_trigger_cron.py (cron firing
records last_fired + history failure resilience), 2 in
test_trigger_dispatcher.py (webhook records last_fired + history failure
resilience). Full suite: 2496 passed, coverage 82.18%.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…) (#446)

After a Claude bidirectional session emits `result`, the CLI keeps stdin
open so multi-turn sessions don't re-spawn. In practice this leaves a
400 MB RSS subprocess + ~200 TCP sockets idling for 30+ minutes between
prompts, and from the user's perspective the session looks "stuck" —
final message rendered, no further indication of state.

Option D hybrid:
- New `[watchdog].post_result_idle_enabled = true` (kill switch) and
  `[watchdog].post_result_idle_timeout = 600.0` (30s–1h) in settings.
- `ClaudeStreamState.result_received_at` armed by `translate_claude_event`
  on every `StreamResultMessage` (re-armed per turn so multi-turn works).
- New `ClaudeRunner._post_result_idle_watchdog` task runs in the existing
  `run_impl` task group when `use_control_channel` is True. Polls the
  timer; when the deadline passes, calls `this_proc_stdin.aclose()`
  (same mechanism as the normal-flow exit at line 2412, just earlier).
  CLI hits stdin EOF and exits gracefully (rc=0).

- Auto-continue safety: the existing `_should_auto_continue` gate
  excludes `last_event_type == "result"` (locked by
  `test_skips_result_event_type` in test_exec_bridge.py), so the clean
  rc=0 exit will not phantom-resume the session.
- Approval-state guard: if `_REQUEST_TO_SESSION` or `_PENDING_ASK_REQUESTS`
  has live entries for this session, defer the close (re-arm the timer)
  to avoid orphaning a button-click control_response in flight.
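
The watchdog's per-poll decision can be sketched as a pure function. The real task polls inside the `run_impl` task group and calls `aclose()` on stdin; `idle_action` and its string return values are assumptions for illustration.

```python
def idle_action(result_received_at, now, timeout, has_pending_approval):
    """Decide what the post-result watchdog should do on one poll.

    Returns 'wait' (not armed or deadline not reached), 'defer' (deadline
    reached but an approval is in flight, so re-arm), or 'close_stdin'.
    """
    if result_received_at is None:
        return "wait"  # no result seen yet, timer not armed
    if now - result_received_at < timeout:
        return "wait"
    if has_pending_approval:
        return "defer"
    return "close_stdin"
```

Separating the decision from the I/O keeps the approval guard easy to test: the two structlog events correspond to the `defer` and `close_stdin` outcomes.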

UX hint #1: a supplementary `StartedEvent` with `meta={"complete":
"✓ turn complete"}` is emitted alongside `CompletedEvent` on successful
results (the supported pattern for late-arriving meta per
runner-development.md). `markdown.format_meta_line` renders it in the
footer so the user sees the turn boundary immediately. Errored results
don't get the hint (no false "complete" tag on a failure).

Two structlog events for ops:
- `claude.post_result_idle.deferred` — approval guard suppressed close
- `claude.post_result_idle.closing_stdin` — deadline passed, stdin closed

7 new tests in test_claude_runner.py: result-event arms timer, emits
turn-complete meta, skips meta on error, watchdog fires when clean,
watchdog defers when pending approval, format_meta_line renders the hint
when present and omits it when absent. Full suite: 2503 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#447)

Closes #269. The four settings groups in the issue had different states:
- [footer]: already loads fresh per-message via _load_footer_settings (no work)
- [cost]: already loads fresh per-call inside _check_cost_budget (no work)
- [watchdog]: already loads fresh per-run via _load_watchdog_settings at the
  top of handle_message (no work — verified, applies on next run)
- [progress]: was baked in at startup via MarkdownFormatter constructor +
  ExecBridgeConfig.min_render_interval — this PR closes that gap

Changes:
- markdown.py: new MarkdownFormatter.refresh_from(progress_settings) updates
  max_actions + verbosity from a fresh ProgressSettings snapshot. Tolerates
  missing/invalid attributes (clamps negative max_actions to 0; ignores
  unknown verbosity values).
- telegram/bridge.py: new TelegramPresenter.refresh_progress_settings()
  delegates to formatter.refresh_from.
- runner_bridge.py: new _load_progress_settings() sibling of
  _load_footer_settings / _load_watchdog_settings; handle_message reads it
  fresh per-run, calls cfg.presenter.refresh_progress_settings(...) via
  duck-typed getattr (Presenter is a Protocol, so we don't add to it), and
  threads progress_cfg.min_render_interval into each ProgressEdits instance
  instead of the startup snapshot. Per-chat /verbose overrides downstream
  of _resolve_presenter reconstruct from the refreshed defaults.
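
A sketch of the tolerant refresh, assuming a formatter with just the two attributes. `FormatterSketch`, the constructor defaults and the set of valid verbosity values are assumptions; the clamp and ignore rules follow the description above.

```python
from types import SimpleNamespace

_VALID_VERBOSITY = ("compact", "verbose")  # assumed values


class FormatterSketch:
    def __init__(self, max_actions=5, verbosity="compact"):
        self.max_actions = max_actions
        self.verbosity = verbosity

    def refresh_from(self, progress_settings):
        """Update from a fresh settings snapshot, ignoring anything unusable."""
        max_actions = getattr(progress_settings, "max_actions", None)
        if isinstance(max_actions, int) and not isinstance(max_actions, bool):
            self.max_actions = max(0, max_actions)  # clamp negatives to 0
        verbosity = getattr(progress_settings, "verbosity", None)
        if verbosity in _VALID_VERBOSITY:
            self.verbosity = verbosity  # unknown values leave the old one in place
```

For example, `FormatterSketch().refresh_from(SimpleNamespace(max_actions=-3))` leaves verbosity alone and sets `max_actions` to 0.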

Out of scope (entry-point limitation): engine + command registration still
require pipx upgrade / restart. Documented on the issue.

8 new tests in tests/test_meta_line.py: TestMarkdownFormatterRefresh covers
max_actions update, verbosity update, negative clamp, invalid-verbosity
rejection, missing-attribute tolerance, presenter delegation. Plus
_load_progress_settings defaults / error-fallback. Full suite: 2511 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
All 9 v0.35.3 Group 2 issues now landed on dev:

- #404 — secret-scanning alert (PR #439)
- #297 — /trigger → /listen rename + alias (PR #440)
- #294 — master trigger pause/resume toggle (PR #441)
- #380 — auto-approve scope review (PR #442)
- #438 — claude_stream_idle_timeout_ms + Type-A/B classification (PR #443)
- #410 — subscription usage observability + /usage debug (PR #444)
- #271 — trigger visibility Tier 2 + Tier 3 (PR #445)
- #333 — Claude post-result idle timeout + ✓ turn complete UX hint (PR #446)
- #269 — hot-reload [progress] settings (PR #447)

Bumps to TestPyPI for staging via @hetz_lba1_bot once integration tests
U1-U7 pass against @untether_dev_bot.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps [dependabot/fetch-metadata](https://github.com/dependabot/fetch-metadata) from 2.5.0 to 3.1.0.
- [Release notes](https://github.com/dependabot/fetch-metadata/releases)
- [Commits](dependabot/fetch-metadata@21025c7...25dd0e3)

---
updated-dependencies:
- dependency-name: dependabot/fetch-metadata
  dependency-version: 3.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 7.4.0 to 8.1.0.
- [Release notes](https://github.com/astral-sh/setup-uv/releases)
- [Commits](astral-sh/setup-uv@6ee6290...0880764)

---
updated-dependencies:
- dependency-name: astral-sh/setup-uv
  dependency-version: 8.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 7.0.0 to 7.0.1.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](actions/upload-artifact@bbbca2d...043fb46)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: 7.0.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.32.6 to 4.35.2.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@820e316...95e58e9)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.2
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
#471 + #271) (#472)

* fix(at): stamp at:<token> trigger_source on /at-scheduled runs (#271)

Mirror the cron:<id> / webhook:<id> footer markers added in #271 (rc4)
and Tier 2/3 (rc5) so /at-scheduled runs also show provenance.

at_scheduler.schedule_delayed_run wraps the captured chat context (or a
fresh RunContext when the chat is unmapped) with trigger_source =
"at:<token>" via dataclasses.replace. runner_bridge.handle_message's
icon-prefix tuple extends from ("cron:",) to ("cron:", "at:") so the
alarm-clock icon renders for both — semantically /at is a one-shot
delayed cron. record_run's existing triggered=bool(context and
context.trigger_source) gate picks up /at runs in the /stats
triggered/manual breakdown automatically.
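
The stamping step can be sketched with `dataclasses.replace`. The `RunContext` below is a reduced stand-in (the `project` field is illustrative), and `stamp_at_source` / `uses_clock_icon` are hypothetical helper names.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RunContext:
    """Reduced stand-in for Untether's RunContext."""

    project: Optional[str] = None
    trigger_source: Optional[str] = None


_CLOCK_ICON_PREFIXES = ("cron:", "at:")


def stamp_at_source(context, token):
    """Return a copy of the captured context tagged with at:<token>."""
    base = context if context is not None else RunContext()
    return replace(base, trigger_source=f"at:{token}")


def uses_clock_icon(context):
    """True when the footer should show the alarm-clock icon."""
    source = context.trigger_source if context else None
    return bool(source and source.startswith(_CLOCK_ICON_PREFIXES))
```

`replace` returns a new object, which is why the existing forwarding test can no longer assert reference equality with the original context.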

Tests: 1 new in test_at_command.py
(test_handle_stamps_trigger_source_on_mapped_chat); the existing
test_handle_captures_global_default_when_unmapped extended to assert
the trigger_source-only RunContext path; existing
test_run_delayed_forwards_captured_context_and_engine updated since
the captured context is no longer reference-equal to the original.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(gemini): pass --skip-trust by default for headless runs (#471)

Gemini CLI rejects runs from any directory not in
~/.gemini/trustedFolders.json — even with --approval-mode yolo — and
there is no interactive prompt path in headless usage, so projects
outside the trust list silently failed before any agent output.

Untether already runs Gemini with yolo for the same "always headless"
reason, so passing --skip-trust extends the same precedent.
GeminiRunner.skip_trust (default True) is the runtime switch; opt out
per deployment with [gemini] skip_trust = false in untether.toml
(security-conscious operators who want Gemini's project-local
extension/MCP trust gate enforced).
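
As an illustration only, the switch could be sketched like this. The function name `gemini_trust_flags` is hypothetical and the real argv builder in GeminiRunner passes more than these flags; only `--approval-mode yolo`, `--skip-trust` and the default of True are taken from the message.

```python
def gemini_trust_flags(skip_trust: bool = True) -> list[str]:
    """Return the trust-related flags for a headless Gemini run.

    Hypothetical helper: it covers only the two flags named in the commit.
    """
    flags = ["--approval-mode", "yolo"]
    if skip_trust:
        # Default: bypass the trusted-folders check, since no prompt can be shown.
        flags.append("--skip-trust")
    return flags
```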

Bump to 0.35.3rc6 for staging.

Tests: 2 new in test_build_args.py::TestGeminiBuildArgs
(test_skip_trust_default_includes_flag,
test_skip_trust_opt_out_omits_flag).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sing feature coverage (#473)

Audited every issue in the v0.35.3 milestone (26 issues) against the
full repo documentation surface and closed the gaps. Reference issues
covered: #205, #206, #207, #208, #211, #213, #269, #271, #294, #297,
#333, #377, #378, #379, #380, #402, #403, #407, #409, #410, #438, #471.

CHANGELOG.md
- Added missing entry for #297 (/trigger → /listen rename) under
  ### changes. The other "milestone" issues (#224, #228, #239) were
  closed against v0.35.3 for tracking only — their fixes shipped in
  v0.35.0/v0.35.1rc2; per the repo's "no retroactive edits to prior
  sections" rule, they remain undocumented in CHANGELOG (closure
  comments cite the actual versions).

/trigger → /listen rename sweep (#297)
- README.md: command table row, group-chat link
- docs/reference/commands-and-directives.md: command row
- docs/reference/transports/telegram.md: command list + admin note
- docs/reference/integration-testing.md: O3 + Q12 test rows
- docs/explanation/routing-and-sessions.md: pre-routing filter section

Runner specs
- gemini/runner.md: --skip-trust default + opt-out via [gemini]
  skip_trust = false (#471)
- claude/runner.md: post-result idle watchdog + "✓ turn complete"
  meta hint (#333), claude_stream_idle_timeout_ms config + Type-A/B
  classifier (#438)

How-to guides
- schedule-tasks.md: trigger provenance + history + /stats
  triggered/manual breakdown (#271 Tier 3); master pause/resume
  toggle (#294)
- inline-settings.md: new Triggers page (#271 Tier 2 + #294)
- troubleshooting.md: Type-A/B stream idle classification (#438);
  post-result idle watchdog + ✓ turn complete (#333)
- security.md: extended path-redaction coverage (#208); Pi session
  dirs 0o700 (#207)
- subscription-usage.md: /usage debug section (#410)
- operations.md: pause status surfacing in /health (#294); /usage
  debug cross-link (#410); expanded hot-reload list to include
  [progress] (#269), [watchdog] (#333, #438), [footer], [cost]

README.md
- Scheduled tasks bullet: pause/resume toggle (#294); footer
  provenance markers (#271 Tier 3); /stats triggered/manual split
- Inline settings bullet: 📡 Triggers page (#271, #294)
- Commands table: /usage debug (#410); /listen (#297); /config
  Triggers page row

Verified clean:
- python3 scripts/validate_release.py (rc6 pre-release)
- grep -rnE "/trigger\\b" docs/ README.md returns zero non-deprecation
  hits in production docs (test plans and historical results retain
  /trigger by design)
- Cross-references resolve to existing anchors

Plan: ~/.claude/plans/untether-you-are-running-rustling-shannon.md
(also staged in .untether-outbox/v0.35.3-doc-audit-plan.md)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.2 to 4.35.3.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@95e58e9...e46ed2c)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…) + local-context protection (#479)

* fix(security): claude runner.start no longer leaks prompt at INFO (#478)

The Claude runner's run_impl override at src/untether/runners/claude.py
had its own duplicate runner.start log call that was missed when the
base runner was fixed for #205. Every Claude session emitted
`prompt=prompt[:100] + "…"` at INFO level — leaking the first ~100
chars of the Untether preamble (boilerplate, but spec-violating).
Discovered during the v0.35.3 follow-up E2E pass.

Fix mirrors the base runner impl:
- INFO `runner.start`: only `engine`, `resume`, `prompt_len`, `args`
- DEBUG `runner.start_prompt`: preview of first 100 chars (opt-in)

Argv redaction also tightened:
- env -i KEY=VAL pairs redacted via redact_env_i_args (was already
  applied at subprocess.spawn but not at runner.start, so e.g.
  BWS_ACCESS_TOKEN, GEMINI_API_KEY values would land in INFO logs)
- Legacy-mode (no permission_mode) `-- <prompt>` tail collapsed to
  `-- <prompt redacted>` so prompt content never reaches INFO under
  any code path

2 new regression tests cover both control-channel and legacy modes:
- test_runner_start_does_not_log_prompt_at_info
- test_runner_start_redacts_legacy_mode_prompt_in_args

Closes #478.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(faq): add docs/faq/index.md for help-centre FAQPage schema (#477)

Marketing-site infra (FAQPage extractor on
`feature/help-seo-geo-items-1-4` in littlebearapps/littlebearapps.com)
already extracts question-shaped H2s and emits Schema.org FAQPage
JSON-LD on any help article with `category: faq` frontmatter or ≥3
question-shaped H2s. No tool currently has a dedicated FAQ scaffold;
this commit closes the loop for Untether.

The new file lives at docs/faq/index.md (Diátaxis-aligned scaffold —
plain title + description frontmatter, marketing-site sync injects
category/tool/dates). 12 question-shaped H2s exceed the 7-minimum
acceptance criterion:

  1. What is Untether?
  2. How do I install Untether?
  3. Which AI coding agents does Untether support?
  4. Do I need an API key to use Untether?
  5. Where does my code and data go?
  6. How do I approve tool calls from my phone?
  7. What happens if my agent crashes or my phone loses signal mid-run?
  8. How do I keep agents from spending too much money?
  9. Can I send voice notes instead of typing?
  10. How do I update Untether?
  11. How do I uninstall Untether?
  12. Where can I get help or report a bug?

Each answer is a complete paragraph (no TODO / placeholder), sourced
from README + real common-channel topics. Cross-links to existing
help-guide URLs preserve nav chains.

Coordinated mapping in `littlebearapps/littlebearapps.com`
(`scripts/docs-sync.config.ts` → add `untether` → `docs/faq` →
`category: faq`) is a separate one-line PR per the issue's
"Coordinated mapping" section. Once both land, the next nightly sync
surfaces the FAQ at <https://untether.littlebearapps.com/help/untether/faq/>
with a visible `<script type="application/ld+json">` FAQPage block,
unlocking AI-citation surface (ChatGPT, Perplexity, Google AI
Overviews) and SERP rich-snippet eligibility.
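
For orientation, a sketch of what such an extractor might do. The real one lives in the marketing-site repo and is not shown here, so this Python version is an assumption throughout: "question-shaped" is taken to mean an H2 ending in "?", and `extract_faq` / `faq_page_jsonld` are hypothetical names. The JSON-LD shape follows the Schema.org FAQPage type.

```python
import json
import re

# Matches one "## Heading" line; "###" and deeper headings do not match.
H2_RE = re.compile(r"^## +(.+?)\s*$")


def extract_faq(markdown: str) -> list[tuple[str, str]]:
    """Collect (question, answer) pairs from question-shaped H2 sections."""
    pairs: list[tuple[str, str]] = []
    question: str | None = None
    answer_lines: list[str] = []
    for line in markdown.splitlines():
        match = H2_RE.match(line)
        if match:
            if question is not None:
                pairs.append((question, " ".join(answer_lines)))
            heading = match.group(1)
            # Assumption: a heading counts as a question only if it ends in "?".
            question = heading if heading.endswith("?") else None
            answer_lines = []
        elif question is not None and line.strip():
            answer_lines.append(line.strip())
    if question is not None:
        pairs.append((question, " ".join(answer_lines)))
    return pairs


def faq_page_jsonld(pairs: list[tuple[str, str]]) -> str:
    """Serialise the pairs as a Schema.org FAQPage JSON-LD document."""
    document = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }
    return json.dumps(document, ensure_ascii=False, indent=2)
```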

Closes #477.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ctx: protect docs/faq/index.md from deletion + register in local docs (#477)

The FAQ doc is part of the marketing-site FAQPage Schema.org pipeline
(littlebearapps/littlebearapps.com:scripts/docs-sync.config.ts → untether
→ category: faq). Removing it silently breaks the docs-sync mapping and
regresses AI-citation surface. This commit hardens local Claude Code
context so the file:

  - cannot be silently deleted, moved, or truncated by accident
  - has explicit guidance on when/how to update it during releases
  - is registered in CLAUDE.md so future contributors know it exists

Changes:

* `.claude/hooks/help-faq-protect.sh` (new) — PreToolUse Bash hook
  blocking `rm`, `git rm`, `mv`-away, and shell `>` truncation
  targeting `docs/faq/index.md`. Edits via Edit/Write/append `>>` are
  intentionally allowed — the FAQ is meant to evolve. Smoke-tested
  with 7 synthetic inputs covering both deny and allow paths.

* `.claude/hooks/release-guard-protect.sh` (updated) — also protects
  `help-faq-protect.sh` from being weakened or removed via Edit/Write.

* `.claude/hooks.json` (updated) —
  - registers help-faq-protect.sh under PreToolUse Bash
  - extends the existing Edit/Write context-prompt with a docs/faq/*
    branch (HELP-FAQ CONTEXT) reminding contributors of question-shape
    rules and the maintain-as-features-land cadence
  - extends the version-bump-checklist (PostToolUse) with an FAQ
    touch-up step

* `.claude/rules/help-faq.md` (new) — auto-loads when editing
  `docs/faq/**`. Documents the hard rules (NEVER delete; MUST update
  with feature changes), soft conventions (question-shaped H2, ≥7
  Q/A, real behaviour not aspirational), and the release-cadence
  workflow.

* `.claude/rules/release-discipline.md` (updated) — adds an FAQ
  touch-up step to the version-bump checklist.

* `CLAUDE.md` (updated) —
  - new "Help-centre FAQ" section after "Documentation screenshots"
    explaining the file's role and the no-deletion rule
  - Hooks table registers `help-faq-protect`
  - Rules table registers `help-faq.md`
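
The deny/allow split of `help-faq-protect.sh` can be pictured with the sketch below. The actual hook is a Bash script whose contents are not shown in this commit; this Python rendering, including the name `is_denied` and its deliberately crude token checks, is an assumption meant only to show which command shapes are blocked and which are let through.

```python
import shlex

PROTECTED = "docs/faq/index.md"


def is_denied(command: str) -> bool:
    """Return True for commands that would delete, move away or truncate the FAQ."""
    # punctuation_chars=True makes ">" and ">>" separate tokens.
    tokens = list(shlex.shlex(command, posix=True, punctuation_chars=True))
    for index, token in enumerate(tokens):
        if token != PROTECTED:
            continue
        previous = tokens[index - 1] if index else ""
        if previous == ">":
            return True  # truncating redirect
        if previous == ">>":
            continue  # appending is intentionally allowed
        head = tokens[:index]
        if "rm" in head:
            return True  # covers both rm and git rm
        if "mv" in head and index < len(tokens) - 1:
            return True  # the file is a source, so it is being moved away
    return False
```

Read-only commands such as `cat` and appends via `>>` fall through to False, matching the intent that the FAQ can evolve but not vanish.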

Refs #477.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps pre-release version so TestPyPI can publish a fresh wheel that
includes the v0.35.3 follow-up bundle merged via PR #479:
  - fix(security): claude runner.start no longer leaks prompt at INFO (#478)
  - docs(faq): add docs/faq/index.md for help-centre FAQPage schema (#477)
  - ctx: protect docs/faq/index.md from deletion + register in local docs (#477)

The rc6 wheel on TestPyPI predates this work — without the bump the
publish step skips ("File already exists") and the staging upgrade path
keeps installing the older wheel.

Per release-discipline.md, pre-release versions don't require a
CHANGELOG entry (validate_release.py skips them) and aren't tagged
(auto-tag-on-master.yml skips pre-releases).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#481) (#484)

Two coordinated fixes that share the same `progress_edits.stall_detected`
decision branch in `runner_bridge.py`. Reproduction: a 45-min Claude
session on staging looked hung. A 10-min Cloudflare deploy poll and a
14-min approval-keyboard wait kept the chat silent, and the chat then
showed unhelpful stall warnings during those legitimate waits.

#470 — Post-result stall suppression + closing message
- New `progress_edits.stall_post_result_suppressed` info log when
  `stream.last_event_type == "result"` and the post-result idle
  watchdog (#333) is the legitimate owner of the silence
- Auto-cancel `_STALL_MAX_WARNINGS` arm gated by the same boolean —
  no more SIGTERM'ing sessions that are about to gracefully close
- Watchdog stamps `ClaudeStreamState.post_result_closed_at` before
  `aclose()`; bridge's heartbeat tick sends a one-shot
  `✓ turn complete · session closed after Nm idle` message
  (idempotency via `post_result_closing_sent` flag)

#481 — Long-tool visibility + suppression matrix
- New `[progress] heartbeat_interval` (default 30 s) drives a tick
  inside `_stall_monitor` that bumps `event_seq` whenever any open
  action is older than 60 s, forcing a re-render with a fresh
  elapsed-time tail
- `format_action_line` gained `elapsed_seconds` kwarg; non-completed
  actions > 60 s render as `▸ Bash · 3m 47s · npm run build`,
  regardless of `/verbose` toggle
- `format_verbose_detail` gained `BashOutput` (renders last line of
  `result_preview` so polling loops show live stdout), `KillShell`,
  `ScheduleWakeup` (countdown + reason), and `Monitor` (countdown)
  branches
- `ActionState` gained `started_at` / `last_update_at` wall-clock
  fields populated from the new `ProgressTracker.clock` callable
- `MarkdownFormatter.render_progress_parts` / `MarkdownPresenter` /
  `Presenter` Protocol / `TelegramPresenter` all gained `now: float | None`
  threaded from `runner_bridge._run_loop`
- New `format_duration` / `format_countdown` helpers
- Five new suppression branches in `_stall_monitor`, gated by
  `not frozen_escalate` so genuinely-frozen sessions still warn:
  - stall_post_result_suppressed (#470)
  - stall_schedule_wakeup_suppressed (engine_state.live_wakeups)
  - stall_monitor_active_suppressed (engine_state.live_monitors)
  - stall_bash_grace_suppressed (new `[watchdog] bash_grace_seconds`,
    default 60 s)
  - stall_long_bash_suppressed (BashOutput within stall_threshold/2)
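
One way to read the suppression matrix is as a single decision function. The sketch below is an assumption, not the `_stall_monitor` code: the `StallSnapshot` fields, the branch order and the 300 s `stall_threshold` default are all hypothetical, while the five reason names, the `frozen_escalate` gate and the 60 s grace default come from the list above.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StallSnapshot:
    # Hypothetical view of the state the monitor inspects on each tick.
    last_event_type: str | None = None
    live_wakeups: int = 0
    live_monitors: int = 0
    bash_age_seconds: float | None = None  # age of the newest open Bash action
    bash_output_age_seconds: float | None = None  # time since the last BashOutput poll
    frozen_escalate: bool = False


def suppression_reason(
    snap: StallSnapshot,
    *,
    bash_grace_seconds: float = 60.0,
    stall_threshold: float = 300.0,  # assumed value, for illustration only
) -> str | None:
    """Return the suppression log name, or None when the warning should fire."""
    if snap.frozen_escalate:
        return None  # genuinely frozen sessions always warn
    if snap.last_event_type == "result":
        return "stall_post_result_suppressed"
    if snap.live_wakeups:
        return "stall_schedule_wakeup_suppressed"
    if snap.live_monitors:
        return "stall_monitor_active_suppressed"
    if snap.bash_age_seconds is not None and snap.bash_age_seconds <= bash_grace_seconds:
        return "stall_bash_grace_suppressed"
    if (
        snap.bash_output_age_seconds is not None
        and snap.bash_output_age_seconds <= stall_threshold / 2
    ):
        return "stall_long_bash_suppressed"
    return None
```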

Bonus fix: `_register_background_handle` now reads `delaySeconds` first
(per upstream Claude Code schema, #289) instead of only `delay_ms` —
production deadlines were always 0.0, breaking countdown rendering.
Backward-compat fallback to `delay_ms`/`timeout_ms` preserved.
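
The lookup order can be sketched as follows. The function name `delay_seconds_from_input` is hypothetical, and whether the real code prefers `delay_ms` over `timeout_ms` is an assumption; the key names and the seconds-versus-milliseconds units are as stated above.

```python
def delay_seconds_from_input(tool_input: dict) -> float:
    """Resolve a wakeup delay in seconds, preferring the upstream field."""
    if "delaySeconds" in tool_input:
        return float(tool_input["delaySeconds"])
    # Backward-compatible fallback: the older fields are in milliseconds.
    for legacy_key in ("delay_ms", "timeout_ms"):
        if legacy_key in tool_input:
            return float(tool_input[legacy_key]) / 1000.0
    return 0.0
```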

structlog WARN events at runner.py and runner_bridge.py are unchanged
so untether-issue-watcher and ops dashboards continue to receive the
underlying signals — only the chat-side surfacing decision changed.

Tests: 32 new (11 in test_exec_bridge.py for suppression branches,
auto-cancel gating, frozen-ring precedence, closing-message
idempotency, heartbeat countdown mutation; 3 in test_claude_runner.py
for delaySeconds + post-result state init; 18 in test_verbose_progress.py
for new tool detail branches, format_duration helpers, long-running
tail). Full suite: 2548 passed, 82.26% coverage.

Integration tests: U3 (basic Claude Code) passes cleanly via
@untether_dev_bot — 33 s run, zero stall warnings, "✓ turn complete"
footer rendered. Long-running BashOutput-polling and 30-min
genuinely-frozen tests deferred to staging dogfood.

Out of scope / known constraints:
- Strict 5 s rolling Bash stdout sub-line is not achievable without
  upstream Claude Code interim tool_result deltas. The BashOutput
  polling path is the proxy and refreshes at each polling cycle
  (~15 s in practice).
- ScheduleWakeup countdown rendering depends on #289 (`/loop`
  interception) for the timer to actually fire; suppression of stall
  warnings while a wakeup is pending works today.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(loop): add LoopSettings + EngineOverrides.loop_enabled (#289)

Foundation for /loop and ScheduleWakeup support — Untether-side
observation of Claude Code's session-scoped scheduling tools so
loops keep firing after the subprocess exits.  Default OFF — opt-in
per-chat via /config → 🔁 Loop mode.

src/untether/settings.py — new LoopSettings model:
  enabled (default false), inline_threshold_seconds (300),
  redundancy_check_interval (30), max_iterations (20),
  max_total_duration_hours (4), min_interval_seconds (60),
  expiry_days (7).  Cost limits stay in [cost_budget] —
  the caps in [loop] are runaway-safety only.
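
The defaults above, written out as a sketch. The real model in settings.py is not shown here, so the plain dataclass, the `from_table` helper and the integer types are assumptions; the field names and default values are the ones listed in this commit.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class LoopSettings:
    enabled: bool = False
    inline_threshold_seconds: int = 300
    redundancy_check_interval: int = 30
    max_iterations: int = 20
    max_total_duration_hours: int = 4
    min_interval_seconds: int = 60
    expiry_days: int = 7

    @classmethod
    def from_table(cls, table: dict) -> "LoopSettings":
        """Build settings from a parsed [loop] table, rejecting unknown keys."""
        known = {field.name for field in fields(cls)}
        unknown = sorted(set(table) - known)
        if unknown:
            raise ValueError(f"unknown [loop] keys: {unknown}")
        return cls(**table)
```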

src/untether/telegram/engine_overrides.py — new loop_enabled field
on EngineOverrides struct, threaded through normalize_overrides()
and merge_overrides() following the existing budget_enabled pattern.
LOOP_SUPPORTED_ENGINES = frozenset({"claude"}) — Claude-only since
other engines don't expose CronCreate / ScheduleWakeup.

Tests: 7 new in test_settings.py (defaults, TOML round-trip, bounds,
unknown-key rejection); 5 new in test_telegram_engine_overrides.py
(default None, merge topic/chat priority, ChatPrefsStore round-trip,
LOOP_SUPPORTED_ENGINES constant).  76 tests pass across the changed
files.

Empirical pre-work in this session:
  Probe 4 + 4b — hanging tool_use(AskUserQuestion) does NOT cause
  catastrophic resume behaviour; outcome (c) confirmed.  Drops the
  consensus-mandated interactive-state gate from PR1 scope.
  Probe 5 — CronCreate uses field "cron" (not "cron_expression");
  CronDelete takes id; CronList renders one entry per line as
  "<8hex> — <human-schedule> (recurring|one-off) [session-only]: <prompt>".
  Dispatcher rename — Telegram management surface will be /loops
  (PLURAL) so /loop (singular) keeps passing through to Claude;
  the dispatcher in telegram/loop.py:2256–2300 matches on the first
  word only, so it either intercepts a command fully or not at all.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(loop): add is_session_alive helper to claude runner (#289)

loop_scheduler._fire (PR1) needs a cheap "is the subprocess for this
session_id currently running?" check before firing a loop iteration.
Spawning claude --resume against an alive subprocess would race the
in-flight turn and almost certainly violate session locking.

src/untether/runners/claude.py — new module-level is_session_alive(sid)
that reads membership of the existing _SESSION_STDIN registry.  The
registry is populated when a runner spawns its subprocess and cleared
in the run_impl finally block, so membership is the canonical signal
of "subprocess is up right now."
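
In sketch form, with the registry reduced to a plain dict. The `_SESSION_STDIN` name and the finally-block cleanup are from the description above; `run_session` is a hypothetical stand-in for the real `run_impl`, and the registry's value type is an assumption.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# session_id -> stdin handle of the live subprocess (value type assumed).
_SESSION_STDIN: dict[str, object] = {}


def is_session_alive(session_id: str) -> bool:
    """Cheap check: is a subprocess for this session registered right now?"""
    return session_id in _SESSION_STDIN


def run_session(session_id: str, stdin: object, body: Callable[[], T]) -> T:
    """Hypothetical stand-in for run_impl: register, run, always clean up."""
    _SESSION_STDIN[session_id] = stdin
    try:
        return body()
    finally:
        _SESSION_STDIN.pop(session_id, None)
```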

Tests: 2 in test_claude_runner.py (membership round-trip with cleanup,
unknown session returns False).

* feat(loop): add loop_scheduler module with persistence + tests (#289)

Untether-side scheduler for /loop and ScheduleWakeup. Mirrors
at_scheduler.py shape: 4 install globals + _PENDING dicts + install/
uninstall API. Adds:

- _LoopEntry dataclass with fallback_first_user_message (text, not
  msg id — Gap 4 of the handover) for the <<autonomous-loop-dynamic>>
  sentinel fallback path.
- register_pending_cron / register_pending_wakeup / bind_upstream_id
  for the observer hooks (wired in a follow-up commit — this commit
  is foundation only).
- cancel_by_token / cancel_by_upstream_id / cancel_pending_for_chat
  with do-not-resume sentinel write on user cancel.
- _fire path with race-avoidance (is_session_alive lazy import),
  drop-on-busy, max-iterations / max-total-duration / 7-day expiry caps,
  re-issue prompt wrap "Loop iteration N: ... do the task now; do not
  summarize old results unless necessary." (Probe 3 + consensus).
- Generation counter + cancel_event so old _arm_timer tasks left over
  from a previous round detect they are stale and bail out instead of
  double-firing on the new round's scope.
- Atomic JSON persistence to active_loops.json (sibling to config) via
  utils.json_state.atomic_write_json. Restart resilience: past
  fire_at_wallclock fires immediately (no catch-up multiplier),
  cancelled entries skipped on reload, do-not-resume sentinel persists.
- Cron next-fire computation via existing triggers.cron.cron_matches
  (5-field expressions, 366-day horizon).
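
The stale-timer guard from the list above can be illustrated with a generation counter alone. This is a simplified sketch, not the `_arm_timer` code: the `LoopTimer` class is hypothetical and the cancel_event half of the mechanism is left out.

```python
import asyncio


class LoopTimer:
    """Hypothetical timer that ignores fires from superseded arm() calls."""

    def __init__(self) -> None:
        self.generation = 0
        self.fired: list[int] = []

    def arm(self, delay: float) -> "asyncio.Task[None]":
        # Each arm() starts a new round; older sleepers become stale.
        self.generation += 1
        return asyncio.create_task(self._wait_and_fire(self.generation, delay))

    async def _wait_and_fire(self, generation: int, delay: float) -> None:
        await asyncio.sleep(delay)
        if generation != self.generation:
            return  # a newer round took over while we slept
        self.fired.append(generation)


async def demo() -> list[int]:
    timer = LoopTimer()
    old = timer.arm(0.01)
    new = timer.arm(0.02)  # supersedes the first timer
    await asyncio.gather(old, new)
    return timer.fired
```

Running `asyncio.run(demo())` records a fire for the second round only; the first task wakes up, sees a newer generation and returns without firing.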

41 unit tests covering: install/uninstall lifecycle, registration
(cron + wakeup with sentinel fallback), upstream-ID binding,
cancellation paths, inspection helpers, cron parsing edge cases,
fire path (cancelled / max-iter / do-not-resume / busy / race-alive /
success / sentinel-fallback / one-shot expiry), persistence round-trip,
restart resume + skip-cancelled, do-not-resume across restart, corrupt
file handling, persistence-disabled mode.
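
For readers unfamiliar with the persistence pattern being tested, a generic sketch follows. It is not `utils.json_state.atomic_write_json`: the function names, the list-of-dicts file layout and the `cancelled` key are assumptions chosen to mirror the behaviour described (atomic replace, corrupt file tolerated, cancelled entries skipped on reload).

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: object) -> None:
    """Write JSON to a temp file in the same directory, then rename over the target."""
    fd, tmp_name = tempfile.mkstemp(
        dir=path.parent, prefix=path.name + ".", suffix=".tmp"
    )
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)  # atomic on the same filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_entries(path: Path) -> list[dict]:
    """Load persisted entries, treating a missing or corrupt file as empty."""
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return []
    if not isinstance(data, list):
        return []
    # Cancelled entries are skipped on reload.
    return [e for e in data if isinstance(e, dict) and not e.get("cancelled")]
```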

Coverage of loop_scheduler.py: 84% (above 80% threshold).

NOT WIRED YET — observers in runners/claude.py and drain integration in
telegram/loop.py land in subsequent commits per the v0.35.4 PR1 plan.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(loop): observe CronCreate / ScheduleWakeup / CronDelete in claude runner (#289)

Wires the loop_scheduler module into the JSONL stream-translation path.
Observers run as siblings of (not replacements for) the existing
_register_background_handle / _clear_background_handle hooks at lines
~1028 and ~1090.

Changes:

- src/untether/runners/run_options.py: add `loop_enabled: bool | None`
  to `EngineRunOptions` so the per-chat /config → 🔁 Loop mode toggle
  can short-circuit observers via the existing run-options contextvar.
- src/untether/telegram/loop.py: plumb `loop_enabled` from merged
  EngineOverrides into the resolved EngineRunOptions.
- src/untether/runners/claude.py:
  - `ClaudeStreamState.first_user_message_text` (str | None) — populated
    from the `prompt` arg in `new_state` so loop entries can fall back
    to it when ScheduleWakeup observes the
    `<<autonomous-loop-dynamic>>` sentinel (Probe 3 result).
  - `_loop_enabled_for_chat(chat_id)` — resolves per-chat run-options
    override → global `[loop] enabled` → False fallback. Sync (no async
    prefs lookup; the contextvar is set upstream by executor.py).
  - `_observe_loop_tool_use(state, content)` — handles CronCreate /
    ScheduleWakeup / CronDelete tool_use blocks. Uses the canonical
    field names (`cron`, not `cron_expression`; `id`, not `taskId`)
    confirmed by Probe 5. Skips ScheduleWakeup when `delaySeconds` is
    at or below `[loop] inline_threshold_seconds` so short waits stay
    rendered live by the rc8 countdown.
  - `_observe_loop_tool_result(state, tool_use_id, content)` — parses
    `\bjob ([0-9a-f]{8})\b` from CronCreate result text and binds the
    upstream cron ID via `loop_scheduler.bind_upstream_id`.
  - Calls wired at the existing tool_use / tool_result decode sites
    inside `translate_claude_event`. Master-toggle gate sits at the
    top of the observers so OFF behaviour is identical to today.
- tests/test_claude_runner.py: new `TestLoopObservation` class (10
  tests) covering chat-id-unset no-op, master-toggle off, CronCreate
  registration, `cron` vs `cron_expression` field precedence, missing
  prompt rejection, ScheduleWakeup above/below threshold, CronDelete,
  upstream-ID binding, and `_loop_enabled_for_chat` resolution. Plus
  one sync test for `first_user_message_text` capture in `new_state`.
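
Three of the observer rules above are small enough to sketch directly. The regular expression is the one quoted in this commit; the helper names and signatures are hypothetical, and the real observers take a stream state object rather than bare values.

```python
from __future__ import annotations

import re

JOB_ID_RE = re.compile(r"\bjob ([0-9a-f]{8})\b")


def upstream_cron_id(result_text: str) -> str | None:
    """Extract the 8-hex upstream cron ID from CronCreate result text."""
    match = JOB_ID_RE.search(result_text)
    return match.group(1) if match else None


def should_register_wakeup(
    tool_input: dict, inline_threshold_seconds: int = 300
) -> bool:
    """Register only waits above the threshold; shorter ones stay rendered live."""
    delay = tool_input.get("delaySeconds")
    return isinstance(delay, (int, float)) and delay > inline_threshold_seconds


def loop_enabled(chat_override: bool | None, global_enabled: bool | None) -> bool:
    """Per-chat override first, then the global [loop] enabled, else False."""
    if chat_override is not None:
        return chat_override
    return bool(global_enabled)
```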

All 2615 tests pass. The loop_scheduler observer wiring is now live;
PR1 is still default OFF, and the per-chat toggle UI lands in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(loop): add /config:loop sub-page + home-page button (#289)

The Loop mode toggle is the user-facing master gate for /loop and
ScheduleWakeup observation.  Default OFF — opt-in per chat with an
explicit cost+quota warning before turning ON.

- New `_page_loop()` mirroring `_page_planmode()` shape: tri-state
  per-chat override (On / Off / Clear → fall back to global
  `[loop] enabled`), HTML body explaining behaviour ON vs OFF, "💰 Set
  a budget" deeplink to `config:cu` for one-tap budget setup before
  enabling.
- Engine-aware: only renders for `LOOP_SUPPORTED_ENGINES = {claude}`;
  shows "Only available for Claude Code" message on other engines.
- Home page (Claude only): rearrange the previous Plan-mode + Engine
  layout to slot in `🔁 Loop mode` next to `📡 Listen`, push
  `⚙️ Engine & model` next to `🧠 Effort`, and break `ℹ️ About` onto
  its own row.  Codex / OpenCode / Pi / Gemini / AMP home pages are
  unchanged — no `config:loop` callback rendered.
- Toast labels for `loop:on`/`loop:off`/`loop:clr` callbacks so
  early-answer dispatch shows confirmation immediately.
- 7 new tests in `TestLoopMode`: page renders with toggle + cost
  warning + budget deeplink, hidden for non-Claude, set-on returns
  home, clear resets per-chat override, no-config-path branch,
  home-page button visibility (Claude vs Codex).

All 240 config_command tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(loop): drain integration + /cancel + /new wiring (#289)

Safety-critical wiring so loops survive shutdown cleanly and respond to
user-initiated cancellation.

- src/untether/telegram/loop.py:
  - Install `loop_scheduler` immediately after `at_scheduler`.  Resolve
    `state_path` from `cfg.runtime.config_path.with_name("active_loops.json")`
    so loop state is persisted alongside `last_update_id.json` and
    `active_progress.json`.
  - Wire an `is_chat_busy(chat_id)` callable that scans `running_tasks`
    for refs in the chat — `loop_scheduler._fire` consults it to drop
    iterations when the chat already has a run in flight (mirrors
    upstream's "no catch-up" semantic).
  - Drain integration: `_drain_and_exit` now logs `pending_loops` from
    `loop_scheduler.active_count()` alongside `pending_at`.  The
    task-group cancel propagates into `_arm_timer` sleeps cleanly via
    the cancel-event primitive added in Commit A.
- src/untether/telegram/commands/cancel.py:
  - `handle_cancel` now also drops pending /loop entries for the chat
    when there's no specific reply target.  Reports
    "❌ cancelled N active loops" alongside the existing /at handling.
  - `cancel_pending_for_chat` writes the do-not-resume sentinel for
    each cancelled loop's session_id (handover default — block only
    `loop_scheduler --resume`, NOT `/continue`).
- src/untether/telegram/commands/topics.py:
  - `_cancel_chat_tasks` (called by `/new`) drops loop entries too so
    the "wipe a chat's state" semantics are complete.
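
The drop-on-busy rule can be sketched as below. Everything here is illustrative: the `MessageRef` shape, the `make_is_chat_busy` factory and `fire_or_drop` are hypothetical, and the real `running_tasks` structure is not shown in this commit.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class MessageRef:
    # Hypothetical key type for running_tasks.
    chat_id: int
    message_id: int


def make_is_chat_busy(running_tasks: Iterable[MessageRef]) -> Callable[[int], bool]:
    """Build the callable the scheduler consults before firing."""

    def is_chat_busy(chat_id: int) -> bool:
        # Scans lazily, so refs added later are seen on the next call.
        return any(ref.chat_id == chat_id for ref in running_tasks)

    return is_chat_busy


def fire_or_drop(
    chat_id: int, is_chat_busy: Callable[[int], bool], fire: Callable[[], None]
) -> bool:
    """Fire the iteration unless the chat already has a run in flight.

    A dropped iteration is not queued for later (no catch-up).
    """
    if is_chat_busy(chat_id):
        return False
    fire()
    return True
```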

All 2622 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(loop): document Loop mode + cost interaction (#289)

Five doc files updated as the user-facing surface for Loop mode (default
OFF, opt-in per chat).

- docs/how-to/schedule-tasks.md:
  - New intro callout below H1 stating Loop mode is opt-in and pointing
    to the new section.
  - New "## Loop mode" section between /at and Telegram scheduling
    explaining the observe-and-fire-on-resume architecture, runaway
    caps, cost considerations (cache-warm vs cold per-fire ranges),
    cancel + persistence semantics.
- docs/how-to/cost-budgets.md:
  - Warning callout after "Per-chat overrides" — loop fires count
    toward the same daily/per-run caps; set a budget BEFORE turning
    Loop mode on.
- docs/how-to/troubleshooting.md:
  - New "Loop didn't fire / loop fired too many times" symptom table:
    toggle off, max_iterations, daily_budget_exceeded, "fresh user
    turn" expected behaviour, stale active_loops.json, restore failures.
- docs/faq/index.md:
  - New H2 "Does /loop work via Untether?" answering the question we
    expect to be asked most often. Verified against .claude/rules/help-faq.md:
    13 H2s (above floor of 7), all question-shaped, no TODOs.
- docs/reference/config.md:
  - New `[loop]` section between `[watchdog]` and `[auto_continue]`
    documenting all 7 config keys plus the explicit "cost limits are
    NOT in [loop]" pointer to [cost_budget].

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: changelog entry for /loop + ScheduleWakeup support (#289)

v0.35.4 (unreleased) entry summarising the multi-commit Loop-mode work
landed under #289.  Validation passes (pre-release suffix on
pyproject.toml means validate_release.py skips the strict checks; the
entry is forward-looking for the eventual stable release).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: re-target loop-mode PR to v0.35.3rc9 (#289)

Per Nathan's correction — the /loop and ScheduleWakeup work lands
inside the v0.35.3 milestone train as the next staging rc
(0.35.3rc9), not v0.35.4 as the original handover suggested.  Issue
#289 was already correctly milestoned to v0.35.3 on GitHub.

- pyproject.toml: 0.35.3rc8 → 0.35.3rc9
- uv.lock: re-synced
- CHANGELOG.md: fold the loop-mode entries from a forward-looking
  v0.35.4 (unreleased) block into the existing v0.35.3 (unreleased)
  block (### changes + ### docs subsections)
- docs/how-to/schedule-tasks.md: drop the stray "pre-v0.35.4" version
  string from the intro callout (use "prior-version baseline" instead
  so the prose doesn't drift on each rc)

No code or test changes — full suite still 2622 passed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: unblock dev CI — ruff SIM300 + new pip CVE ignore

Two pre-existing CI failures already on dev's last run (acb6ec0).
Both fixes are tiny and unrelated to loop scope:

- tests/test_telegram_engine_overrides.py:235 — apply ruff's suggested
  rewrite of the SIM300 Yoda-condition assertion (semantically
  identical; literal on the left now).
- .github/workflows/ci.yml:210 — add CVE-2026-6357 to the pip-audit
  ignore list.  pip 26.0.1 has the CVE; fix is pip 26.1 which the uv
  tooling hasn't pulled yet.  Sibling of the existing CVE-2026-3219
  ignore from the same audit pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lanation (#486)

Audited all v0.35.3 user-facing changes against the doc surface and
applied succinct updates to fill the gaps left after rc1–rc9. No
content rewrites — additive updates only.

Reference docs:
- config.md: [progress] heartbeat_interval (#481), [watchdog]
  post_result_idle_timeout / post_result_idle_enabled (#333) +
  bash_grace_seconds (#481), [gemini] skip_trust (#471), hot-reload
  tip on [progress]
- commands-and-directives.md: heartbeat tail note, Loop-mode pointer
- triggers/triggers.md: /config:tg page, /stats triggered/manual,
  /at footer (⏰ at:<token>), 503 paused response, /health paused,
  full Pause/Resume section
- specification.md: version stamp v0.35.1 → v0.35.3
- runners/claude/stream-json-cheatsheet.md: ScheduleWakeup event
  shape (delaySeconds, reason, prompt) + CronCreate notes for
  the Loop observer (#289, #481)
- runners/claude/untether-events.md: supplementary StartedEvent
  with meta={"complete": "✓ turn complete"} after successful result
- runners/amp/untether-events.md: example flipped to
  dangerously_allow_all = false (default since #206)
- runners/pi/runner.md: 0o700 session dir mode (#207)

How-to:
- inline-settings.md: 🔁 Loop mode page section (Claude only,
  cost+quota warning, 💰 Set a budget deeplink)
- verbose-progress.md: long-running tool tail (heartbeat),
  BashOutput/ScheduleWakeup/Monitor verbose detail, hot-reload tip
- webhooks-and-cron.md: full Pause and resume section, /health
  paused state, 503 triggers paused
- troubleshooting.md: post-result closing message + stall
  suppression note
- operations.md: hot-reload list now covers heartbeat_interval +
  bash_grace_seconds + trigger pause/resume
- tutorials/install.md: Gemini --skip-trust headless tip

Explanation:
- architecture.md: TriggerManager pause/resume + triggers/history.py
- module-map.md: at_scheduler.py, loop_scheduler.py, full
  Triggers module section

Top-level:
- README.md: 🔁 Autonomous loops feature row
- SECURITY.md: v0.35.3 hardening bundle subsection covering #377,
  #206, #207, #378, #205/#478, #211, #208, #213, #379, #402, #380,
  #409

No code or test changes. The rename portion of #483
(docs/faq/index.md → docs/faq/faq.md) is deliberately deferred —
the FAQ-protect hook (#477) blocks the move and is self-protected
by release-guard-protect.sh, so the rename + hooks updates need
manual unblock from a human.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #483. Renames the FAQ file so the help-centre URL becomes
`/help/untether/faq/` instead of `/help/untether/index/`. The
marketing-site docs-sync derives its slug from the filename, so
this rename + the matching mapping update on
`littlebearapps/littlebearapps.com` are what produce the cleaner
URL. AI-citation surface (ChatGPT, Perplexity, Google AI Overviews)
is unaffected: those systems read the FAQPage JSON-LD schema, so the
URL does not matter to them.

What changed in this PR:
- `git mv docs/faq/index.md docs/faq/faq.md` (no content changes)
- `.claude/hooks/help-faq-protect.sh` rewritten to protect the new
  filename. The file is self-protected from Edit/Write, so the hook
  was temporarily disabled for the git mv, rewritten via a Bash
  heredoc with the faq.md content, then restored.
- `.claude/rules/help-faq.md` — sweep to faq.md
- `.claude/rules/release-discipline.md` — sweep
- `CHANGELOG.md` — historical reference + #483 link added
- `CLAUDE.md` — both occurrences updated; historical reference kept

Out of scope (need manual touch from Nathan — Claude Code is
blocked by release-guard-protect.sh):
- `.claude/hooks.json` — two prompt-text occurrences of
  `docs/faq/index.md` (lines 49, 69). Non-functional (prompt
  guidance only), but cosmetically stale.
- `.claude/hooks/release-guard-protect.sh` — line 31 deny() error
  message references `docs/faq/index.md`. Also non-functional,
  cosmetically stale.

Both protected files only carry stale text that has zero effect
on hook behaviour. The functional FAQ-protect hook itself now
correctly protects `docs/faq/faq.md`.

Marketing-site follow-up (separate PR over there):
- Update `scripts/docs-sync.config.ts` `untether → docs/faq`
  mapping to track the new filename.
- 301 redirect from `/help/untether/index/` → `/help/untether/faq/`
  for backlink hygiene.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_use schema gap (#489) (#490)

* fix: AskUserQuestion multi-question text-reply path no longer crashes Untether (#488)

Observed live on staging (@hetz_lba1_bot, v0.35.2) on 2026-05-08 06:43:11 UTC:
unhandled TypeError in route_message kills the entire process when the user
answers question 1 of N via the "Other" → text-reply path.

The buggy path constructed a RenderedMessage for the next question's option-
button keyboard and passed it to a send_plain partial whose text: kwarg
expects str, raising:

    TypeError: sequence item 0: expected str instance, RenderedMessage found

inside markdown.assemble_markdown_parts. systemd auto-restarted in ~10s and
offset_persistence.py prevented Telegram update loss, but ALL active runs
across all chats were lost.
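
The failure mode can be reproduced in isolation. This is a sketch, not
Untether's code: `RenderedMessage` and `assemble_parts` below are
hypothetical stand-ins, and the only thing taken from the traceback is
that a non-str object reached a `str.join` call.

```python
from dataclasses import dataclass


@dataclass
class RenderedMessage:
    """Hypothetical stand-in for the rendered-message type."""

    text: str


def assemble_parts(*parts: str) -> str:
    """Stand-in for a helper that joins text fragments with str.join."""
    return "\n\n".join(parts)


try:
    # Passing the object itself instead of its .text triggers the error.
    assemble_parts(RenderedMessage(text="Question 2 of 3"))
    error = ""
except TypeError as exc:
    error = str(exc)

print(error)
```

Type hints alone do not stop this at runtime, which is why the fix
routes the RenderedMessage to a call that accepts it rather than to a
str-only parameter.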

Refactor: extract the multi-question continuation logic into a module-level
helper send_next_ask_question_message in telegram/commands/ask_question.py
that calls transport.send directly with a RenderedMessage carrying HTML
parse_mode + inline_keyboard + reply_to / thread_id SendOptions.
route_message calls the helper for the text-reply continuation path; the
callback-button continuation path (the same file's AskQuestionCommand)
still edits in place via ctx.executor.edit (unchanged).

Tests: 2 new regressions in tests/test_ask_user_question.py covering the
RenderedMessage shape, inline_keyboard presence, and SendOptions thread_id
both with and without a forum thread. Full suite: 2624 passed, 82.32% cov.

Closes #488

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: claude schema accepts server_tool_use + advisor_tool_result content blocks (#489)

Anthropic server-side tools (web_search, code_execution, computer_use, …)
emit `server_tool_use` content blocks in routine v2.x sessions; the parent
agent's `advisor()` meta-tool emits `advisor_tool_result` blocks. Untether's
msgspec schema didn't know either tag, so `decode_stream_json_line` raised
ValidationError and the runner silently dropped the entire JSONL line —
no progress action in Telegram, no entry in `state.pending_actions`,
no input to verbose-mode rendering or cost tracking. Sampling 24h of
staging traffic (2026-05-08) showed paired events firing across 5 different
projects (auditor-toolkit, scout, brand-copilot, aushistory) and 5 sessions.

Schema (src/untether/schemas/claude.py): add `StreamServerToolUseBlock`
(mirrors StreamToolUseBlock: id/name/input) and `StreamAdvisorToolResultBlock`
(mirrors StreamToolResultBlock: tool_use_id/content/is_error). Extend
`StreamContentBlock` union; parent message bodies pick up the new types
for free since they reference the union.
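
To illustrate why an unknown tag drops the whole line, here is a
stdlib-only sketch that mimics tagged-union decoding. It does not use
msgspec and is not the project's schema; `ToolUseBlock`,
`ServerToolUseBlock`, `decode_block`, and the two registries are
hypothetical names chosen for the example.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass
class ToolUseBlock:
    id: str
    name: str
    input: dict[str, Any]


@dataclass
class ServerToolUseBlock(ToolUseBlock):
    """Same fields as ToolUseBlock, selected by a different tag."""


# Tag -> type registries standing in for the union before and after the fix.
OLD_UNION = {"tool_use": ToolUseBlock}
NEW_UNION = {**OLD_UNION, "server_tool_use": ServerToolUseBlock}


def decode_block(raw: str, union: dict[str, type]) -> ToolUseBlock:
    """Decode one JSON content block, rejecting tags the union lacks."""
    data = json.loads(raw)
    tag = data.pop("type", None)
    if tag not in union:
        raise ValueError(f"unknown content block type: {tag!r}")
    return union[tag](**data)


line = (
    '{"type": "server_tool_use", "id": "srv_1", '
    '"name": "web_search", "input": {"query": "untether"}}'
)

try:
    decode_block(line, OLD_UNION)
    rejected = False
except ValueError:
    rejected = True  # the old union refuses the block outright

block = decode_block(line, NEW_UNION)
print(rejected, type(block).__name__, block.name)
```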

Translate (src/untether/runners/claude.py): widen the assistant-message
match arm so server_tool_use shares the existing tool_use body
(_register_background_handle and _observe_loop_tool_use already filter
on tool name and no-op cleanly for unrecognised server tools); widen the
user-message isinstance check so advisor_tool_result shares the existing
tool_result body. No new helpers, no new branches.

Tests: 3 schema round-trip tests in test_claude_schema.py
(test_decode_server_tool_use_block, test_decode_advisor_tool_result_block,
test_decode_advisor_tool_result_block_minimal), 2 translation tests in
test_claude_runner.py (test_translate_server_tool_use_block,
test_translate_advisor_tool_result_block) covering pending_actions
lifecycle and last_tool_use_id stamping.

Full suite: 2629 passed, 2 skipped, 82.32% coverage.

Closes #489

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: staging 0.35.3rc10 (#488, #489)

Bumps the dev branch to 0.35.3rc10 so the TestPyPI publish triggered by
this dev push actually publishes (skip-existing would no-op at rc9).
Bundles the AskUserQuestion multi-question crash fix (#488) and the
server_tool_use / advisor_tool_result schema gap (#489).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(claude-schema): allow tool_result.content to be a single dict (#501)

Claude Code emits tool_result and advisor_tool_result content blocks
where the inner `content` field is a single object (e.g.
`{"type": "text", "text": "..."}`) instead of the documented
str / list[dict] / null shapes. msgspec's schema only allowed the
documented shapes, so the line was rejected with ValidationError and
silently dropped via `jsonl.msgspec.invalid` warning — losing tool
tracking for that turn.

Add `dict[str, Any]` to the union on both StreamToolResultBlock and
StreamAdvisorToolResultBlock. _normalize_tool_result already handles
the dict shape, so no runner code change needed. Two new regression
tests in test_claude_schema.py cover both block types with dict content.
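
A sketch of handling all four `content` shapes in one place. This is
not the project's `_normalize_tool_result`, whose body is not shown in
this PR; `content_to_text` is a hypothetical helper, and the assumption
that only `text` blocks contribute to the output is mine.

```python
from typing import Any, Union

# The documented shapes plus the single-dict shape described above.
ToolResultContent = Union[str, list[dict[str, Any]], dict[str, Any], None]


def content_to_text(content: ToolResultContent) -> str:
    """Flatten any accepted content shape into plain text."""
    if content is None:
        return ""
    if isinstance(content, str):
        return content
    if isinstance(content, dict):
        content = [content]  # treat a lone block as a one-item list
    return "\n".join(
        str(part.get("text", ""))
        for part in content
        if part.get("type") == "text"
    )


single = content_to_text({"type": "text", "text": "ok"})
many = content_to_text([{"type": "text", "text": "a"}, {"type": "text", "text": "b"}])
print(repr(single), repr(many))
```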

Verified against staging logs (14 occurrences today) and live-tested
against @untether_dev_bot — 0 msgspec errors after restart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runner): close proc.stderr after reader_done to unblock task group (#502)

When a Claude Code subprocess exits cleanly post-result event, the
runner's task group can block forever waiting on `drain_stderr`. The
cause: MCP server child processes inherit the parent's stderr fd and
keep it open. `iter_bytes_lines` then never sees EOF, `drain_stderr`
never returns, the task group never exits, and `proc.wait()` is never
reached — leaving the watchdog as the only safety net (and the watchdog
wrongly marks the session as cancelled/failed despite a clean rc=0).

Fix: close the parent's read end of stderr explicitly after
`reader_done.set()`. `iter_bytes_lines` already catches the resulting
ClosedResourceError and returns from drain_stderr, letting the task
group complete and proc.wait() report rc.

Applied to both call sites:
  - src/untether/runner.py (base runner, all engines)
  - src/untether/runners/claude.py (Claude override has its own block)

Verified live on @untether_dev_bot:
  - subprocess.exit pid=1394555 rc=0 fired immediately after result
  - session.summary cancelled=False ok=True (was: cancelled=True ok=False
    in the #502 timeline)
  - total elapsed 33s vs the 326.7s peak_idle in the bug report

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(settings): demote config.loaded INFO → DEBUG (#498)

`load_settings_if_exists()` is called per-helper (footer, watchdog,
progress, auto_continue, preamble, budget) on every handle_message —
fires 4–6 times per processed message by design (#269 hot-reload).
INFO level floods structlog at ~80 events per session, triggering
monitor `config_loaded_burst` alerts even though the underlying
behaviour is correct.

Demote to DEBUG. The reload behaviour is preserved (config edits still
apply on the next run without restart). The proper fix — caching
settings within handle_message to do one parse instead of N — is
deferred to v0.35.4 (#506) since it touches helper signatures and is
out of bug-fix-rc11 scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runner): break stdout read after CompletedEvent in base runner (#505)

Mirrors Claude's override (added during #502). Without the break, any
non-Claude engine subprocess that emits its terminal event AND has a
child inheriting the stdout fd (MCP server, backgrounded shell) blocks
on iter_json_lines waiting for an EOF that never comes; proc.wait()
is then never reached and the task group hangs.

Per-engine audit (codex/opencode/pi/gemini/amp) confirms each emits
exactly one terminal event with no post-completion events, so the
unconditional break is safe.
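
The pattern can be sketched with plain asyncio. This is an
illustration under assumptions, not the runner's code: `Event`,
`fake_stdout_events`, and `collect_until_completed` are hypothetical,
and the never-ending wait simulates a pipe that a child process keeps
open after the terminal event.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class Event:
    kind: str


async def fake_stdout_events() -> AsyncIterator[Event]:
    """Yield two events, then hang the way a held-open pipe would."""
    yield Event("started")
    yield Event("completed")
    await asyncio.Event().wait()  # EOF never arrives


async def collect_until_completed() -> list[str]:
    seen: list[str] = []
    stream = fake_stdout_events()
    try:
        async for event in stream:
            seen.append(event.kind)
            if event.kind == "completed":
                break  # stop reading instead of waiting for EOF
    finally:
        await stream.aclose()
    return seen


# The timeout would fire if the loop kept waiting for EOF.
result = asyncio.run(asyncio.wait_for(collect_until_completed(), timeout=2.0))
print(result)
```

Without the `break`, the third iteration would block on the simulated
pipe and the 2-second timeout would raise instead of returning.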

Closes #505.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(claude): re-emit ExitPlanMode plan body + dead-wakeup idle shortcut (#508, #507)

#508 — Plan-mode research/audit runs no longer surface a short final
Telegram message that just points to a plan file. Capture the
ExitPlanMode plan body from tool_use.input.plan onto the new
ClaudeStreamState.last_exitplanmode_plan field; the bridge re-emits
it in the final answer when the post-approval result doesn't already
contain it. Live impact: 5m30s scout-project research run on staging
v0.35.3rc10 produced a 584-char brief acknowledgement instead of the
substantive findings.

#507 — ScheduleWakeup outside /loop dynamic mode no longer holds the
session alive indefinitely. New parallel state.live_wakeups_arm_delay
captures the original delaySeconds at arm time;
_post_result_idle_watchdog cuts its effective timeout to
min(timeout_s, max_armed_delay + 60s) when a wakeup is armed AND
_loop_enabled_for_chat is False. Live impact: session 845cfcc3-…
sat post-result idle for 58 minutes before manual /cancel.
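
The timeout rule above can be written as a small pure function. This
is a sketch: `effective_idle_timeout` and its parameter names are
hypothetical, and only the formula
min(timeout_s, max_armed_delay + 60s) and its two conditions come from
the commit message.

```python
def effective_idle_timeout(
    timeout_s: float,
    armed_delays_s: list[float],
    loop_enabled: bool,
    grace_s: float = 60.0,
) -> float:
    """Shorten the idle timeout when a wakeup is armed outside /loop."""
    if not armed_delays_s or loop_enabled:
        return timeout_s  # nothing armed, or /loop owns the wakeup
    return min(timeout_s, max(armed_delays_s) + grace_s)


shortcut = effective_idle_timeout(3600.0, [120.0, 300.0], loop_enabled=False)
in_loop = effective_idle_timeout(3600.0, [120.0, 300.0], loop_enabled=True)
print(shortcut, in_loop)  # 360.0 3600.0
```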

Per CLAUDE.md (testing-conventions.md): 4 new tests in
test_claude_runner.py — capture, ignore-empty (#508), dead-wakeup
shortcut, /loop preserves default (#507).

Refs #507, #508. The bridge re-emit and preamble revisions for #508
ship in the next commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(bridge): preamble plan-mode clauses + ExitPlanMode plan-body re-emit (#508)

Layer A — _DEFAULT_PREAMBLE gains a Plan-mode requirements section:

- (A1) ExitPlanMode plan parameter MUST contain a 3-5 bullet
  substantive summary, never just a file path
- (A2) post-approval next assistant message MUST repeat the
  substantive findings (plan-body messages disappear after approval)
- (A3) ### Plan/Document Created bullet asks for inline key findings,
  not just a path pointer

Layer E — replace the dead-code _outline_prefix matcher in
handle_message with the new _prepend_exitplanmode_plan helper that
prepends the plan body (captured in state.last_exitplanmode_plan)
with a 📋 Plan (approved): header + separator when the post-approval
final answer doesn't already contain it. Substring-only gate (no
length threshold — live repro had answer_len=584).

8 new tests in tests/test_preamble.py: A1/A2/A3 clauses present, plus
5 _prepend_exitplanmode_plan cases (short final, substring-skip,
no-plan, empty, None final).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): rc11 entries for #505, #507, #508

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: staging 0.35.3rc11

Bumps version to 0.35.3rc11 for TestPyPI staging release.

This rc bundles three monitor/bridge fixes already on this branch:
- #505: base runner _iter_jsonl_events breaks loop after CompletedEvent
- #507: dead ScheduleWakeup outside /loop no longer holds session
- #508: ExitPlanMode plan body re-emit + preamble plan-mode clauses

Local CI mirror: ruff format/check clean, 2644 tests passing,
build + twine check PASSED on both sdist and wheel.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): satisfy PERF401 in test_base_iter_jsonl_breaks_on_did_emit_completed

CI ruff check failed on the new #505 regression test — the local pre-flight
only ran `ruff check src/` whereas CI runs the whole repo. Replaces the
explicit append loop with an async-comprehension list initialiser, keeping
`anyio.fail_after(2.0)` wrapping the iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(rc12) (#511)

Closes the cross-chat plan-body leak observed on staging v0.35.3rc11. Moves the #508 _prepend_exitplanmode_plan from the racy bridge read of runner.current_stream to the per-stream StreamResultMessage translation path in claude.py. Three new regression tests cover the per-stream prepend, concurrent-state isolation, and error-path skip. Live smoke on @untether_dev_bot confirmed #508 UX preserved.
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.3 to 4.35.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](github/codeql-action@e46ed2c...68bde55)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 4.35.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…516)

Closes the rc11/rc12 over-correction on #508 that produced 25k–42k char
(~8–12 Telegram message) finals on staging plan-mode research/audit
runs. User report (Nathan, 2026-05-12): "I had a summary from Claude
Code yesterday which was 11 Telegram messages long!! What I really
want back is to have Claude Code provide summaries like we have here
in command line — summaries of plans (not the entire plan), summaries
of recommendations and/or findings and/or next steps (where relevant)."

Three stacked over-shoots in rc11/rc12:

1. A1 preamble: "expand the bullets into a substantive summary" for
   research/audit → plan body ballooned to 2–5k chars.
2. A2 preamble: "your next assistant message ... MUST repeat the
   substantive findings" → post-approval text ballooned to 0.5–2k
   chars AND was paraphrased rather than literal-copied.
3. Layer E: substring-skip rule (body in final_answer) failed on every
   paraphrased run, so the plan body was unconditionally concatenated
   in front of the post-approval text.
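
Point 3 is easy to see in a sketch of a substring-only gate. The
function below is hypothetical and only approximates the rc11 helper;
the header text is quoted from this PR, while the newline separator and
the sample strings are assumptions.

```python
HEADER = "📋 Plan (approved):"


def prepend_plan_substring_gate(final_answer: str, plan_body: str) -> str:
    """Prepend the plan unless the answer already contains it verbatim."""
    if not plan_body or plan_body in final_answer:
        return final_answer
    return f"{HEADER}\n{plan_body}\n\n{final_answer}"


plan = "- Audit the trigger scheduler\n- Fix the retry path"
literal = "Done.\n" + plan
paraphrase = "I audited the scheduler and fixed the retry handling."

kept = prepend_plan_substring_gate(literal, plan)
grown = prepend_plan_substring_gate(paraphrase, plan)
print(kept == literal, grown.startswith(HEADER))  # True True
```

A paraphrased answer never contains the plan verbatim, so the gate
always falls through to concatenation.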

Evidence from `journalctl --user -u untether.service` (last 48h on
staging @hetz_lba1_bot v0.35.3rc12): aushistory finals at 14k / 16k /
28k / 35k / 42k chars; scout finals at 26k / 27k chars. The 42k case
matches the 11-message user repro. Telegram MCP `search_messages` for
the literal "📋 Plan (approved):" returned hits on every recent
plan-mode completion in both chats — confirming Layer E was the
load-bearing over-firer.

rc13 retuning:

- A1 → "concise 3–5 bullet summary; plan is shown for approval, not
  as the final deliverable" (drops the substantive-expansion license).
- A2 → "brief CLI-style summary, 3–7 bullets or 1–2 short paragraphs,
  ~500–1500 chars, do NOT re-paste the full plan content".
- A3 (## Summary Plan/Document Created bullet) → "Path AND a 3–5
  bullet headline summary, not a re-paste of the full content". Note:
  A3 affects the ## Summary block on ALL completed work, not just
  plan-mode runs — intentional, matches user's stated goal.
- _prepend_exitplanmode_plan: substring check replaced with a length
  gate (`len(final_answer) < 600`). Substring check stays as a cheap
  belt-and-braces second skip. Plan body is capped at 1500 chars +
  truncation marker so a runaway body can't ship 30k chars even when
  Layer E does fire (preserves original #508 UX for genuinely empty
  post-approval results without re-introducing concatenation).
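
A sketch of the retuned gate, assuming the behaviour described in the
last bullet. `prepend_capped_plan` is a hypothetical name, and the
separator and truncation-marker wording are assumptions; the 600-char
gate, the 1500-char cap, and the header text come from this commit
message.

```python
from typing import Optional

HEADER = "📋 Plan (approved):"
SHORT_ANSWER_CHARS = 600  # length gate from the commit message
PLAN_BODY_CAP = 1500  # body cap from the commit message
TRUNCATION_MARKER = "\n… (plan truncated)"  # hypothetical wording


def prepend_capped_plan(final_answer: Optional[str], plan_body: Optional[str]) -> str:
    """Prepend a capped plan body only when the final answer is very short."""
    answer = final_answer or ""
    if not plan_body:
        return answer
    if len(answer) >= SHORT_ANSWER_CHARS:
        return answer  # substantive answer: leave it alone
    if plan_body in answer:
        return answer  # second skip: plan already present verbatim
    body = plan_body
    if len(body) > PLAN_BODY_CAP:
        body = body[:PLAN_BODY_CAP] + TRUNCATION_MARKER
    return f"{HEADER}\n{body}\n\n{answer}"


substantive = "x" * 1019
runaway_plan = "p" * 30_000

skipped = prepend_capped_plan(substantive, runaway_plan)
capped = prepend_capped_plan("See plan file.", runaway_plan)
print(len(skipped), len(capped) < 2000)  # 1019 True
```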

Live verification on @untether_dev_bot (test chat -5284581592):

- Primed test (with "keep it short" instruction): answer_len=882
  chars (~1 Telegram message), no "📋 Plan (approved):" literal.
- Unprimed test (default research-task prompt): answer_len=1019 chars
  — preamble is doing its job without user help. Layer E correctly
  skipped (1019 > 600). Quality verified: 3 substantive bullets +
  ## Summary block with Completed / Next Steps.

The original #508 fallback path (Claude exits with very short post-
approval text → Layer E fires with capped plan body) is unit-tested
only; not live-verified because the new preamble makes it almost
impossible to repro intentionally.

Tests: 7 new/updated in tests/test_preamble.py (regression-locks the
rc11 verbosity-driving phrases out of _DEFAULT_PREAMBLE, plus
length-gate / body-cap / substring-skip cases) and 2 in
tests/test_claude_runner.py (`test_translate_result_skips_prepend_
when_answer_substantive`, `test_translate_result_caps_long_plan_body_
when_prepending`). Full suite: 2652 passed, 2 skipped, 82.38%
coverage. ruff format + check clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates the requirements on [uv-build](https://github.com/astral-sh/uv) to permit the latest version.
- [Release notes](https://github.com/astral-sh/uv/releases)
- [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md)
- [Commits](astral-sh/uv@0.9.18...0.11.13)

---
updated-dependencies:
- dependency-name: uv-build
  dependency-version: 0.11.13
  dependency-type: direct:development
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add "From the blog" section between Acknowledgements and Licence with
links to two littlebearapps.com posts directly related to Untether
(Coding from the park, Dogfooding bugs tests can't find). Uses raw HTML
anchors with target="_blank" rel="noopener noreferrer" so links open in
a new tab on GitHub (PyPI may strip target — acceptable).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
coderabbitai Bot commented May 15, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: de60b7a1-0861-48b2-bbb7-fb89d67dc46f

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Base automatically changed from dev to master May 26, 2026 06:20