docs: pre-1.0 patch judgement note + Pi model footer matrix fix by nathanschram · Pull Request #469 · littlebearapps/untether

Nathan Schram (nathanschram) · 2026-05-03T05:42:35Z

Summary

Add Pre-1.0 minor judgement paragraph to .claude/rules/release-discipline.md clarifying when opt-in additive features can ship in patches vs minors. Codifies precedent already in use (#409 shipped a config addition in a patch).
Fix engine compatibility matrix in README.md: Pi Model in footer cell changes from — to ✅⁷ with footnote 7 explaining the supplementary StartedEvent mechanism. Code at src/untether/runners/pi.py:290-307 already does this (since #225); the matrix was stale.

Why

Both fixes surfaced during the Pi engine audit that produced the upcoming roadmap:

9 new Pi issues created and assigned: #460 (bug(pi): wire AutoRetry events (schema-defined but never handled)) through #468 (feat(pi): /providers Telegram command — list providers + auth status), spread across v0.35.4-v0.35.7
2 existing Future issues promoted: #170 (feat: Pi interactive approval via RPC mode (--mode rpc)) → v0.35.8, and #180 (feat: Pi extension bridge — lifecycle hooks via extension system) → v0.35.9
Empty v0.36.0 milestone closed

The judgement-note edit removes ambiguity for the v0.35.8 RPC runner decision (#170). The matrix fix is a pure code-doc reconciliation — no functional change.

Test plan

No code changed — docs only
Format: git diff --stat → 2 files, 4 insertions, 1 deletion
Local lint (skipped — no Python/code files touched)

Notes

The driving roadmap doc lives at docs/plans/2026-05-02-pi-engine-enhancements.md (local — docs/plans/ is gitignored)
See .untether-outbox/pi-issue-drafts.md for the full body drafts of the issues created as #460 (bug(pi): wire AutoRetry events (schema-defined but never handled)) through #468 (feat(pi): /providers Telegram command — list providers + auth status)

🤖 Generated with Claude Code

Enables `[claude] extra_args = ["--chrome"]` so Untether-spawned Claude Code sessions can opt into the Claude-in-Chrome extension — previously the `mcp__claude-in-chrome__*` tool namespace was absent from Untether sessions because Claude Code 2.1.x gates it behind `--chrome` / `CLAUDE_CODE_ENABLE_CFC=1`, and Untether never passed the flag. Mirrors `codex.extra_args` and `pi.extra_args`.

Flags Untether manages internally (`-p`, `--print`, `--output-format`, `--input-format`, `--resume`/`-r`, `--continue`/`-c`, `--permission-mode`, `--permission-prompt-tool`) are rejected at config-load with a `ConfigError` so duplicate-argv surprises fail fast. User args land on argv after the managed stream-json prelude and before resume / model / effort / allowed-tools / permission flags, preserving the trailing `-p <prompt>` (or stdin prompt under permission-mode) position.

- src/untether/runners/claude.py: add `extra_args` field, thread through `build_args`, parse + validate in `build_runner`
- tests/test_build_args.py: +8 tests (argv ordering, permission-mode argv, multi-flag order, build_runner parsing, reserved-flag rejection for individual flags and `key=value` prefixes)
- docs/reference/config.md, docs/reference/runners/claude/runner.md: document the new key, including reserved-flag list
- CHANGELOG.md: v0.35.3 (unreleased) entry

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
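The reserved-flag rejection described above can be sketched roughly as follows. This is a minimal illustration rather than the code in `build_runner`: `RESERVED_FLAGS`, `validate_extra_args`, and the local `ConfigError` stand-in are hypothetical names, and the `key=value` handling is inferred from the test description.

```python
# Hypothetical sketch; names here are illustrative, not Untether's actual code.
RESERVED_FLAGS = frozenset({
    "-p", "--print", "--output-format", "--input-format",
    "--resume", "-r", "--continue", "-c",
    "--permission-mode", "--permission-prompt-tool",
})


class ConfigError(ValueError):
    """Stand-in for the project's ConfigError."""


def validate_extra_args(extra_args: list[str]) -> list[str]:
    """Return a copy of extra_args, rejecting flags Untether manages itself."""
    for arg in extra_args:
        # Treat "--resume=abc" the same as "--resume".
        flag = arg.split("=", 1)[0]
        if flag in RESERVED_FLAGS:
            raise ConfigError(f"extra_args may not contain managed flag {flag!r}")
    return list(extra_args)
```

Under these assumptions, `validate_extra_args(["--chrome"])` passes through unchanged, while `validate_extra_args(["--resume=abc"])` raises at config-load time.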

* chore: staging 0.35.3rc1

  Stage Claude extra_args (#407) for TestPyPI. This rc1 is the wheel the Mac Untether instance will install to validate Claude-in-Chrome end-to-end per docs/audits/2026-04-21-claude-in-chrome-test-plan.md.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* deps: bump lxml 6.0.2→6.1.0 and python-dotenv 1.2.1→1.2.2

  pip-audit flagged two new transitive CVEs after PR #408 merged:

  - lxml 6.0.2: CVE-2026-41066 (fix 6.1.0) — pulled via sulguk
  - python-dotenv 1.2.1: CVE-2026-28684 (fix 1.2.2) — pulled via pydantic-settings

  Both have clean fixes. Lockfile-only change; pyproject.toml constraints unchanged. Local pip-audit clean after bump.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(security): Group 1A hygiene — 8 issues

  Bundles eight low-risk security hygiene fixes for v0.35.3:

  - #205 — split runner.start log so prompt content stays at DEBUG
  - #206 — flip AMP dangerously_allow_all default to False (opt-in only)
  - #207 — Pi session dir created with mode 0o700 + chmod existing
  - #208 — extend stderr sanitisation to /Users, /private/var, /tmp, /var, /opt, /srv, /etc, /usr/local, /app, /workspace, /root
  - #211 — replace stat()+read_bytes() with capped streaming read in anyio worker thread; closes TOCTOU window on /file get
  - #213 — add OPENAI_PROJECT_KEY_RE for sk-proj-... redaction (the underscore/hyphen char set is not covered by the generic sk- pattern)
  - #402 — bump Pygments 2.19.2 → 2.20.0 via uv lock (CVE-2026-4539 ReDoS, transitive)
  - #403 — replace 123456789:ABCdef… placeholder bot tokens with <BOT_ID>:<BOT_TOKEN> in non-test paths (onboarding.py, install.md, llms-full.txt); test fixtures kept as-is for GitHub-UI dismissal

  All 2410 tests pass; ruff check + format clean; uv lock --check ok.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ci: silence bandit B108 false positive + ignore CVE-2026-3219

  - bandit B108 fires on the new /tmp/ regex pattern in _PATH_PATTERNS at runner.py — regex for stderr redaction, not a hardcoded temp-file write. Suppressed with `# nosec B108` matching the existing render.py:111 pattern.
  - pip-audit now flags pip 26.0.1 → CVE-2026-3219 (advisory published recently; no fix available upstream). Added to the --ignore-vuln list alongside CVE-2026-4539 (pygments — kept for posterity even though #402 lockfile bump fixed it).

  No source/test code changes. CI-only.

  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

) `_daily_cost` is a module-level tuple updated via read-modify-write in record_run_cost(). Concurrent finalize_run callers could both read (today, X), both write (today, X + cost), and lose one run's cost — letting a malicious or runaway concurrent workload defeat the per-day budget gate.

Fix: wrap the RMW block in a `threading.Lock`. Critical section is a single tuple assignment (sub-microsecond), so the lock is fine under both async (cooperative) and threaded callers without an async-signature ripple. get_daily_cost() also acquires the lock for snapshot consistency.

Trade-off note: kept the function sync rather than pivoting to `anyio.Lock` because that would require updating the 6 sync test call sites and the 1 sync caller in runner_bridge.py — needless churn for a sub-microsecond critical section.

Test: new ThreadPoolExecutor-driven fuzz test (16 workers, 200 calls) asserts the observed total equals n * unit_cost — would fail under racing RMW.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
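A minimal sketch of the lock-guarded read-modify-write described above. For self-containment it keeps the state in a small hypothetical `DailyCostTracker` class rather than the module-level tuple the commit mentions; the method names mirror `record_run_cost()` and `get_daily_cost()`, but the bodies are assumptions.

```python
import threading
from datetime import date


class DailyCostTracker:
    """Illustrative stand-in for the module-level `_daily_cost` tuple."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._daily_cost: tuple[date, float] = (date.today(), 0.0)

    def record_run_cost(self, cost: float) -> None:
        today = date.today()
        # The whole read-modify-write happens under the lock, so two
        # concurrent callers cannot both read the same starting total.
        with self._lock:
            day, total = self._daily_cost
            if day != today:
                total = 0.0  # new day: start a fresh total
            self._daily_cost = (today, total + cost)

    def get_daily_cost(self) -> float:
        # Take the lock so the (day, total) pair is read as one snapshot.
        with self._lock:
            day, total = self._daily_cost
        return total if day == date.today() else 0.0
```

Because the critical section is only a tuple read and assignment, a plain `threading.Lock` keeps the functions synchronous, which matches the trade-off noted in the commit.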

Brings the voice transcription API key into parity with `bot_token` (closed #196): SecretStr masks the value in repr()/str()/tracebacks and any accidental structlog serialisation. Access the raw value via `.get_secret_value()` at the transport boundary.

Changes:

- `settings.py`: field type `NonEmptyStr | None` → `SecretStr | None`; new `_validate_voice_key_not_empty` validator preserves the prior no-empty-string contract by round-tripping `""`/whitespace to None
- `telegram/bridge.py`: `TelegramBridgeConfig.voice_transcription_api_key` annotation → `SecretStr | None`; `update_from()` unchanged (assigns SecretStr to SecretStr)
- `telegram/loop.py:2208`: sole unwrap point — call `.get_secret_value()` only when non-None before passing to `transcribe_voice` (OpenAI SDK still wants raw `str | None`)
- `telegram/voice.py`: unchanged; boundary stays at the loop caller

Tests:

- `test_settings.py`: new `test_voice_transcription_api_key_is_secret_str` (round-trip + repr/str masking), `_empty_string_normalised_to_none` (whitespace → None), `_default_none` (omitted → None)
- `test_bridge_config_reload.py`: hot-reload tests updated to use `.get_secret_value()` for value comparison
- `test_telegram_backend.py`: updated build_and_run assertion

All 2413 tests pass; ruff check + format clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Bump rc1 → rc2 to publish a fresh staging wheel that includes:

- #431 — Group 1A security hygiene (8 issues: #205, #206, #207, #208, #211, #213, #402, #403)
- #432 — #379 daily cost tracker race (threading.Lock guard)
- #433 — #378 voice_transcription_api_key SecretStr

rc1 (b6c6ad6) only carried #407 (Claude extra_args). rc2 supersedes it on TestPyPI.

No CHANGELOG entry — per release-discipline.md §"Staging / rc versions", entries batch into the stable bump.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ult (#409) (#435)

Self-installed Untether users in heterogeneous environments need to thread credential-manager tokens (1Password, Doppler, Vault, Infisical, …) into engine subprocesses. Today the env allowlist is hard-coded in `utils/env_policy.py` so adding a single var requires a fork + release.

Changes:

- `utils/env_policy.py`:
  - new `is_allowed_with_extras(name, extra_exact=, extra_prefix=)`
  - `filtered_env()` extended with `extra_prefix=` parameter
  - new `log_user_extensions_once()` — module-level latch emits one `env_policy.user_extension` INFO per process when user extras are active, so the operator sees the addition in journalctl
- `settings.py` `SecuritySettings`:
  - `env_extra_allow: list[str]` (default `[]`)
  - `env_extra_prefix_allow: list[str]` (default `[]`)
  - field validators reject empty/whitespace and enforce `[A-Z_][A-Z0-9_]*`
- `runners/claude.py`, `runners/pi.py`:
  - new `_load_env_extras()` helper (best-effort settings load — never blocks a run on a config error, mirrors the env_audit pattern)
  - threads extras through `filtered_env()` + `log_user_extensions_once()`
- `utils/env_audit.py` `audit_proc_env()`:
  - new `user_extra_exact=`/`user_extra_prefix=` params so user-allowed names aren't false-flagged as `claude.env_audit.leaked_var`
- Built-in defaults: `BWS_ACCESS_TOKEN` promoted into `_EXACT_ALLOW` (Bitwarden Secrets Manager — common enough to ship as a default).
- Docs: `docs/reference/config.md` `[security]` table, CLAUDE.md features list.

Tests: +19 across `tests/test_env_policy.py` (8 user-extension cases + log latch), `tests/test_env_audit.py` (4 user-extras cases), and `tests/test_settings.py` (7 round-trip + validator cases). `uv run pytest` → 2432 passed, 2 skipped; ruff clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
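The user-extensible allowlist above might look roughly like this. The function names and keyword parameters come from the commit message, but the bodies, the contents of `_EXACT_ALLOW` (other than `BWS_ACCESS_TOKEN`), `_PREFIX_ALLOW`, and the `validate_env_names` helper are assumptions made for illustration.

```python
import re
from collections.abc import Iterable, Mapping

# Assumed built-in defaults; only BWS_ACCESS_TOKEN is confirmed by the commit.
_EXACT_ALLOW = frozenset({"PATH", "HOME", "BWS_ACCESS_TOKEN"})
_PREFIX_ALLOW: tuple[str, ...] = ("LC_",)
_NAME_RE = re.compile(r"[A-Z_][A-Z0-9_]*")


def validate_env_names(names: Iterable[str]) -> list[str]:
    """Hypothetical validator: reject empty, whitespace, or malformed names."""
    checked = list(names)
    for name in checked:
        if not _NAME_RE.fullmatch(name):
            raise ValueError(f"invalid env var name: {name!r}")
    return checked


def is_allowed_with_extras(
    name: str,
    extra_exact: Iterable[str] = (),
    extra_prefix: Iterable[str] = (),
) -> bool:
    """True if name is allowed by the built-ins or by the user extras."""
    if name in _EXACT_ALLOW or name in set(extra_exact):
        return True
    # str.startswith accepts a tuple of candidate prefixes.
    return name.startswith(_PREFIX_ALLOW + tuple(extra_prefix))


def filtered_env(
    env: Mapping[str, str],
    extra_exact: Iterable[str] = (),
    extra_prefix: Iterable[str] = (),
) -> dict[str, str]:
    """Keep only the variables that pass the allowlist."""
    exact, prefix = list(extra_exact), list(extra_prefix)
    return {k: v for k, v in env.items() if is_allowed_with_extras(k, exact, prefix)}
```

With `extra_exact=["DOPPLER_TOKEN"]` and `extra_prefix=["OP_"]`, a variable such as `OP_SERVICE_ACCOUNT_TOKEN` would pass through to the engine subprocess while unrelated variables are still dropped.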

Bump rc2 → rc3 to publish a fresh staging wheel that includes #435.

Cumulative since rc1:

- #431 — Group 1A security hygiene (8 issues: #205, #206, #207, #208, #211, #213, #402, #403)
- #432 — #379 daily cost tracker race (threading.Lock guard)
- #433 — #378 voice_transcription_api_key SecretStr
- #435 — #409 user-extensible env allowlist + BWS_ACCESS_TOKEN default

No CHANGELOG entry — per release-discipline.md §"Staging / rc versions", entries batch into the stable bump.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

) (#437)

#377 fix:

- `TelegramTransportSettings` gains `allow_any_user: bool = False` (opt-in escape hatch) and `_validate_allowed_user_ids_or_optin` model_validator raising ValueError when `allowed_user_ids == []` and `allow_any_user is False`. Pre-v0.35.3 the empty default silently shipped open bots — this is the v0.35.3 promotion of the warning to a hard ConfigError.
- `TelegramBridgeConfig` and `update_from()` carry the new field through hot-reload; backend constructs with the value.
- `telegram/loop.py` drops the per-update `security.no_allowed_users` warning (validator now blocks startup) and emits `security.allow_any_user` INFO every boot when the opt-out is in effect.
- `config_migrations.py` `_migrate_legacy_telegram` relocates a top-level `allow_any_user` key into `[transports.telegram]` alongside `bot_token` / `chat_id` so legacy configs migrate cleanly.

CHANGELOG: backfilled `## v0.35.3 (unreleased)` with `### breaking`, `### changes`, `### fixes` subsections covering all 13 issues that shipped in rc1-rc4 (#205, #206, #207, #208, #211, #213, #377, #378, #379, #402, #403, #407, #409). Per release-discipline.md the section heading stays `(unreleased)` until the dev → master stable bump populates the date.

Docs sweep:

- `docs/how-to/security.md` — required-allowlist wording, dev/demo opt-out callout, env_extra_allow / env_extra_prefix_allow extension guide, sk-proj redaction note, voice-key SecretStr note.
- `docs/how-to/troubleshooting.md` — new top-of-page section for `allowed_user_ids is empty` startup error.
- `docs/how-to/group-chat.md` — required wording.
- `docs/how-to/operations.md` — `env_extra_allow` + `allow_any_user` added to hot-reloadable list.
- `docs/tutorials/install.md` — `allowed_user_ids` added to all three example configs (assistant / workspace / handoff).
- `docs/reference/config.md` — `allow_any_user` row added, `allowed_user_ids` flipped to required, AMP `dangerously_allow_all` default note flipped to `false`.
- `docs/reference/runners/amp/runner.md` — flag is now optional; `dangerously_allow_all = false` example.
- `docs/reference/env-vars.md` — `BWS_ACCESS_TOKEN` default mention, `[security] env_extra_*` extension subsection.

Test fixtures:

- ~30 test fixtures across `test_settings`, `test_cli_*`, `test_projects_config`, `test_telegram_backend`, `test_bridge_config_reload`, `test_config_watch`, `test_config_path_env`, `test_onboarding*`, `test_runtime_loader`, `test_settings_contract`, `test_exec_bridge` patched to add `allow_any_user = true` (or `"allow_any_user": True`) where the fixture exercises non-allowlist behaviour. Tests that specifically cover #377 use `populated allowlist` cases.

#377 tests: 4 new in `test_settings.py` covering block + opt-out + populated + both-set.

GitHub housekeeping (parallel to this commit, not in the diff):

- Closed #205, #206, #207, #208, #211, #213, #378, #379, #402, #403, #409 with implementation references. #377 closes via this PR's body.

Version: 0.35.3rc3 → 0.35.3rc4 (`pyproject.toml`, `uv.lock`).

Verification: 2436 tests pass / 2 skipped (~68s). Ruff check + format clean. uv lock --check in sync.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
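The allowlist-or-opt-in rule from the #377 fix can be sketched as below. The commit describes a pydantic `model_validator`; this illustration swaps in a stdlib dataclass with `__post_init__` so it stays self-contained, and the error wording is an assumption.

```python
from dataclasses import dataclass, field


@dataclass
class TelegramTransportSettings:
    """Stdlib stand-in for the real settings model (illustrative only)."""

    allowed_user_ids: list[int] = field(default_factory=list)
    allow_any_user: bool = False  # opt-in escape hatch for dev/demo bots

    def __post_init__(self) -> None:
        # An empty allowlist is only acceptable with an explicit opt-in.
        if not self.allowed_user_ids and not self.allow_any_user:
            raise ValueError(
                "allowed_user_ids is empty; add user IDs or set allow_any_user = true"
            )
```

The four cases the commit lists (block, opt-out, populated, both-set) then reduce to: the default construction raises, and the other three succeed.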

Replace the literal "Basic dXNlcjpwYXNz" string in test_malformed_bearer_header with a runtime-constructed header so GitHub's secret-scanner stops flagging it. The test still asserts verify_auth rejects Basic auth — Untether webhooks only accept Bearer + HMAC.

The corresponding GitHub secret-scanning alert is a true false positive (test fixture, not a real credential) and will be dismissed in the GitHub UI as "Used in tests / false positive".

Closes #404
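One way to build such a header at runtime, shown as a sketch: `make_basic_header` is a hypothetical helper name, and the actual test may construct the value differently.

```python
import base64


def make_basic_header(user: str = "user", password: str = "pass") -> str:
    """Build a Basic auth header value without a scannable literal in the source."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"
```

The encoded credential never appears as a string literal in the test file, so a pattern-based secret scanner has nothing to match.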

…-approve safety (#380) (#442)

The 2026-04-20 audit (§ASI02) flagged ``ControlRewindFilesRequest`` and ``ControlMcpMessageRequest`` as worth a deeper look because rewind could in principle undo state that drove a prior denial decision and MCP messages could carry tainted payloads from a compromised MCP server.

Audit verdict: both are safe to auto-approve under the current upstream Claude Code 2.1.x trust model.

- mcp_message: Untether is a transport pass-through; the message payload is opaque storage and is never inspected, executed, or rendered. A compromised MCP server is the inherent threat model of any MCP server, not specific to auto-approve. Routing this through Telegram approval would not block the payload.
- rewind_files: rewind is user-initiated upstream (the model cannot trigger it autonomously). Untether's per-session approval state (_PLAN_EXIT_APPROVED, _DISCUSS_APPROVED, _HANDLED_REQUESTS) is NOT mutated by rewind. Subsequent writes still pass through the standard ControlCanUseToolRequest gate.

No code change beyond:

1. Multi-paragraph safety-invariant comment in src/untether/runners/claude.py near _AUTO_APPROVE_TYPES, including the re-audit trigger (upstream semantic change to either subtype).
2. 3 regression-lock tests in tests/test_claude_control.py::TestAutoApproveSafetyInvariant that fail loudly if the auto-approve path starts inspecting payloads or coupling to per-session approval state.
3. Audit memo at docs/audits/2026-04-27-380-auto-approve-scope-review.md.

Closes #380

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… (#440)

The chat-level message-routing command (`all` / `mentions` / `clear`) shared a name with the unrelated webhook/cron triggers system, which became increasingly confusing as `/config` grew separate trigger pages.

User-visible changes:

- New `/listen` command (`all`/`mentions`/`clear`) replaces `/trigger`
- `/trigger` continues to work as a deprecated alias for one release cycle and prepends a one-line deprecation notice
- `/config → 📡 Listen` page replaces `📡 Trigger`
- Home page summary renders `Listen: all` instead of `Trigger: all`
- Bot command menu lists `listen` instead of `trigger`

Internal renames:

- `telegram/trigger_mode.py` → `telegram/listen_mode.py`
- `commands/trigger.py` → `commands/listen.py`
- Type `TriggerMode` → `ListenMode`
- Function `resolve_trigger_mode` → `resolve_listen_mode`
- ChatPrefsStore / TopicStateStore: new `*_listen_mode` methods; legacy `*_trigger_mode` methods preserved as one-release aliases

Storage: msgspec field is still named `trigger_mode` for backward compat with existing `telegram_chat_prefs_state.json` / `telegram_topics_state.json` files. No migration is needed.

Tests: full suite passes (2438 passed, 2 skipped). Two new tests in test_telegram_agent_trigger_commands.py cover the deprecation prefix and clean `/listen` output. test_config_command toast expectations updated to "Listen: ...".

Closes #297

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Adds a global pause control for the trigger system (crons + webhooks) accessible via /config in Telegram.

During pause:

- Cron scheduler skips its tick — run_once crons are NOT consumed and fire on the next matching tick after resume
- Webhook server returns 503 (with Retry-After: 60) instead of dispatching, so external monitors can distinguish paused-but-up from healthy. Returns 404 for unknown paths as before
- /health endpoint surfaces {"status":"paused","paused":true}

Pause is in-memory only — restart auto-resumes. This is the safe default per the issue's recommendation, and mirrors /at scheduler behaviour.

UI:

- New /config home-page row "⏸ Pause triggers" / "▶️ Resume triggers" appears only when triggers are configured
- New dedicated "📡 Triggers" page (config:tg) showing state + counts with Pause/Resume button; gracefully handles no-trigger-manager and zero-config cases
- /ping shows "⏸ triggers paused: … (suspended)" indicator while paused

Tests: 15 new tests across test_trigger_manager.py (8 pause toggle behaviours including 503 webhook check), test_ping_command.py (2 paused/resumed indicators), and test_config_command.py (5 TestTriggersPage covering unavailable/empty/pause/resume/toast). Full suite: 2445 passed, 2 skipped.

Closes #294

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
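The pause semantics above can be illustrated with a small sketch. `TriggerState`, `webhook_response`, and `health_body` are hypothetical names; the 202 success status and the healthy `/health` body are assumptions, while the 404, 503, `Retry-After: 60`, and paused body come from the commit message.

```python
from dataclasses import dataclass


@dataclass
class TriggerState:
    paused: bool = False  # in-memory only, so a restart resumes automatically


def webhook_response(
    state: TriggerState, path: str, known_paths: set[str]
) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for an incoming webhook request."""
    if path not in known_paths:
        return 404, {}  # unknown paths behave the same whether paused or not
    if state.paused:
        return 503, {"Retry-After": "60"}  # paused-but-up, not dispatched
    return 202, {}  # assumed success status for a dispatched webhook


def health_body(state: TriggerState) -> dict[str, object]:
    if state.paused:
        return {"status": "paused", "paused": True}
    return {"status": "ok", "paused": False}  # assumed healthy shape
```

Checking the path before the pause flag is what lets an external monitor tell a paused but reachable endpoint (503) apart from a misconfigured URL (404).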

…fication (#438) (#443)

Adds [watchdog] claude_stream_idle_timeout_ms (default 300_000 ms, range 30 s – 30 min) so deployments hitting upstream Anthropic API stalls on long opus 4.7 1M plan-mode generations can raise the watchdog without forking the codebase. Untether's Claude runner reads the value via setdefault — shell-set CLAUDE_STREAM_IDLE_TIMEOUT_MS still wins. Settings load failure falls back to the hardcoded 300_000 default with a debug log entry.

Type-A vs Type-B classification on the failure message:

- Type A — mid-generation stall (num_turns >= 1 && duration_api_ms > 0). Often legitimate long opus reasoning that exceeded the watchdog. Inline hint suggests raising the new config knob.
- Type B — cold-start zero-byte stall (num_turns <= 1 && duration_api_ms == 0). Upstream API outage — raising the timeout will NOT help. Inline message says so explicitly.

Auto-retry on Stream idle timeout deferred to v0.35.4 pending upstream Anthropic stabilisation (8 duplicate api:anthropic issues filed 2026-04-17→26 across macOS/Windows/web/WSL).

Tests: 5 new tests in test_claude_runner.py. Full suite 2460 passed, 2 skipped. Lint clean.

Closes #438

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
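The two conditions and the `setdefault` precedence described above can be expressed as a short sketch. The function names and the `"unknown"` fallback are hypothetical; only the conditions, the environment variable name, and the 300_000 default come from the commit message.

```python
def classify_idle_timeout(num_turns: int, duration_api_ms: int) -> str:
    """Classify a stream idle timeout as Type A, Type B, or unknown."""
    if num_turns >= 1 and duration_api_ms > 0:
        return "A"  # mid-generation stall: raising the timeout may help
    if num_turns <= 1 and duration_api_ms == 0:
        return "B"  # cold-start zero-byte stall: raising the timeout will not help
    return "unknown"  # assumed fallback for combinations the commit does not name


def apply_idle_timeout(env: dict[str, str], configured_ms: int = 300_000) -> None:
    """Set the timeout only if the shell has not already provided one."""
    env.setdefault("CLAUDE_STREAM_IDLE_TIMEOUT_MS", str(configured_ms))
```

The two named cases cannot overlap because they differ on `duration_api_ms`, so the order of the checks does not matter.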

…410) (#444)

Promotes claude_usage.schema_mismatch from one-shot per-process to per-call counter so the issue-watcher catches ongoing API-shape drift instead of just the first hit. Structured event carries a cumulative `count` field; new runner_bridge.get_usage_schema_mismatch_count() exposes the counter for the debug page.

UsageCacheStats added to utils/usage_cache.py tracking last successful fetch wall time, cache age, last-error class+message; populated on every fetch path including stale-while-error fallbacks.

_read_token_expiry_ms() added to telegram/commands/usage.py so the OAuth token expiry can be surfaced without raising on missing credentials (best-effort: returns None on any read failure).

/usage debug appends a 🔧 debug block (HTML) showing:

- last successful fetch (UTC ISO + age + fresh/stale label)
- last error (class + message, 120-char truncated)
- OAuth token expiry (with hh/mm remaining)
- cumulative schema-mismatch counter

Operator-facing signal so the next time the subscription footer goes silent, the root cause is visible without grepping journalctl.

Tests: 5 new in test_usage_cache.py::TestCacheStatsObservability; 1 in test_command_engine_gates.py::TestUsageDebugMode; existing test_schema_mismatch_warning_fires_once repurposed to assert per-call firing with cumulative counts. Full suite: 2465 passed, 2 skipped.

Closes #410

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…n + last-fired history + /stats breakdown (#271) (#445)

Tier 2: `/config → ⏰ Triggers` now lists every cron and webhook configured for the current chat. Crons render as `id · describe_cron(...) · proj · eng · last X` and webhooks as `id · path · auth · proj · eng · last X`. Lists are scoped via `crons_for_chat`/`webhooks_for_chat` with the bridge default_chat_id fallback, capped at 10 entries with an overflow marker, and omitted when the chat has no triggers (pause/resume controls remain regardless).

Tier 3: new `triggers/history.py` JSON store at `<config_path>.with_name("triggers_history.json")`. Records `time.time()` after every successful cron dispatch (cron.py:130) and webhook dispatch (dispatcher.py:dispatch_webhook + dispatch_action). Recording is best-effort — OSError writes log `triggers.history.write_failed` and swallow.

`/stats` appends `(N triggered, M manual)` per engine line and on the totals row when at least one count > 0. `DayBucket`/`AggregatedStats` carry additive `triggered_count`/`manual_count` with `.get(..., 0)` fallbacks so existing stats.json files load cleanly. `runner_bridge.handle_message` resolves the split via `triggered=bool(context and context.trigger_source)`.

28 new tests: 10 in test_triggers_history.py (round-trip, corrupt JSON, version mismatch, persistence), 7 in test_session_stats.py (triggered/manual split, back-compat with old format), 3 in test_stats_command.py (breakdown present/omitted/totals), 7 in test_config_command.py::TestTriggersPagePerChat (crons listed, webhooks listed, chat filtering, default_chat_id fallback, last-fired rendering, overflow cap), 2 in test_trigger_cron.py (cron firing records last_fired + history failure resilience), 2 in test_trigger_dispatcher.py (webhook records last_fired + history failure resilience). Full suite: 2496 passed, coverage 82.18%.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
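The back-compat loading and the conditional `/stats` suffix described above might be sketched like this. `load_day_bucket`, `breakdown_suffix`, and the `run_count` key are hypothetical; only `triggered_count`, `manual_count`, the `.get(..., 0)` fallback, and the `(N triggered, M manual)` wording come from the commit message.

```python
def load_day_bucket(raw: dict) -> dict:
    """Load one stats bucket, tolerating files written before the split existed."""
    return {
        "run_count": raw.get("run_count", 0),  # hypothetical pre-existing field
        "triggered_count": raw.get("triggered_count", 0),
        "manual_count": raw.get("manual_count", 0),
    }


def breakdown_suffix(triggered: int, manual: int) -> str:
    """Render the breakdown only when at least one count is positive."""
    if triggered > 0 or manual > 0:
        return f" ({triggered} triggered, {manual} manual)"
    return ""
```

An old-format bucket therefore loads with both counts at zero and renders without any suffix, so existing `/stats` output is unchanged until new runs are recorded.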

…) (#446)

After a Claude bidirectional session emits `result`, the CLI keeps stdin open so multi-turn sessions don't re-spawn. In practice this leaves a 400 MB RSS subprocess + ~200 TCP sockets idling for 30+ minutes between prompts, and from the user's perspective the session looks "stuck" — final message rendered, no further indication of state.

Option D hybrid:

- New `[watchdog].post_result_idle_enabled = true` (kill switch) and `[watchdog].post_result_idle_timeout = 600.0` (30s–1h) in settings.
- `ClaudeStreamState.result_received_at` armed by `translate_claude_event` on every `StreamResultMessage` (re-armed per turn so multi-turn works).
- New `ClaudeRunner._post_result_idle_watchdog` task runs in the existing `run_impl` task group when `use_control_channel` is True. Polls the timer; when the deadline passes, calls `this_proc_stdin.aclose()` (same mechanism as the normal-flow exit at line 2412, just earlier). CLI hits stdin EOF and exits gracefully (rc=0).
- Auto-continue safety: the existing `_should_auto_continue` gate excludes `last_event_type == "result"` (locked by `test_skips_result_event_type` in test_exec_bridge.py), so the clean rc=0 exit will not phantom-resume the session.
- Approval-state guard: if `_REQUEST_TO_SESSION` or `_PENDING_ASK_REQUESTS` has live entries for this session, defer the close (re-arm the timer) to avoid orphaning a button-click control_response in flight.

UX hint #1: a supplementary `StartedEvent` with `meta={"complete": "✓ turn complete"}` is emitted alongside `CompletedEvent` on successful results (the supported pattern for late-arriving meta per runner-development.md). `markdown.format_meta_line` renders it in the footer so the user sees the turn boundary immediately. Errored results don't get the hint (no false "complete" tag on a failure).

Two structlog events for ops:

- `claude.post_result_idle.deferred` — approval guard suppressed close
- `claude.post_result_idle.closing_stdin` — deadline passed, stdin closed

7 new tests in test_claude_runner.py: result-event arms timer, emits turn-complete meta, skips meta on error, watchdog fires when clean, watchdog defers when pending approval, format_meta_line renders the hint when present and omits it when absent. Full suite: 2503 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
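The watchdog's per-poll decision can be reduced to a pure function, sketched here under assumptions: `watchdog_action` and its return labels are hypothetical, and the real task works with `ClaudeStreamState` and the session registries rather than plain arguments.

```python
from typing import Optional


def watchdog_action(
    now: float,
    result_received_at: Optional[float],
    timeout: float,
    has_pending_approval: bool,
) -> str:
    """Decide what the post-result idle watchdog should do on one poll."""
    if result_received_at is None:
        return "wait"  # no result seen yet, so the timer is not armed
    if now - result_received_at < timeout:
        return "wait"  # deadline not reached
    if has_pending_approval:
        return "defer"  # re-arm instead of orphaning an in-flight approval
    return "close_stdin"  # CLI sees stdin EOF and exits cleanly
```

Keeping the decision separate from the polling loop makes the approval guard easy to test without spawning a subprocess.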

…#447)

Closes #269. The four settings groups in the issue had different states:

- [footer]: already loads fresh per-message via _load_footer_settings (no work)
- [cost]: already loads fresh per-call inside _check_cost_budget (no work)
- [watchdog]: already loads fresh per-run via _load_watchdog_settings at the top of handle_message (no work — verified, applies on next run)
- [progress]: was baked in at startup via MarkdownFormatter constructor + ExecBridgeConfig.min_render_interval — this PR closes that gap

Changes:

- markdown.py: new MarkdownFormatter.refresh_from(progress_settings) updates max_actions + verbosity from a fresh ProgressSettings snapshot. Tolerates missing/invalid attributes (clamps negative max_actions to 0; ignores unknown verbosity values).
- telegram/bridge.py: new TelegramPresenter.refresh_progress_settings() delegates to formatter.refresh_from.
- runner_bridge.py: new _load_progress_settings() sibling of _load_footer_settings / _load_watchdog_settings; handle_message reads it fresh per-run, calls cfg.presenter.refresh_progress_settings(...) via duck-typed getattr (Presenter is a Protocol, so we don't add to it), and threads progress_cfg.min_render_interval into each ProgressEdits instance instead of the startup snapshot. Per-chat /verbose overrides downstream of _resolve_presenter reconstruct from the refreshed defaults.

Out of scope (entry-point limitation): engine + command registration still require pipx upgrade / restart. Documented on the issue.

8 new tests in tests/test_meta_line.py: TestMarkdownFormatterRefresh covers max_actions update, verbosity update, negative clamp, invalid-verbosity rejection, missing-attribute tolerance, presenter delegation. Plus _load_progress_settings defaults / error-fallback. Full suite: 2511 passed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
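A rough sketch of the tolerant refresh described above. The class is a stand-in for `MarkdownFormatter`, and the set of valid verbosity values and the constructor defaults are assumptions; only the clamping and ignore-unknown behaviour come from the commit message.

```python
from types import SimpleNamespace

_VALID_VERBOSITY = frozenset({"compact", "verbose"})  # assumed value set


class FormatterSketch:
    """Illustrative stand-in for MarkdownFormatter."""

    def __init__(self, max_actions: int = 5, verbosity: str = "compact") -> None:
        self.max_actions = max_actions
        self.verbosity = verbosity

    def refresh_from(self, progress_settings: object) -> None:
        """Update from a fresh settings snapshot, tolerating bad or missing fields."""
        max_actions = getattr(progress_settings, "max_actions", None)
        if isinstance(max_actions, int) and not isinstance(max_actions, bool):
            self.max_actions = max(0, max_actions)  # clamp negatives to 0
        verbosity = getattr(progress_settings, "verbosity", None)
        if verbosity in _VALID_VERBOSITY:
            self.verbosity = verbosity  # unknown values are ignored
```

For example, refreshing from `SimpleNamespace(max_actions=-3, verbosity="loud")` would leave verbosity untouched and set `max_actions` to 0.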

All 9 v0.35.3 Group 2 issues now landed on dev:

- #404 — secret-scanning alert (PR #439)
- #297 — /trigger → /listen rename + alias (PR #440)
- #294 — master trigger pause/resume toggle (PR #441)
- #380 — auto-approve scope review (PR #442)
- #438 — claude_stream_idle_timeout_ms + Type-A/B classification (PR #443)
- #410 — subscription usage observability + /usage debug (PR #444)
- #271 — trigger visibility Tier 2 + Tier 3 (PR #445)
- #333 — Claude post-result idle timeout + ✓ turn complete UX hint (PR #446)
- #269 — hot-reload [progress] settings (PR #447)

Bumps to TestPyPI for staging via @hetz_lba1_bot once integration tests U1-U7 pass against @untether_dev_bot.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Two small docs-only updates:

- .claude/rules/release-discipline.md — add "Pre-1.0 minor judgement" paragraph clarifying that opt-in additive features behind config flags may ship in patches if they preserve all current defaults. Codifies precedent set by [security] env_extra_allow (#409 in v0.35.3). Removes ambiguity for upcoming v0.35.8 (Pi RPC, #170) decision.
- README.md — fix engine compatibility matrix: Pi "Model in footer" cell changes from "—" to "✅⁷" with footnote explaining Pi populates the footer model from a supplementary StartedEvent carrying the model name extracted from message_end (#225). Code at src/untether/runners/pi.py:290-307 has implemented this since #225 landed; matrix was stale.

Related: docs/plans/2026-05-02-pi-engine-enhancements.md (local plan, gitignored) — covers the broader Pi enhancement roadmap that motivated both fixes. Issues created: #460-#468 + promotions of #170 → v0.35.8 and #180 → v0.35.9.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

coderabbitai · 2026-05-03T05:42:41Z

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: ad69bb2d-6469-4511-9081-202522ec07bf


- docs/reference/dev-instance.md: add Engine auth & defaults section (matrix of CLI ver, auth method, default provider/model, cred location for Claude/Codex/Gemini/OpenCode/Pi/AMP), smoke verification commands, per-engine notes (OpenCode google-whitelist=[], Pi Kimi K2.6 zero-cost, Gemini trusted folders + #471, AMP mode-not-model routing). Also document the three special-purpose instances (demo, dev-hf, dev-ws) and the lockfile PID-reuse race
- docs/reference/integration-testing.md: cross-reference the new auth section so test runs check current state before tier 1
- CLAUDE.md, .claude/rules/dev-workflow.md: extend 2-instance section to the full 5-instance topology, clarifying that demo/dev-hf/dev-ws share the editable .venv but are not on the release path
- .claude/hooks.json: dev-workflow-guard now recognises all 4 editable-source services (untether-dev, untether-dev-hf, untether-dev-ws, untether-demo) and only blocks restarts of staging

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Nathan Schram (nathanschram) and others added 20 commits April 22, 2026 16:06

Base automatically changed from dev to master May 26, 2026 06:20

Nathan Schram (nathanschram) wants to merge 21 commits into master from feature/pi-roadmap-and-doc-fixes
