docs: link Untether blog posts in README footer #534
Nathan Schram (nathanschram) wants to merge 39 commits into
Enables `[claude] extra_args = ["--chrome"]` so Untether-spawned Claude Code sessions can opt into the Claude-in-Chrome extension — previously the `mcp__claude-in-chrome__*` tool namespace was absent from Untether sessions because Claude Code 2.1.x gates it behind `--chrome` / `CLAUDE_CODE_ENABLE_CFC=1`, and Untether never passed the flag. Mirrors `codex.extra_args` and `pi.extra_args`. Flags Untether manages internally (`-p`, `--print`, `--output-format`, `--input-format`, `--resume`/`-r`, `--continue`/`-c`, `--permission-mode`, `--permission-prompt-tool`) are rejected at config-load with a `ConfigError` so duplicate-argv surprises fail fast. User args land on argv after the managed stream-json prelude and before resume / model / effort / allowed-tools / permission flags, preserving the trailing `-p <prompt>` (or stdin prompt under permission-mode) position. - src/untether/runners/claude.py: add `extra_args` field, thread through `build_args`, parse + validate in `build_runner` - tests/test_build_args.py: +8 tests (argv ordering, permission-mode argv, multi-flag order, build_runner parsing, reserved-flag rejection for individual flags and `key=value` prefixes) - docs/reference/config.md, docs/reference/runners/claude/runner.md: document the new key, including reserved-flag list - CHANGELOG.md: v0.35.3 (unreleased) entry Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
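The reserved-flag check described above can be sketched roughly as follows. This is a minimal illustration, not the code in `src/untether/runners/claude.py`: the `validate_extra_args` helper and the local `ConfigError` class are hypothetical stand-ins, and only the flag list is taken from the commit message. The sketch treats every entry as a flag, so a reserved string passed as another flag's value would also be rejected.

```python
# Flags the commit message lists as managed by Untether itself.
RESERVED_FLAGS = frozenset({
    "-p", "--print", "--output-format", "--input-format",
    "--resume", "-r", "--continue", "-c",
    "--permission-mode", "--permission-prompt-tool",
})


class ConfigError(ValueError):
    """Hypothetical stand-in for Untether's ConfigError."""


def validate_extra_args(extra_args):
    """Reject reserved flags, including the `--flag=value` spelling."""
    for arg in extra_args:
        flag = arg.split("=", 1)[0]  # "--resume=abc" -> "--resume"
        if flag in RESERVED_FLAGS:
            raise ConfigError(
                f"[claude] extra_args: {flag!r} is managed by Untether"
            )
    return list(extra_args)
```

With this shape, `validate_extra_args(["--chrome"])` passes through unchanged, while `["--resume=abc"]` fails at config-load time rather than producing a duplicate flag on argv.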
* chore: staging 0.35.3rc1 Stage Claude extra_args (#407) for TestPyPI. This rc1 is the wheel the Mac Untether instance will install to validate Claude-in-Chrome end-to-end per docs/audits/2026-04-21-claude-in-chrome-test-plan.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * deps: bump lxml 6.0.2→6.1.0 and python-dotenv 1.2.1→1.2.2 pip-audit flagged two new transitive CVEs after PR #408 merged: - lxml 6.0.2: CVE-2026-41066 (fix 6.1.0) — pulled via sulguk - python-dotenv 1.2.1: CVE-2026-28684 (fix 1.2.2) — pulled via pydantic-settings Both have clean fixes. Lockfile-only change; pyproject.toml constraints unchanged. Local pip-audit clean after bump. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(security): Group 1A hygiene — 8 issues Bundles eight low-risk security hygiene fixes for v0.35.3: - #205 — split runner.start log so prompt content stays at DEBUG - #206 — flip AMP dangerously_allow_all default to False (opt-in only) - #207 — Pi session dir created with mode 0o700 + chmod existing - #208 — extend stderr sanitisation to /Users, /private/var, /tmp, /var, /opt, /srv, /etc, /usr/local, /app, /workspace, /root - #211 — replace stat()+read_bytes() with capped streaming read in anyio worker thread; closes TOCTOU window on /file get - #213 — add OPENAI_PROJECT_KEY_RE for sk-proj-... redaction (the underscore/hyphen char set is not covered by the generic sk- pattern) - #402 — bump Pygments 2.19.2 → 2.20.0 via uv lock (CVE-2026-4539 ReDoS, transitive) - #403 — replace 123456789:ABCdef… placeholder bot tokens with <BOT_ID>:<BOT_TOKEN> in non-test paths (onboarding.py, install.md, llms-full.txt); test fixtures kept as-is for GitHub-UI dismissal All 2410 tests pass; ruff check + format clean; uv lock --check ok. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ci: silence bandit B108 false positive + ignore CVE-2026-3219 - bandit B108 fires on the new /tmp/ regex pattern in _PATH_PATTERNS at runner.py — regex for stderr redaction, not a hardcoded temp-file write. Suppressed with `# nosec B108` matching the existing render.py:111 pattern. - pip-audit now flags pip 26.0.1 → CVE-2026-3219 (advisory published recently; no fix available upstream). Added to the --ignore-vuln list alongside CVE-2026-4539 (pygments — kept for posterity even though #402 lockfile bump fixed it). No source/test code changes. CI-only. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) `_daily_cost` is a module-level tuple updated via read-modify-write in record_run_cost(). Concurrent finalize_run callers could both read (today, X), both write (today, X + cost), and lose one run's cost — letting a malicious or runaway concurrent workload defeat the per-day budget gate. Fix: wrap the RMW block in a `threading.Lock`. Critical section is a single tuple assignment (sub-microsecond), so the lock is fine under both async (cooperative) and threaded callers without an async-signature ripple. get_daily_cost() also acquires the lock for snapshot consistency. Trade-off note: kept the function sync rather than pivoting to `anyio.Lock` because that would require updating the 6 sync test call sites and the 1 sync caller in runner_bridge.py — needless churn for a sub-microsecond critical section. Test: new ThreadPoolExecutor-driven fuzz test (16 workers, 200 calls) asserts the observed total equals n * unit_cost — would fail under racing RMW. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
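A minimal sketch of the locking pattern described above, assuming a `(day, total)` tuple layout for `_daily_cost`. The day-rollover handling and the `run_concurrently` helper are assumptions made for illustration; only the names `record_run_cost`, `get_daily_cost`, and the use of `threading.Lock` come from the commit message.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from datetime import date

# Module-level (day, total) tuple, guarded by a plain threading.Lock.
_daily_cost = (date.today().isoformat(), 0.0)
_daily_cost_lock = threading.Lock()


def record_run_cost(cost):
    """Add one run's cost; the read-modify-write happens under the lock."""
    global _daily_cost
    today = date.today().isoformat()
    with _daily_cost_lock:
        day, total = _daily_cost
        if day != today:  # assumed rollover behaviour: new day resets the total
            total = 0.0
        _daily_cost = (today, total + cost)


def get_daily_cost():
    """Take a consistent snapshot of today's total."""
    with _daily_cost_lock:
        day, total = _daily_cost
    return total if day == date.today().isoformat() else 0.0


def run_concurrently(n_calls, unit_cost, workers=16):
    """Hypothetical fuzz helper: hammer record_run_cost from a thread pool."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(record_run_cost, [unit_cost] * n_calls))
    return get_daily_cost()
```

Because the critical section is a single tuple read and assignment, a synchronous lock is held only briefly and async callers are not blocked for any meaningful time.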
Brings the voice transcription API key into parity with `bot_token` (closed #196): SecretStr masks the value in repr()/str()/tracebacks and any accidental structlog serialisation. Access the raw value via `.get_secret_value()` at the transport boundary. Changes: - `settings.py`: field type `NonEmptyStr | None` → `SecretStr | None`; new `_validate_voice_key_not_empty` validator preserves the prior no-empty-string contract by round-tripping `""`/whitespace to None - `telegram/bridge.py`: `TelegramBridgeConfig.voice_transcription_api_key` annotation → `SecretStr | None`; `update_from()` unchanged (assigns SecretStr to SecretStr) - `telegram/loop.py:2208`: sole unwrap point — call `.get_secret_value()` only when non-None before passing to `transcribe_voice` (OpenAI SDK still wants raw `str | None`) - `telegram/voice.py`: unchanged; boundary stays at the loop caller Tests: - `test_settings.py`: new `test_voice_transcription_api_key_is_secret_str` (round-trip + repr/str masking), `_empty_string_normalised_to_none` (whitespace → None), `_default_none` (omitted → None) - `test_bridge_config_reload.py`: hot-reload tests updated to use `.get_secret_value()` for value comparison - `test_telegram_backend.py`: updated build_and_run assertion All 2413 tests pass; ruff check + format clean. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bump rc1 → rc2 to publish a fresh staging wheel that includes:
- #431: Group 1A security hygiene (8 issues: #205, #206, #207, #208, #211, #213, #402, #403)
- #432: #379 daily cost tracker race (threading.Lock guard)
- #433: #378 voice_transcription_api_key SecretStr
rc1 (b6c6ad6) only carried #407 (Claude extra_args). rc2 supersedes it on TestPyPI.
No CHANGELOG entry: per release-discipline.md §"Staging / rc versions", entries batch into the stable bump.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ult (#409) (#435) Self-installed Untether users in heterogeneous environments need to thread credential-manager tokens (1Password, Doppler, Vault, Infisical, …) into engine subprocesses. Today the env allowlist is hard-coded in `utils/env_policy.py` so adding a single var requires a fork + release. Changes: - `utils/env_policy.py`: - new `is_allowed_with_extras(name, extra_exact=, extra_prefix=)` - `filtered_env()` extended with `extra_prefix=` parameter - new `log_user_extensions_once()` — module-level latch emits one `env_policy.user_extension` INFO per process when user extras are active, so the operator sees the addition in journalctl - `settings.py` `SecuritySettings`: - `env_extra_allow: list[str]` (default `[]`) - `env_extra_prefix_allow: list[str]` (default `[]`) - field validators reject empty/whitespace and enforce `[A-Z_][A-Z0-9_]*` - `runners/claude.py`, `runners/pi.py`: - new `_load_env_extras()` helper (best-effort settings load — never blocks a run on a config error, mirrors the env_audit pattern) - threads extras through `filtered_env()` + `log_user_extensions_once()` - `utils/env_audit.py` `audit_proc_env()`: - new `user_extra_exact=`/`user_extra_prefix=` params so user-allowed names aren't false-flagged as `claude.env_audit.leaked_var` - Built-in defaults: `BWS_ACCESS_TOKEN` promoted into `_EXACT_ALLOW` (Bitwarden Secrets Manager — common enough to ship as a default). - Docs: `docs/reference/config.md` `[security]` table, CLAUDE.md features list. Tests: +19 across `tests/test_env_policy.py` (8 user-extension cases + log latch), `tests/test_env_audit.py` (4 user-extras cases), and `tests/test_settings.py` (7 round-trip + validator cases). `uv run pytest` → 2432 passed, 2 skipped; ruff clean. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
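The allowlist extension can be pictured with a sketch like the one below. The built-in sets shown here are an illustrative subset (the real contents of `utils/env_policy.py` are not given in this message, apart from `BWS_ACCESS_TOKEN`), and `validate_env_name` is a hypothetical name for the settings validator; the `extra_exact=` / `extra_prefix=` keywords and the `[A-Z_][A-Z0-9_]*` pattern are taken from the description above.

```python
import re

# Illustrative built-ins only; the real allowlist is longer.
_EXACT_ALLOW = frozenset({"PATH", "HOME", "BWS_ACCESS_TOKEN"})
_PREFIX_ALLOW = ("LC_",)
_NAME_RE = re.compile(r"[A-Z_][A-Z0-9_]*")


def validate_env_name(name):
    """Hypothetical validator: reject empty, whitespace, or lowercase names."""
    if not _NAME_RE.fullmatch(name):
        raise ValueError(f"invalid environment variable name: {name!r}")
    return name


def is_allowed_with_extras(name, extra_exact=(), extra_prefix=()):
    """Built-in policy first, then the user-supplied extras."""
    if name in _EXACT_ALLOW or name in extra_exact:
        return True
    return name.startswith(_PREFIX_ALLOW + tuple(extra_prefix))


def filtered_env(env, extra_exact=(), extra_prefix=()):
    """Keep only variables that the combined policy allows."""
    return {
        key: value
        for key, value in env.items()
        if is_allowed_with_extras(
            key, extra_exact=extra_exact, extra_prefix=extra_prefix
        )
    }
```

For example, passing `extra_prefix=["DOPPLER_"]` would let every `DOPPLER_*` variable through to the engine subprocess while unrelated secrets stay filtered.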
Bump rc2 → rc3 to publish a fresh staging wheel that includes #435.
Cumulative since rc1:
- #431: Group 1A security hygiene (8 issues: #205, #206, #207, #208, #211, #213, #402, #403)
- #432: #379 daily cost tracker race (threading.Lock guard)
- #433: #378 voice_transcription_api_key SecretStr
- #435: #409 user-extensible env allowlist + BWS_ACCESS_TOKEN default
No CHANGELOG entry: per release-discipline.md §"Staging / rc versions", entries batch into the stable bump.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
) (#437) #377 fix: - `TelegramTransportSettings` gains `allow_any_user: bool = False` (opt-in escape hatch) and `_validate_allowed_user_ids_or_optin` model_validator raising ValueError when `allowed_user_ids == []` and `allow_any_user is False`. Pre-v0.35.3 the empty default silently shipped open bots — this is the v0.35.3 promotion of the warning to a hard ConfigError. - `TelegramBridgeConfig` and `update_from()` carry the new field through hot-reload; backend constructs with the value. - `telegram/loop.py` drops the per-update `security.no_allowed_users` warning (validator now blocks startup) and emits `security.allow_any_user` INFO every boot when the opt-out is in effect. - `config_migrations.py` `_migrate_legacy_telegram` relocates a top-level `allow_any_user` key into `[transports.telegram]` alongside `bot_token` / `chat_id` so legacy configs migrate cleanly. CHANGELOG: backfilled `## v0.35.3 (unreleased)` with `### breaking`, `### changes`, `### fixes` subsections covering all 13 issues that shipped in rc1-rc4 (#205, #206, #207, #208, #211, #213, #377, #378, #379, #402, #403, #407, #409). Per release-discipline.md the section heading stays `(unreleased)` until the dev → master stable bump populates the date. Docs sweep: - `docs/how-to/security.md` — required-allowlist wording, dev/demo opt-out callout, env_extra_allow / env_extra_prefix_allow extension guide, sk-proj redaction note, voice-key SecretStr note. - `docs/how-to/troubleshooting.md` — new top-of-page section for `allowed_user_ids is empty` startup error. - `docs/how-to/group-chat.md` — required wording. - `docs/how-to/operations.md` — `env_extra_allow` + `allow_any_user` added to hot-reloadable list. - `docs/tutorials/install.md` — `allowed_user_ids` added to all three example configs (assistant / workspace / handoff). - `docs/reference/config.md` — `allow_any_user` row added, `allowed_user_ids` flipped to required, AMP `dangerously_allow_all` default note flipped to `false`. 
- `docs/reference/runners/amp/runner.md` — flag is now optional; `dangerously_allow_all = false` example. - `docs/reference/env-vars.md` — `BWS_ACCESS_TOKEN` default mention, `[security] env_extra_*` extension subsection. Test fixtures: - ~30 test fixtures across `test_settings`, `test_cli_*`, `test_projects_config`, `test_telegram_backend`, `test_bridge_config_reload`, `test_config_watch`, `test_config_path_env`, `test_onboarding*`, `test_runtime_loader`, `test_settings_contract`, `test_exec_bridge` patched to add `allow_any_user = true` (or `"allow_any_user": True`) where the fixture exercises non-allowlist behaviour. Tests that specifically cover #377 use `populated allowlist` cases. #377 tests: 4 new in `test_settings.py` covering block + opt-out + populated + both-set. GitHub housekeeping (parallel to this commit, not in the diff): - Closed #205, #206, #207, #208, #211, #213, #378, #379, #402, #403, #409 with implementation references. #377 closes via this PR's body. Version: 0.35.3rc3 → 0.35.3rc4 (`pyproject.toml`, `uv.lock`). Verification: 2436 tests pass / 2 skipped (~68s). Ruff check + format clean. uv lock --check in sync. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
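The #377 startup rule amounts to a small invariant, sketched below with a plain dataclass. The real `TelegramTransportSettings` is described as using a `model_validator`; the class name, the `__post_init__` hook, and the error wording here are assumptions chosen to keep the example dependency-free.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TelegramTransportSettingsSketch:
    """Illustrative stand-in; field names follow the commit message."""

    allowed_user_ids: List[int] = field(default_factory=list)
    allow_any_user: bool = False  # opt-in escape hatch for dev/demo bots

    def __post_init__(self):
        # An empty allowlist is only acceptable with the explicit opt-out.
        if not self.allowed_user_ids and not self.allow_any_user:
            raise ValueError(
                "allowed_user_ids is empty; list at least one user ID "
                "or set allow_any_user = true"
            )
```

The four cases the new tests are said to cover (block, opt-out, populated, both set) map directly onto constructing this class with the corresponding arguments.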
Replace the literal "Basic dXNlcjpwYXNz" string in test_malformed_bearer_header with a runtime-constructed header so GitHub's secret-scanner stops flagging it. The test still asserts verify_auth rejects Basic auth: Untether webhooks only accept Bearer + HMAC.
The corresponding GitHub secret-scanning alert is a true false positive (test fixture, not a real credential) and will be dismissed in the GitHub UI as "Used in tests / false positive".
Closes #404
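One way to build such a header at runtime is sketched below. The helper names are hypothetical and the `is_bearer` check is only a simplified illustration of scheme matching, not Untether's `verify_auth`.

```python
import base64


def make_basic_header(user, password):
    """Build a Basic auth header at runtime so no literal sits in the source."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode("ascii")
    return f"Basic {token}"


def is_bearer(header):
    """Simplified scheme check: accept only 'Bearer <non-empty token>'."""
    scheme, _, token = header.partition(" ")
    return scheme == "Bearer" and bool(token)
```

A test can then assert that `is_bearer(make_basic_header("user", "pass"))` is false without the encoded string ever appearing in the repository.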
…-approve safety (#380) (#442) The 2026-04-20 audit (§ASI02) flagged ``ControlRewindFilesRequest`` and ``ControlMcpMessageRequest`` as worth a deeper look because rewind could in principle undo state that drove a prior denial decision and MCP messages could carry tainted payloads from a compromised MCP server. Audit verdict: both are safe to auto-approve under the current upstream Claude Code 2.1.x trust model. - mcp_message: Untether is a transport pass-through; the message payload is opaque storage and is never inspected, executed, or rendered. A compromised MCP server is the inherent threat model of any MCP server, not specific to auto-approve. Routing this through Telegram approval would not block the payload. - rewind_files: rewind is user-initiated upstream (the model cannot trigger it autonomously). Untether's per-session approval state (_PLAN_EXIT_APPROVED, _DISCUSS_APPROVED, _HANDLED_REQUESTS) is NOT mutated by rewind. Subsequent writes still pass through the standard ControlCanUseToolRequest gate. No code change beyond: 1. Multi-paragraph safety-invariant comment in src/untether/runners/claude.py near _AUTO_APPROVE_TYPES, including the re-audit trigger (upstream semantic change to either subtype). 2. 3 regression-lock tests in tests/test_claude_control.py::TestAutoApproveSafetyInvariant that fail loudly if the auto-approve path starts inspecting payloads or coupling to per-session approval state. 3. Audit memo at docs/audits/2026-04-27-380-auto-approve-scope-review.md. Closes #380 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… (#440) The chat-level message-routing command (`all` / `mentions` / `clear`) shared a name with the unrelated webhook/cron triggers system, which became increasingly confusing as `/config` grew separate trigger pages. User-visible changes: - New `/listen` command (`all`/`mentions`/`clear`) replaces `/trigger` - `/trigger` continues to work as a deprecated alias for one release cycle and prepends a one-line deprecation notice - `/config → 📡 Listen` page replaces `📡 Trigger` - Home page summary renders `Listen: all` instead of `Trigger: all` - Bot command menu lists `listen` instead of `trigger` Internal renames: - `telegram/trigger_mode.py` → `telegram/listen_mode.py` - `commands/trigger.py` → `commands/listen.py` - Type `TriggerMode` → `ListenMode` - Function `resolve_trigger_mode` → `resolve_listen_mode` - ChatPrefsStore / TopicStateStore: new `*_listen_mode` methods; legacy `*_trigger_mode` methods preserved as one-release aliases Storage: msgspec field is still named `trigger_mode` for backward compat with existing `telegram_chat_prefs_state.json` / `telegram_topics_state.json` files. No migration is needed. Tests: full suite passes (2438 passed, 2 skipped). Two new tests in test_telegram_agent_trigger_commands.py cover the deprecation prefix and clean `/listen` output. test_config_command toast expectations updated to "Listen: ...". Closes #297 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a global pause control for the trigger system (crons + webhooks)
accessible via /config in Telegram. During pause:
- Cron scheduler skips its tick — run_once crons are NOT consumed and
fire on the next matching tick after resume
- Webhook server returns 503 (with Retry-After: 60) instead of
dispatching, so external monitors can distinguish paused-but-up from
healthy. Returns 404 for unknown paths as before
- /health endpoint surfaces {"status":"paused","paused":true}
Pause is in-memory only — restart auto-resumes. This is the safe
default per the issue's recommendation, and mirrors /at scheduler
behaviour.
UI:
- New /config home-page row "⏸ Pause triggers" / "▶️ Resume triggers"
appears only when triggers are configured
- New dedicated "📡 Triggers" page (config:tg) showing state + counts
with Pause/Resume button; gracefully handles no-trigger-manager
and zero-config cases
- /ping shows "⏸ triggers paused: … (suspended)" indicator while paused
Tests: 15 new tests across test_trigger_manager.py (8 pause toggle
behaviours including 503 webhook check), test_ping_command.py
(2 paused/resumed indicators), and test_config_command.py
(5 TestTriggersPage covering unavailable/empty/pause/resume/toast).
Full suite: 2445 passed, 2 skipped.
Closes #294
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
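The paused-state behaviour of the webhook and health endpoints can be summarised in a sketch like this. The function names, the success status code, and the healthy payload are assumptions; the 404, 503, `Retry-After: 60`, and paused payload follow the description above.

```python
def webhook_response(path, known_paths, paused):
    """Return (status, headers) for an incoming webhook request."""
    if path not in known_paths:
        return 404, {}  # unknown paths behave as before
    if paused:
        # Paused-but-up: external monitors can tell this apart from healthy.
        return 503, {"Retry-After": "60"}
    return 200, {}  # placeholder success status (assumption)


def health_payload(paused):
    """Body for /health; the non-paused shape is an assumption."""
    if paused:
        return {"status": "paused", "paused": True}
    return {"status": "ok", "paused": False}
```

Since the pause flag lives only in memory, a restart would simply construct the state with `paused=False` again.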
…fication (#438) (#443) Adds [watchdog] claude_stream_idle_timeout_ms (default 300_000 ms, range 30 s – 30 min) so deployments hitting upstream Anthropic API stalls on long opus 4.7 1M plan-mode generations can raise the watchdog without forking the codebase. Untether's Claude runner reads the value via setdefault — shell-set CLAUDE_STREAM_IDLE_TIMEOUT_MS still wins. Settings load failure falls back to the hardcoded 300_000 default with a debug log entry. Type-A vs Type-B classification on the failure message: - Type A — mid-generation stall (num_turns >= 1 && duration_api_ms > 0). Often legitimate long opus reasoning that exceeded the watchdog. Inline hint suggests raising the new config knob. - Type B — cold-start zero-byte stall (num_turns <= 1 && duration_api_ms == 0). Upstream API outage — raising the timeout will NOT help. Inline message says so explicitly. Auto-retry on Stream idle timeout deferred to v0.35.4 pending upstream Anthropic stabilisation (8 duplicate api:anthropic issues filed 2026-04-17→26 across macOS/Windows/web/WSL). Tests: 5 new tests in test_claude_runner.py. Full suite 2460 passed, 2 skipped. Lint clean. Closes #438 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
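The environment precedence and the Type-A/Type-B split could look roughly like the sketch below. Both function names are hypothetical, and the `"unknown"` fallback for inputs matching neither rule is an assumption; the conditions themselves are copied from the message.

```python
DEFAULT_IDLE_TIMEOUT_MS = 300_000


def apply_idle_timeout(env, configured_ms=DEFAULT_IDLE_TIMEOUT_MS):
    """setdefault keeps a shell-set CLAUDE_STREAM_IDLE_TIMEOUT_MS in charge."""
    env.setdefault("CLAUDE_STREAM_IDLE_TIMEOUT_MS", str(configured_ms))
    return env


def classify_stream_idle(num_turns, duration_api_ms):
    """Classify a stream idle timeout using the two rules quoted above."""
    if num_turns >= 1 and duration_api_ms > 0:
        return "A"  # mid-generation stall: raising the timeout may help
    if num_turns <= 1 and duration_api_ms == 0:
        return "B"  # cold-start zero-byte stall: raising it will not help
    return "unknown"
```

Checking Type A first matters when `num_turns == 1`, because both rules admit that turn count and only `duration_api_ms` separates them.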
…410) (#444) Promotes claude_usage.schema_mismatch from one-shot per-process to per-call counter so the issue-watcher catches ongoing API-shape drift instead of just the first hit. Structured event carries a cumulative `count` field; new runner_bridge.get_usage_schema_mismatch_count() exposes the counter for the debug page. UsageCacheStats added to utils/usage_cache.py tracking last successful fetch wall time, cache age, last-error class+message; populated on every fetch path including stale-while-error fallbacks. _read_token_expiry_ms() added to telegram/commands/usage.py so the OAuth token expiry can be surfaced without raising on missing credentials (best-effort: returns None on any read failure). /usage debug appends a 🔧 debug block (HTML) showing: - last successful fetch (UTC ISO + age + fresh/stale label) - last error (class + message, 120-char truncated) - OAuth token expiry (with hh/mm remaining) - cumulative schema-mismatch counter Operator-facing signal so the next time the subscription footer goes silent, the root cause is visible without grepping journalctl. Tests: 5 new in test_usage_cache.py::TestCacheStatsObservability; 1 in test_command_engine_gates.py::TestUsageDebugMode; existing test_schema_mismatch_warning_fires_once repurposed to assert per-call firing with cumulative counts. Full suite: 2465 passed, 2 skipped. Closes #410 Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…n + last-fired history + /stats breakdown (#271) (#445) Tier 2: `/config → ⏰ Triggers` now lists every cron and webhook configured for the current chat. Crons render as `id · describe_cron(...) · proj · eng · last X` and webhooks as `id · path · auth · proj · eng · last X`. Lists are scoped via `crons_for_chat`/`webhooks_for_chat` with the bridge default_chat_id fallback, capped at 10 entries with an overflow marker, and omitted when the chat has no triggers (pause/resume controls remain regardless). Tier 3: new `triggers/history.py` JSON store at `<config_path>.with_name("triggers_history.json")`. Records `time.time()` after every successful cron dispatch (cron.py:130) and webhook dispatch (dispatcher.py:dispatch_webhook + dispatch_action). Recording is best-effort — OSError writes log `triggers.history.write_failed` and swallow. `/stats` appends `(N triggered, M manual)` per engine line and on the totals row when at least one count > 0. `DayBucket`/`AggregatedStats` carry additive `triggered_count`/`manual_count` with `.get(..., 0)` fallbacks so existing stats.json files load cleanly. `runner_bridge.handle_message` resolves the split via `triggered=bool(context and context.trigger_source)`. 28 new tests: 10 in test_triggers_history.py (round-trip, corrupt JSON, version mismatch, persistence), 7 in test_session_stats.py (triggered/manual split, back-compat with old format), 3 in test_stats_command.py (breakdown present/omitted/totals), 7 in test_config_command.py::TestTriggersPagePerChat (crons listed, webhooks listed, chat filtering, default_chat_id fallback, last-fired rendering, overflow cap), 2 in test_trigger_cron.py (cron firing records last_fired + history failure resilience), 2 in test_trigger_dispatcher.py (webhook records last_fired + history failure resilience). Full suite: 2496 passed, coverage 82.18%. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
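A best-effort last-fired store in the spirit of Tier 3 might be sketched as follows. The flat `{trigger_id: timestamp}` layout is an assumption (the real file evidently carries a version field, which is omitted here), as are the function names; the sibling-file location via `with_name` and the swallow-on-`OSError` behaviour come from the message.

```python
import json
import time
from pathlib import Path


def history_path(config_path):
    """The history file sits next to the config file."""
    return Path(config_path).with_name("triggers_history.json")


def load_history(config_path):
    """Return the stored mapping, or {} if missing, unreadable, or corrupt."""
    try:
        data = json.loads(history_path(config_path).read_text())
    except (OSError, ValueError):
        return {}
    return data if isinstance(data, dict) else {}


def record_fired(config_path, trigger_id, now=None):
    """Best-effort write: report failure instead of raising on OSError."""
    data = load_history(config_path)
    data[trigger_id] = time.time() if now is None else now
    try:
        history_path(config_path).write_text(json.dumps(data))
    except OSError:
        return False  # the real code logs triggers.history.write_failed here
    return True
```

Returning a boolean rather than raising keeps a failed history write from affecting the dispatch that triggered it.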
…) (#446) After a Claude bidirectional session emits `result`, the CLI keeps stdin open so multi-turn sessions don't re-spawn. In practice this leaves a 400 MB RSS subprocess + ~200 TCP sockets idling for 30+ minutes between prompts, and from the user's perspective the session looks "stuck" — final message rendered, no further indication of state. Option D hybrid: - New `[watchdog].post_result_idle_enabled = true` (kill switch) and `[watchdog].post_result_idle_timeout = 600.0` (30s–1h) in settings. - `ClaudeStreamState.result_received_at` armed by `translate_claude_event` on every `StreamResultMessage` (re-armed per turn so multi-turn works). - New `ClaudeRunner._post_result_idle_watchdog` task runs in the existing `run_impl` task group when `use_control_channel` is True. Polls the timer; when the deadline passes, calls `this_proc_stdin.aclose()` (same mechanism as the normal-flow exit at line 2412, just earlier). CLI hits stdin EOF and exits gracefully (rc=0). - Auto-continue safety: the existing `_should_auto_continue` gate excludes `last_event_type == "result"` (locked by `test_skips_result_event_type` in test_exec_bridge.py), so the clean rc=0 exit will not phantom-resume the session. - Approval-state guard: if `_REQUEST_TO_SESSION` or `_PENDING_ASK_REQUESTS` has live entries for this session, defer the close (re-arm the timer) to avoid orphaning a button-click control_response in flight. UX hint #1: a supplementary `StartedEvent` with `meta={"complete": "✓ turn complete"}` is emitted alongside `CompletedEvent` on successful results (the supported pattern for late-arriving meta per runner-development.md). `markdown.format_meta_line` renders it in the footer so the user sees the turn boundary immediately. Errored results don't get the hint (no false "complete" tag on a failure). 
Two structlog events for ops: - `claude.post_result_idle.deferred` — approval guard suppressed close - `claude.post_result_idle.closing_stdin` — deadline passed, stdin closed 7 new tests in test_claude_runner.py: result-event arms timer, emits turn-complete meta, skips meta on error, watchdog fires when clean, watchdog defers when pending approval, format_meta_line renders the hint when present and omits it when absent. Full suite: 2503 passed. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
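The watchdog's per-poll decision can be reduced to a pure function, sketched here under stated assumptions: the function name, the action labels, and the choice to re-arm by resetting the timestamp to `now` are illustrative, while the arm-on-result, deadline, and approval-guard rules follow the description above.

```python
def watchdog_step(result_received_at, now, timeout, has_pending_approvals):
    """Decide one poll of the post-result idle watchdog.

    Returns (action, new_result_received_at), where action is one of
    "wait", "defer", or "close_stdin".
    """
    if result_received_at is None:
        return "wait", None  # no result yet, timer not armed
    if now - result_received_at < timeout:
        return "wait", result_received_at  # deadline not reached
    if has_pending_approvals:
        # Approval guard: do not orphan an in-flight button click; re-arm.
        return "defer", now
    return "close_stdin", result_received_at  # CLI sees EOF and exits
```

In the real runner the `"close_stdin"` branch corresponds to closing the subprocess stdin, and each new `result` event would reset `result_received_at` so multi-turn sessions keep working.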
…#447) Closes #269. The four settings groups in the issue had different states: - [footer]: already loads fresh per-message via _load_footer_settings (no work) - [cost]: already loads fresh per-call inside _check_cost_budget (no work) - [watchdog]: already loads fresh per-run via _load_watchdog_settings at the top of handle_message (no work — verified, applies on next run) - [progress]: was baked in at startup via MarkdownFormatter constructor + ExecBridgeConfig.min_render_interval — this PR closes that gap Changes: - markdown.py: new MarkdownFormatter.refresh_from(progress_settings) updates max_actions + verbosity from a fresh ProgressSettings snapshot. Tolerates missing/invalid attributes (clamps negative max_actions to 0; ignores unknown verbosity values). - telegram/bridge.py: new TelegramPresenter.refresh_progress_settings() delegates to formatter.refresh_from. - runner_bridge.py: new _load_progress_settings() sibling of _load_footer_settings / _load_watchdog_settings; handle_message reads it fresh per-run, calls cfg.presenter.refresh_progress_settings(...) via duck-typed getattr (Presenter is a Protocol, so we don't add to it), and threads progress_cfg.min_render_interval into each ProgressEdits instance instead of the startup snapshot. Per-chat /verbose overrides downstream of _resolve_presenter reconstruct from the refreshed defaults. Out of scope (entry-point limitation): engine + command registration still require pipx upgrade / restart. Documented on the issue. 8 new tests in tests/test_meta_line.py: TestMarkdownFormatterRefresh covers max_actions update, verbosity update, negative clamp, invalid-verbosity rejection, missing-attribute tolerance, presenter delegation. Plus _load_progress_settings defaults / error-fallback. Full suite: 2511 passed. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
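The tolerant refresh described for `MarkdownFormatter.refresh_from` might be sketched like this. The class is a stand-in, and the set of valid verbosity values and the defaults are assumptions; the clamping of negative `max_actions`, the skipping of unknown verbosity values, and the tolerance of missing attributes follow the message.

```python
_VALID_VERBOSITY = frozenset({"compact", "verbose"})  # assumed value set


class FormatterSketch:
    """Illustrative stand-in for MarkdownFormatter."""

    def __init__(self, max_actions=5, verbosity="compact"):
        self.max_actions = max_actions
        self.verbosity = verbosity

    def refresh_from(self, progress_settings):
        """Update from a fresh settings snapshot, tolerating bad input."""
        max_actions = getattr(progress_settings, "max_actions", None)
        if isinstance(max_actions, int):
            self.max_actions = max(0, max_actions)  # clamp negatives to 0
        verbosity = getattr(progress_settings, "verbosity", None)
        if isinstance(verbosity, str) and verbosity in _VALID_VERBOSITY:
            self.verbosity = verbosity  # unknown values are ignored
```

Calling `refresh_from` at the top of each run, rather than once at construction, is what lets an edited `[progress]` table take effect without a restart.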
All 9 v0.35.3 Group 2 issues now landed on dev:
- #404: secret-scanning alert (PR #439)
- #297: /trigger → /listen rename + alias (PR #440)
- #294: master trigger pause/resume toggle (PR #441)
- #380: auto-approve scope review (PR #442)
- #438: claude_stream_idle_timeout_ms + Type-A/B classification (PR #443)
- #410: subscription usage observability + /usage debug (PR #444)
- #271: trigger visibility Tier 2 + Tier 3 (PR #445)
- #333: Claude post-result idle timeout + ✓ turn complete UX hint (PR #446)
- #269: hot-reload [progress] settings (PR #447)
Bumps to TestPyPI for staging via @hetz_lba1_bot once integration tests U1-U7 pass against @untether_dev_bot.
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps [dependabot/fetch-metadata](https://github.com/dependabot/fetch-metadata) from 2.5.0 to 3.1.0. - [Release notes](https://github.com/dependabot/fetch-metadata/releases) - [Commits](dependabot/fetch-metadata@21025c7...25dd0e3) --- updated-dependencies: - dependency-name: dependabot/fetch-metadata dependency-version: 3.1.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 7.4.0 to 8.1.0. - [Release notes](https://github.com/astral-sh/setup-uv/releases) - [Commits](astral-sh/setup-uv@6ee6290...0880764) --- updated-dependencies: - dependency-name: astral-sh/setup-uv dependency-version: 8.1.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 7.0.0 to 7.0.1. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](actions/upload-artifact@bbbca2d...043fb46) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-version: 7.0.1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.32.6 to 4.35.2. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@820e316...95e58e9) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.35.2 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
#471 + #271) (#472) * fix(at): stamp at:<token> trigger_source on /at-scheduled runs (#271) Mirror the cron:<id> / webhook:<id> footer markers added in #271 (rc4) and Tier 2/3 (rc5) so /at-scheduled runs also show provenance. at_scheduler.schedule_delayed_run wraps the captured chat context (or a fresh RunContext when the chat is unmapped) with trigger_source = "at:<token>" via dataclasses.replace. runner_bridge.handle_message's icon-prefix tuple extends from ("cron:",) to ("cron:", "at:") so the alarm-clock icon renders for both — semantically /at is a one-shot delayed cron. record_run's existing triggered=bool(context and context.trigger_source) gate picks up /at runs in the /stats triggered/manual breakdown automatically. Tests: 1 new in test_at_command.py (test_handle_stamps_trigger_source_on_mapped_chat); the existing test_handle_captures_global_default_when_unmapped extended to assert the trigger_source-only RunContext path; existing test_run_delayed_forwards_captured_context_and_engine updated since the captured context is no longer reference-equal to the original. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(gemini): pass --skip-trust by default for headless runs (#471) Gemini CLI rejects runs from any directory not in ~/.gemini/trustedFolders.json — even with --approval-mode yolo — and there is no interactive prompt path in headless usage, so projects outside the trust list silently failed before any agent output. Untether already runs Gemini with yolo for the same "always headless" reason, so passing --skip-trust extends the same precedent. GeminiRunner.skip_trust (default True) is the runtime switch; opt out per deployment with [gemini] skip_trust = false in untether.toml (security-conscious operators who want Gemini's project-local extension/MCP trust gate enforced). Bump to 0.35.3rc6 for staging. 
Tests: 2 new in test_build_args.py::TestGeminiBuildArgs (test_skip_trust_default_includes_flag, test_skip_trust_opt_out_omits_flag). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
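The `/at` provenance stamping can be illustrated with a small sketch. `RunContextSketch` and its `project` field are hypothetical (the real `RunContext` fields are not shown here), as are the two helper names; the `at:<token>` format, the use of `dataclasses.replace`, and the `("cron:", "at:")` prefix tuple come from the first commit above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RunContextSketch:
    """Illustrative stand-in for RunContext."""

    project: Optional[str] = None
    trigger_source: Optional[str] = None


_ALARM_PREFIXES = ("cron:", "at:")  # both render the alarm-clock icon


def stamp_at_source(context, token):
    """Copy the captured context (or a fresh one) with an at:<token> marker."""
    base = context if context is not None else RunContextSketch()
    return replace(base, trigger_source=f"at:{token}")


def shows_alarm_icon(context):
    """True when the run came from a cron or an /at schedule."""
    source = context.trigger_source if context else None
    return bool(source) and source.startswith(_ALARM_PREFIXES)
```

Because `replace` returns a new object, the stamped context is equal in its other fields but no longer identical to the captured one, which is why the existing forwarding test had to stop asserting reference equality.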
…sing feature coverage (#473) Audited every issue in the v0.35.3 milestone (26 issues) against the full repo documentation surface and closed the gaps. Reference issues covered: #205, #206, #207, #208, #211, #213, #269, #271, #294, #297, #333, #377, #378, #379, #380, #402, #403, #407, #409, #410, #438, #471. CHANGELOG.md - Added missing entry for #297 (/trigger → /listen rename) under ### changes. The other "milestone" issues (#224, #228, #239) were closed against v0.35.3 for tracking only — their fixes shipped in v0.35.0/v0.35.1rc2; per the repo's "no retroactive edits to prior sections" rule, they remain undocumented in CHANGELOG (closure comments cite the actual versions). /trigger → /listen rename sweep (#297) - README.md: command table row, group-chat link - docs/reference/commands-and-directives.md: command row - docs/reference/transports/telegram.md: command list + admin note - docs/reference/integration-testing.md: O3 + Q12 test rows - docs/explanation/routing-and-sessions.md: pre-routing filter section Runner specs - gemini/runner.md: --skip-trust default + opt-out via [gemini] skip_trust = false (#471) - claude/runner.md: post-result idle watchdog + "✓ turn complete" meta hint (#333), claude_stream_idle_timeout_ms config + Type-A/B classifier (#438) How-to guides - schedule-tasks.md: trigger provenance + history + /stats triggered/manual breakdown (#271 Tier 3); master pause/resume toggle (#294) - inline-settings.md: new Triggers page (#271 Tier 2 + #294) - troubleshooting.md: Type-A/B stream idle classification (#438); post-result idle watchdog + ✓ turn complete (#333) - security.md: extended path-redaction coverage (#208); Pi session dirs 0o700 (#207) - subscription-usage.md: /usage debug section (#410) - operations.md: pause status surfacing in /health (#294); /usage debug cross-link (#410); expanded hot-reload list to include [progress] (#269), [watchdog] (#333, #438), [footer], [cost] README.md - Scheduled tasks bullet: pause/resume toggle (#294); 
footer provenance markers (#271 Tier 3); /stats triggered/manual split - Inline settings bullet: 📡 Triggers page (#271, #294) - Commands table: /usage debug (#410); /listen (#297); /config Triggers page row Verified clean: - python3 scripts/validate_release.py (rc6 pre-release) - grep -rnE "/trigger\\b" docs/ README.md returns zero non-deprecation hits in production docs (test plans and historical results retain /trigger by design) - Cross-references resolve to existing anchors Plan: ~/.claude/plans/untether-you-are-running-rustling-shannon.md (also staged in .untether-outbox/v0.35.3-doc-audit-plan.md) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.2 to 4.35.3. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@95e58e9...e46ed2c) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.35.3 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…) + local-context protection (#479) * fix(security): claude runner.start no longer leaks prompt at INFO (#478) The Claude runner's run_impl override at src/untether/runners/claude.py had its own duplicate runner.start log call that was missed when the base runner was fixed for #205. Every Claude session emitted `prompt=prompt[:100] + "…"` at INFO level — leaking the first ~100 chars of the Untether preamble (boilerplate, but spec-violating). Discovered during the v0.35.3 follow-up E2E pass. Fix mirrors the base runner impl: - INFO `runner.start`: only `engine`, `resume`, `prompt_len`, `args` - DEBUG `runner.start_prompt`: preview of first 100 chars (opt-in) Argv redaction also tightened: - env -i KEY=VAL pairs redacted via redact_env_i_args (was already applied at subprocess.spawn but not at runner.start, so e.g. BWS_ACCESS_TOKEN, GEMINI_API_KEY values would land in INFO logs) - Legacy-mode (no permission_mode) `-- <prompt>` tail collapsed to `-- <prompt redacted>` so prompt content never reaches INFO under any code path 2 new regression tests cover both control-channel and legacy modes: - test_runner_start_does_not_log_prompt_at_info - test_runner_start_redacts_legacy_mode_prompt_in_args Closes #478. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(faq): add docs/faq/index.md for help-centre FAQPage schema (#477) Marketing-site infra (FAQPage extractor on `feature/help-seo-geo-items-1-4` in littlebearapps/littlebearapps.com) already extracts question-shaped H2s and emits Schema.org FAQPage JSON-LD on any help article with `category: faq` frontmatter or ≥3 question-shaped H2s. No tool currently has a dedicated FAQ scaffold; this commit closes the loop for Untether. The new file lives at docs/faq/index.md (Diátaxis-aligned scaffold — plain title + description frontmatter, marketing-site sync injects category/tool/dates). 12 question-shaped H2s exceed the 7-minimum acceptance criterion: 1. What is Untether? 2. How do I install Untether? 3. 
Which AI coding agents does Untether support? 4. Do I need an API key to use Untether? 5. Where does my code and data go? 6. How do I approve tool calls from my phone? 7. What happens if my agent crashes or my phone loses signal mid-run? 8. How do I keep agents from spending too much money? 9. Can I send voice notes instead of typing? 10. How do I update Untether? 11. How do I uninstall Untether? 12. Where can I get help or report a bug? Each answer is a complete paragraph (no TODO / placeholder), sourced from README + real common-channel topics. Cross-links to existing help-guide URLs preserve nav chains. Coordinated mapping in `littlebearapps/littlebearapps.com` (`scripts/docs-sync.config.ts` → add `untether` → `docs/faq` → `category: faq`) is a separate one-line PR per the issue's "Coordinated mapping" section. Once both land, the next nightly sync surfaces the FAQ at <https://untether.littlebearapps.com/help/untether/faq/> with a visible `<script type="application/ld+json">` FAQPage block, unlocking AI-citation surface (ChatGPT, Perplexity, Google AI Overviews) and SERP rich-snippet eligibility. Closes #477. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * ctx: protect docs/faq/index.md from deletion + register in local docs (#477) The FAQ doc is part of the marketing-site FAQPage Schema.org pipeline (littlebearapps/littlebearapps.com:scripts/docs-sync.config.ts → untether → category: faq). Removing it silently breaks the docs-sync mapping and regresses AI-citation surface. This commit hardens local Claude Code context so the file: - cannot be silently deleted, moved, or truncated by accident - has explicit guidance on when/how to update it during releases - is registered in CLAUDE.md so future contributors know it exists Changes: * `.claude/hooks/help-faq-protect.sh` (new) — PreToolUse Bash hook blocking `rm`, `git rm`, `mv`-away, and shell `>` truncation targeting `docs/faq/index.md`. 
Edits via Edit/Write/append `>>` are intentionally allowed — the FAQ is meant to evolve. Smoke-tested with 7 synthetic inputs covering both deny and allow paths. * `.claude/hooks/release-guard-protect.sh` (updated) — also protects `help-faq-protect.sh` from being weakened or removed via Edit/Write. * `.claude/hooks.json` (updated) — - registers help-faq-protect.sh under PreToolUse Bash - extends the existing Edit/Write context-prompt with a docs/faq/* branch (HELP-FAQ CONTEXT) reminding contributors of question-shape rules and the maintain-as-features-land cadence - extends the version-bump-checklist (PostToolUse) with an FAQ touch-up step * `.claude/rules/help-faq.md` (new) — auto-loads when editing `docs/faq/**`. Documents the hard rules (NEVER delete; MUST update with feature changes), soft conventions (question-shaped H2, ≥7 Q/A, real behaviour not aspirational), and the release-cadence workflow. * `.claude/rules/release-discipline.md` (updated) — adds an FAQ touch-up step to the version-bump checklist. * `CLAUDE.md` (updated) — - new "Help-centre FAQ" section after "Documentation screenshots" explaining the file's role and the no-deletion rule - Hooks table registers `help-faq-protect` - Rules table registers `help-faq.md` Refs #477. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
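A minimal sketch of the `runner.start` redaction described in the #478 commit above, under stated assumptions: `redact_env_pairs`, `redact_prompt_tail`, and `runner_start_fields` are hypothetical names written for illustration (the commit only names `redact_env_i_args`), and the `<redacted>` placeholder for env values is a guess. The `-- <prompt redacted>` tail and the four INFO fields come from the commit text.

```python
from __future__ import annotations

import re

# KEY=VALUE tokens as they appear after `env -i` on an argv.
_ENV_ASSIGN = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)=")


def redact_env_pairs(args: list[str]) -> list[str]:
    """Mask values of KEY=VALUE tokens that directly follow `env -i`.

    Hypothetical stand-in for the real `redact_env_i_args` helper.
    """
    out = list(args)
    if out[:2] != ["env", "-i"]:
        return out
    for i in range(2, len(out)):
        match = _ENV_ASSIGN.match(out[i])
        if match is None:
            break  # first non-assignment token is the actual command
        out[i] = f"{match.group(1)}=<redacted>"
    return out


def redact_prompt_tail(args: list[str]) -> list[str]:
    """Collapse everything after a bare `--` into one placeholder token."""
    if "--" not in args:
        return list(args)
    cut = args.index("--")
    return [*args[:cut], "--", "<prompt redacted>"]


def runner_start_fields(
    engine: str, resume: str | None, prompt: str, args: list[str]
) -> dict:
    """Build the INFO-level fields: no prompt text, only its length."""
    return {
        "engine": engine,
        "resume": resume,
        "prompt_len": len(prompt),
        "args": redact_prompt_tail(redact_env_pairs(args)),
    }
```

Logging only `prompt_len` at INFO keeps the event useful for sizing and debugging while the prompt preview stays behind the opt-in DEBUG event.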
Bumps pre-release version so TestPyPI can publish a fresh wheel that includes the v0.35.3 follow-up bundle merged via PR #479:

- fix(security): claude runner.start no longer leaks prompt at INFO (#478)
- docs(faq): add docs/faq/index.md for help-centre FAQPage schema (#477)
- ctx: protect docs/faq/index.md from deletion + register in local docs (#477)

The rc6 wheel on TestPyPI predates this work. Without the bump, the publish step skips ("File already exists") and the staging upgrade path keeps installing the older wheel.

Per release-discipline.md, pre-release versions don't require a CHANGELOG entry (validate_release.py skips them) and aren't tagged (auto-tag-on-master.yml skips pre-releases).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
#481) (#484) Two coordinated fixes that share the same `progress_edits.stall_detected` decision branch in `runner_bridge.py`. Reproduction: a 45-min Claude session on staging looked hung — 10-min Cloudflare deploy poll + 14-min approval-keyboard wait kept the chat silent, then surfaced unhelpful stall warnings during legitimate waits. #470 — Post-result stall suppression + closing message - New `progress_edits.stall_post_result_suppressed` info log when `stream.last_event_type == "result"` and the post-result idle watchdog (#333) is the legitimate owner of the silence - Auto-cancel `_STALL_MAX_WARNINGS` arm gated by the same boolean — no more SIGTERM'ing sessions that are about to gracefully close - Watchdog stamps `ClaudeStreamState.post_result_closed_at` before `aclose()`; bridge's heartbeat tick sends a one-shot `✓ turn complete · session closed after Nm idle` message (idempotency via `post_result_closing_sent` flag) #481 — Long-tool visibility + suppression matrix - New `[progress] heartbeat_interval` (default 30 s) drives a tick inside `_stall_monitor` that bumps `event_seq` whenever any open action is older than 60 s, forcing a re-render with a fresh elapsed-time tail - `format_action_line` gained `elapsed_seconds` kwarg; non-completed actions > 60 s render as `▸ Bash · 3m 47s · npm run build`, regardless of `/verbose` toggle - `format_verbose_detail` gained `BashOutput` (renders last line of `result_preview` so polling loops show live stdout), `KillShell`, `ScheduleWakeup` (countdown + reason), and `Monitor` (countdown) branches - `ActionState` gained `started_at` / `last_update_at` wall-clock fields populated from the new `ProgressTracker.clock` callable - `MarkdownFormatter.render_progress_parts` / `MarkdownPresenter` / `Presenter` Protocol / `TelegramPresenter` all gained `now: float | None` threaded from `runner_bridge._run_loop` - New `format_duration` / `format_countdown` helpers - Five new suppression branches in `_stall_monitor`, gated by `not 
frozen_escalate` so genuinely-frozen sessions still warn: - stall_post_result_suppressed (#470) - stall_schedule_wakeup_suppressed (engine_state.live_wakeups) - stall_monitor_active_suppressed (engine_state.live_monitors) - stall_bash_grace_suppressed (new `[watchdog] bash_grace_seconds`, default 60 s) - stall_long_bash_suppressed (BashOutput within stall_threshold/2) Bonus fix: `_register_background_handle` now reads `delaySeconds` first (per upstream Claude Code schema, #289) instead of only `delay_ms` — production deadlines were always 0.0, breaking countdown rendering. Backward-compat fallback to `delay_ms`/`timeout_ms` preserved. structlog WARN events at runner.py and runner_bridge.py are unchanged so untether-issue-watcher and ops dashboards continue to receive the underlying signals — only the chat-side surfacing decision changed. Tests: 32 new (11 in test_exec_bridge.py for suppression branches, auto-cancel gating, frozen-ring precedence, closing-message idempotency, heartbeat countdown mutation; 3 in test_claude_runner.py for delaySeconds + post-result state init; 18 in test_verbose_progress.py for new tool detail branches, format_duration helpers, long-running tail). Full suite: 2548 passed, 82.26% coverage. Integration tests: U3 (basic Claude Code) passes cleanly via @untether_dev_bot — 33 s run, zero stall warnings, "✓ turn complete" footer rendered. Long-running BashOutput-polling and 30-min genuinely-frozen tests deferred to staging dogfood. Out of scope / known constraints: - Strict 5 s rolling Bash stdout sub-line is not achievable without upstream Claude Code interim tool_result deltas. The BashOutput polling path is the proxy and refreshes at each polling cycle (~15 s in practice). - ScheduleWakeup countdown rendering depends on #289 (`/loop` interception) for the timer to actually fire; suppression of stall warnings while a wakeup is pending works today. 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
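The long-tool tail and the `delaySeconds` fix from the #481 commit above can be sketched as follows. This is an illustration, not the shipped code: the function bodies, the hour-level format, the rendering of short actions, and the helper name `delay_seconds_from_input` are assumptions. Only the `▸ Bash · 3m 47s · npm run build` shape, the 60 s threshold, and the `delaySeconds` then `delay_ms`/`timeout_ms` precedence come from the commit text.

```python
from __future__ import annotations


def format_duration(seconds: float) -> str:
    """Render an elapsed time compactly, e.g. 227 -> '3m 47s'."""
    total = max(0, int(seconds))
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}h {minutes}m"  # assumed format for hour-long actions
    if minutes:
        return f"{minutes}m {secs}s"
    return f"{secs}s"


def action_line(
    tool: str,
    detail: str,
    elapsed_seconds: float | None = None,
    threshold: float = 60.0,
) -> str:
    """Add an elapsed-time tail only once an open action exceeds the threshold."""
    if elapsed_seconds is not None and elapsed_seconds > threshold:
        return f"▸ {tool} · {format_duration(elapsed_seconds)} · {detail}"
    return f"▸ {tool} · {detail}"  # assumed rendering for short actions


def delay_seconds_from_input(tool_input: dict) -> float:
    """Prefer `delaySeconds`; fall back to the millisecond keys."""
    if "delaySeconds" in tool_input:
        return float(tool_input["delaySeconds"])
    for key in ("delay_ms", "timeout_ms"):
        if key in tool_input:
            return float(tool_input[key]) / 1000.0
    return 0.0
```

Reading only `delay_ms` when the upstream payload carries `delaySeconds` always hits the final `return 0.0`, which matches the "deadlines were always 0.0" symptom the commit describes.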
* feat(loop): add LoopSettings + EngineOverrides.loop_enabled (#289) Foundation for /loop and ScheduleWakeup support — Untether-side observation of Claude Code's session-scoped scheduling tools so loops keep firing after the subprocess exits. Default OFF — opt-in per-chat via /config → 🔁 Loop mode. src/untether/settings.py — new LoopSettings model: enabled (default false), inline_threshold_seconds (300), redundancy_check_interval (30), max_iterations (20), max_total_duration_hours (4), min_interval_seconds (60), expiry_days (7). Cost limits stay in [cost_budget] — the caps in [loop] are runaway-safety only. src/untether/telegram/engine_overrides.py — new loop_enabled field on EngineOverrides struct, threaded through normalize_overrides() and merge_overrides() following the existing budget_enabled pattern. LOOP_SUPPORTED_ENGINES = frozenset({"claude"}) — Claude-only since other engines don't expose CronCreate / ScheduleWakeup. Tests: 7 new in test_settings.py (defaults, TOML round-trip, bounds, unknown-key rejection); 5 new in test_telegram_engine_overrides.py (default None, merge topic/chat priority, ChatPrefsStore round-trip, LOOP_SUPPORTED_ENGINES constant). 76 tests pass across the changed files. Empirical pre-work in this session: Probe 4 + 4b — hanging tool_use(AskUserQuestion) does NOT cause catastrophic resume behaviour; outcome (c) confirmed. Drops the consensus-mandated interactive-state gate from PR1 scope. Probe 5 — CronCreate uses field "cron" (not "cron_expression"); CronDelete takes id; CronList renders one entry per line as "<8hex> — <human-schedule> (recurring|one-off) [session-only]: <prompt>". Dispatcher rename — Telegram management surface will be /loops (PLURAL) so /loop (singular) keeps passing through to Claude; the dispatcher in telegram/loop.py:2256–2300 matches first-word only and either fully intercepts or never. 
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(loop): add is_session_alive helper to claude runner (#289) loop_scheduler._fire (PR1) needs a cheap "is the subprocess for this session_id currently running?" check before firing a loop iteration. Spawning claude --resume against an alive subprocess would race the in-flight turn and almost certainly violate session locking. src/untether/runners/claude.py — new module-level is_session_alive(sid) that reads membership of the existing _SESSION_STDIN registry. The registry is populated when a runner spawns its subprocess and cleared in the run_impl finally block, so membership is the canonical signal of "subprocess is up right now." Tests: 2 in test_claude_runner.py (membership round-trip with cleanup, unknown session returns False). * feat(loop): add loop_scheduler module with persistence + tests (#289) Untether-side scheduler for /loop and ScheduleWakeup. Mirrors at_scheduler.py shape: 4 install globals + _PENDING dicts + install/ uninstall API. Adds: - _LoopEntry dataclass with fallback_first_user_message (text, not msg id — Gap 4 of the handover) for the <<autonomous-loop-dynamic>> sentinel fallback path. - register_pending_cron / register_pending_wakeup / bind_upstream_id for the observer hooks (wired in a follow-up commit — this commit is foundation only). - cancel_by_token / cancel_by_upstream_id / cancel_pending_for_chat with do-not-resume sentinel write on user cancel. - _fire path with race-avoidance (is_session_alive lazy import), drop-on-busy, max-iterations / max-total-duration / 7-day expiry caps, re-issue prompt wrap "Loop iteration N: ... do the task now; do not summarize old results unless necessary." (Probe 3 + consensus). - Generation counter + cancel_event so old _arm_timer tasks left over from a previous round detect they are stale and bail out instead of double-firing on the new round's scope. 
- Atomic JSON persistence to active_loops.json (sibling to config) via utils.json_state.atomic_write_json. Restart resilience: past fire_at_wallclock fires immediately (no catch-up multiplier), cancelled entries skipped on reload, do-not-resume sentinel persists. - Cron next-fire computation via existing triggers.cron.cron_matches (5-field expressions, 366-day horizon). 41 unit tests covering: install/uninstall lifecycle, registration (cron + wakeup with sentinel fallback), upstream-ID binding, cancellation paths, inspection helpers, cron parsing edge cases, fire path (cancelled / max-iter / do-not-resume / busy / race-alive / success / sentinel-fallback / one-shot expiry), persistence round-trip, restart resume + skip-cancelled, do-not-resume across restart, corrupt file handling, persistence-disabled mode. Coverage of loop_scheduler.py: 84% (above 80% threshold). NOT WIRED YET — observers in runners/claude.py and drain integration in telegram/loop.py land in subsequent commits per the v0.35.4 PR1 plan. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(loop): observe CronCreate / ScheduleWakeup / CronDelete in claude runner (#289) Wires the loop_scheduler module into the JSONL stream-translation path. Observers run as siblings of (not replacements for) the existing _register_background_handle / _clear_background_handle hooks at lines ~1028 and ~1090. Changes: - src/untether/runners/run_options.py: add `loop_enabled: bool | None` to `EngineRunOptions` so the per-chat /config → 🔁 Loop mode toggle can short-circuit observers via the existing run-options contextvar. - src/untether/telegram/loop.py: plumb `loop_enabled` from merged EngineOverrides into the resolved EngineRunOptions. 
- src/untether/runners/claude.py: - `ClaudeStreamState.first_user_message_text` (str | None) — populated from the `prompt` arg in `new_state` so loop entries can fall back to it when ScheduleWakeup observes the `<<autonomous-loop-dynamic>>` sentinel (Probe 3 result). - `_loop_enabled_for_chat(chat_id)` — resolves per-chat run-options override → global `[loop] enabled` → False fallback. Sync (no async prefs lookup; the contextvar is set upstream by executor.py). - `_observe_loop_tool_use(state, content)` — handles CronCreate / ScheduleWakeup / CronDelete tool_use blocks. Uses the canonical field names (`cron`, not `cron_expression`; `id`, not `taskId`) confirmed by Probe 5. Skips ScheduleWakeup when `delaySeconds` is at or below `[loop] inline_threshold_seconds` so short waits stay rendered live by the rc8 countdown. - `_observe_loop_tool_result(state, tool_use_id, content)` — parses `\bjob ([0-9a-f]{8})\b` from CronCreate result text and binds the upstream cron ID via `loop_scheduler.bind_upstream_id`. - Calls wired at the existing tool_use / tool_result decode sites inside `translate_claude_event`. Master-toggle gate sits at the top of the observers so OFF behaviour is identical to today. - tests/test_claude_runner.py: new `TestLoopObservation` class (10 tests) covering chat-id-unset no-op, master-toggle off, CronCreate registration, `cron` vs `cron_expression` field precedence, missing prompt rejection, ScheduleWakeup above/below threshold, CronDelete, upstream-ID binding, and `_loop_enabled_for_chat` resolution. Plus one sync test for `first_user_message_text` capture in `new_state`. All 2615 tests pass. Loop_scheduler observer wiring is now live — PR1 still default OFF; per-chat toggle UI lands in the next commit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(loop): add /config:loop sub-page + home-page button (#289) The Loop mode toggle is the user-facing master gate for /loop and ScheduleWakeup observation. 
Default OFF — opt-in per chat with an explicit cost+quota warning before turning ON. - New `_page_loop()` mirroring `_page_planmode()` shape: tri-state per-chat override (On / Off / Clear → fall back to global `[loop] enabled`), HTML body explaining behaviour ON vs OFF, "💰 Set a budget" deeplink to `config:cu` for one-tap budget setup before enabling. - Engine-aware: only renders for `LOOP_SUPPORTED_ENGINES = {claude}`; shows "Only available for Claude Code" message on other engines. - Home page (Claude only): replace the previous Plan-mode + Engine layout to slot in `🔁 Loop mode` next to `📡 Listen`, push `⚙️ Engine & model` next to `🧠 Effort`, and break `ℹ️ About` onto its own row. Codex / OpenCode / Pi / Gemini / AMP home pages are unchanged — no `config:loop` callback rendered. - Toast labels for `loop:on`/`loop:off`/`loop:clr` callbacks so early-answer dispatch shows confirmation immediately. - 7 new tests in `TestLoopMode`: page renders with toggle + cost warning + budget deeplink, hidden for non-Claude, set-on returns home, clear resets per-chat override, no-config-path branch, home-page button visibility (Claude vs Codex). All 240 config_command tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(loop): drain integration + /cancel + /new wiring (#289) Safety-critical wiring so loops survive shutdown cleanly and respond to user-initiated cancellation. - src/untether/telegram/loop.py: - Install `loop_scheduler` immediately after `at_scheduler`. Resolve `state_path` from `cfg.runtime.config_path.with_name("active_loops.json")` so loop state is persisted alongside `last_update_id.json` and `active_progress.json`. - Wire an `is_chat_busy(chat_id)` callable that scans `running_tasks` for refs in the chat — `loop_scheduler._fire` consults it to drop iterations when the chat already has a run in flight (mirrors upstream's "no catch-up" semantic). 
- Drain integration: `_drain_and_exit` now logs `pending_loops` from `loop_scheduler.active_count()` alongside `pending_at`. The task-group cancel propagates into `_arm_timer` sleeps cleanly via the cancel-event primitive added in Commit A. - src/untether/telegram/commands/cancel.py: - `handle_cancel` now also drops pending /loop entries for the chat when there's no specific reply target. Reports "❌ cancelled N active loops" alongside the existing /at handling. - `cancel_pending_for_chat` writes the do-not-resume sentinel for each cancelled loop's session_id (handover default — block only `loop_scheduler --resume`, NOT `/continue`). - src/untether/telegram/commands/topics.py: - `_cancel_chat_tasks` (called by `/new`) drops loop entries too so the "wipe a chat's state" semantics are complete. All 2622 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs(loop): document Loop mode + cost interaction (#289) Five doc files updated as the user-facing surface for Loop mode (default OFF, opt-in per chat). - docs/how-to/schedule-tasks.md: - New intro callout below H1 stating Loop mode is opt-in and pointing to the new section. - New "## Loop mode" section between /at and Telegram scheduling explaining the observe-and-fire-on-resume architecture, runaway caps, cost considerations (cache-warm vs cold per-fire ranges), cancel + persistence semantics. - docs/how-to/cost-budgets.md: - Warning callout after "Per-chat overrides" — loop fires count toward the same daily/per-run caps; set a budget BEFORE turning Loop mode on. - docs/how-to/troubleshooting.md: - New "Loop didn't fire / loop fired too many times" symptom table: toggle off, max_iterations, daily_budget_exceeded, "fresh user turn" expected behaviour, stale active_loops.json, restore failures. - docs/faq/index.md: - New H2 "Does /loop work via Untether?" answering the most-asked expected question. 
Verifies against .claude/rules/help-faq.md: 13 H2s (above floor of 7), all question-shaped, no TODOs. - docs/reference/config.md: - New `[loop]` section between `[watchdog]` and `[auto_continue]` documenting all 7 config keys plus the explicit "cost limits are NOT in [loop]" pointer to [cost_budget]. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: changelog entry for /loop + ScheduleWakeup support (#289) v0.35.4 (unreleased) entry summarising the multi-commit Loop-mode work landed under #289. Validation passes (pre-release suffix on pyproject.toml means validate_release.py skips the strict checks; the entry is forward-looking for the eventual stable release). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: re-target loop-mode PR to v0.35.3rc9 (#289) Per Nathan's correction — the /loop and ScheduleWakeup work lands inside the v0.35.3 milestone train as the next staging rc (0.35.3rc9), not v0.35.4 as the original handover suggested. Issue #289 was already correctly milestoned to v0.35.3 on GitHub. - pyproject.toml: 0.35.3rc8 → 0.35.3rc9 - uv.lock: re-synced - CHANGELOG.md: fold the loop-mode entries from a forward-looking v0.35.4 (unreleased) block into the existing v0.35.3 (unreleased) block (### changes + ### docs subsections) - docs/how-to/schedule-tasks.md: drop the stray "pre-v0.35.4" version string from the intro callout (use "prior-version baseline" instead so the prose doesn't drift on each rc) No code or test changes — full suite still 2622 passed. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: unblock dev CI — ruff SIM300 + new pip CVE ignore Two pre-existing CI failures already on dev's last run (acb6ec0). Both fixes are tiny and unrelated to loop scope: - tests/test_telegram_engine_overrides.py:235 — apply ruff's suggested rewrite of the SIM300 Yoda-condition assertion (semantically identical; literal on the left now). 
- .github/workflows/ci.yml:210 — add CVE-2026-6357 to the pip-audit ignore list. pip 26.0.1 has the CVE; fix is pip 26.1 which the uv tooling hasn't pulled yet. Sibling of the existing CVE-2026-3219 ignore from the same audit pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
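Two of the observer rules from the #289 commits above are small enough to sketch directly. The regex is the one quoted in the commit; the function names, the sample result strings in the usage note, and the handling of non-numeric input are assumptions made for illustration.

```python
from __future__ import annotations

import re

# Pattern quoted in the commit for CronCreate result text.
_JOB_ID = re.compile(r"\bjob ([0-9a-f]{8})\b")


def extract_upstream_job_id(result_text: str) -> str | None:
    """Return the 8-hex upstream cron ID, or None if the text has none."""
    match = _JOB_ID.search(result_text)
    return match.group(1) if match else None


def should_register_wakeup(
    tool_input: dict, inline_threshold_seconds: int = 300
) -> bool:
    """Register a ScheduleWakeup only when its delay exceeds the inline threshold.

    Delays at or below the threshold are left to the live countdown.
    """
    delay = tool_input.get("delaySeconds")
    if not isinstance(delay, (int, float)):
        return False  # assumed: missing or malformed delay is ignored
    return delay > inline_threshold_seconds
```

For example, a hypothetical result string such as `"Scheduled job 1a2b3c4d (recurring)"` yields `1a2b3c4d`, while a nine-character hex run does not match because the trailing `\b` requires the ID to end after exactly eight characters.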
…lanation (#486) Audited all v0.35.3 user-facing changes against the doc surface and applied succinct updates to fill the gaps left after rc1–rc9. No content rewrites — additive updates only. Reference docs: - config.md: [progress] heartbeat_interval (#481), [watchdog] post_result_idle_timeout / post_result_idle_enabled (#333) + bash_grace_seconds (#481), [gemini] skip_trust (#471), hot-reload tip on [progress] - commands-and-directives.md: heartbeat tail note, Loop-mode pointer - triggers/triggers.md: /config:tg page, /stats triggered/manual, /at footer (⏰ at:<token>), 503 paused response, /health paused, full Pause/Resume section - specification.md: version stamp v0.35.1 → v0.35.3 - runners/claude/stream-json-cheatsheet.md: ScheduleWakeup event shape (delaySeconds, reason, prompt) + CronCreate notes for the Loop observer (#289, #481) - runners/claude/untether-events.md: supplementary StartedEvent with meta={"complete": "✓ turn complete"} after successful result - runners/amp/untether-events.md: example flipped to dangerously_allow_all = false (default since #206) - runners/pi/runner.md: 0o700 session dir mode (#207) How-to: - inline-settings.md: 🔁 Loop mode page section (Claude only, cost+quota warning, 💰 Set a budget deeplink) - verbose-progress.md: long-running tool tail (heartbeat), BashOutput/ScheduleWakeup/Monitor verbose detail, hot-reload tip - webhooks-and-cron.md: full Pause and resume section, /health paused state, 503 triggers paused - troubleshooting.md: post-result closing message + stall suppression note - operations.md: hot-reload list now covers heartbeat_interval + bash_grace_seconds + trigger pause/resume - tutorials/install.md: Gemini --skip-trust headless tip Explanation: - architecture.md: TriggerManager pause/resume + triggers/history.py - module-map.md: at_scheduler.py, loop_scheduler.py, full Triggers module section Top-level: - README.md: 🔁 Autonomous loops feature row - SECURITY.md: v0.35.3 hardening bundle subsection covering #377, 
#206, #207, #378, #205/#478, #211, #208, #213, #379, #402, #380, #409 No code or test changes. The rename portion of #483 (docs/faq/index.md → docs/faq/faq.md) is deliberately deferred — the FAQ-protect hook (#477) blocks the move and is self-protected by release-guard-protect.sh, so the rename + hooks updates need manual unblock from a human. Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes #483.

Renames the FAQ file so the help-centre URL becomes `/help/untether/faq/` instead of `/help/untether/index/`. The marketing-site docs-sync derives its slug from the filename, so this rename + the matching mapping update on `littlebearapps/littlebearapps.com` are what produce the cleaner URL. AI-citation surface (ChatGPT, Perplexity, Google AI Overviews) is unaffected: the FAQPage JSON-LD schema is what they read, so the URL doesn't matter to them.

What changed in this PR:

- `git mv docs/faq/index.md docs/faq/faq.md` (no content changes)
- `.claude/hooks/help-faq-protect.sh`: rewritten to protect the new filename (Bash heredoc, since the file is self-protected from Edit/Write; temporarily disabled the hook to do the git mv, rewrote with faq.md content, restored)
- `.claude/rules/help-faq.md`: sweep to faq.md
- `.claude/rules/release-discipline.md`: sweep
- `CHANGELOG.md`: historical reference + #483 link added
- `CLAUDE.md`: both occurrences updated; historical reference kept

Out of scope (need manual touch from Nathan, because Claude Code is blocked by release-guard-protect.sh):

- `.claude/hooks.json`: two prompt-text occurrences of `docs/faq/index.md` (lines 49, 69). Non-functional (prompt guidance only), but cosmetically stale.
- `.claude/hooks/release-guard-protect.sh`: line 31 deny() error message references `docs/faq/index.md`. Also non-functional, cosmetically stale.

Both protected files only carry stale text that has zero effect on hook behaviour. The functional FAQ-protect hook itself now correctly protects `docs/faq/faq.md`.

Marketing-site follow-up (separate PR over there):

- Update `scripts/docs-sync.config.ts` `untether → docs/faq` mapping to track the new filename.
- 301 redirect from `/help/untether/index/` → `/help/untether/faq/` for backlink hygiene.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_use schema gap (#489) (#490) * fix: AskUserQuestion multi-question text-reply path no longer crashes Untether (#488) Observed live on staging (@hetz_lba1_bot, v0.35.2) on 2026-05-08 06:43:11 UTC: unhandled TypeError in route_message kills the entire process when the user answers question 1 of N via the "Other" → text-reply path. The buggy path constructed a RenderedMessage for the next question's option- button keyboard and passed it to a send_plain partial whose text: kwarg expects str, raising: TypeError: sequence item 0: expected str instance, RenderedMessage found inside markdown.assemble_markdown_parts. systemd auto-restarted in ~10s and offset_persistence.py prevented Telegram update loss, but ALL active runs across all chats were lost. Refactor: extract the multi-question continuation logic into a module-level helper send_next_ask_question_message in telegram/commands/ask_question.py that calls transport.send directly with a RenderedMessage carrying HTML parse_mode + inline_keyboard + reply_to / thread_id SendOptions. route_message calls the helper for the text-reply continuation path; the callback-button continuation path (the same file's AskQuestionCommand) still edits in place via ctx.executor.edit (unchanged). Tests: 2 new regressions in tests/test_ask_user_question.py covering the RenderedMessage shape, inline_keyboard presence, and SendOptions thread_id both with and without a forum thread. Full suite: 2624 passed, 82.32% cov. Closes #488 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: claude schema accepts server_tool_use + advisor_tool_result content blocks (#489) Anthropic server-side tools (web_search, code_execution, computer_use, …) emit `server_tool_use` content blocks in routine v2.x sessions; the parent agent's `advisor()` meta-tool emits `advisor_tool_result` blocks. 
Untether's msgspec schema didn't know either tag, so `decode_stream_json_line` raised ValidationError and the runner silently dropped the entire JSONL line — no progress action in Telegram, no entry in `state.pending_actions`, no input to verbose-mode rendering or cost tracking. Sampling 24h of staging traffic (2026-05-08) showed paired events firing across 5 different projects (auditor-toolkit, scout, brand-copilot, aushistory) and 5 sessions. Schema (src/untether/schemas/claude.py): add `StreamServerToolUseBlock` (mirrors StreamToolUseBlock: id/name/input) and `StreamAdvisorToolResultBlock` (mirrors StreamToolResultBlock: tool_use_id/content/is_error). Extend `StreamContentBlock` union; parent message bodies pick up the new types for free since they reference the union. Translate (src/untether/runners/claude.py): widen the assistant-message match arm so server_tool_use shares the existing tool_use body (_register_background_handle and _observe_loop_tool_use already filter on tool name and no-op cleanly for unrecognised server tools); widen the user-message isinstance check so advisor_tool_result shares the existing tool_result body. No new helpers, no new branches. Tests: 3 schema round-trip tests in test_claude_schema.py (test_decode_server_tool_use_block, test_decode_advisor_tool_result_block, test_decode_advisor_tool_result_block_minimal), 2 translation tests in test_claude_runner.py (test_translate_server_tool_use_block, test_translate_advisor_tool_result_block) covering pending_actions lifecycle and last_tool_use_id stamping. Full suite: 2629 passed, 2 skipped, 82.32% coverage. Closes #489 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore: staging 0.35.3rc10 (#488, #489) Bumps the dev branch to 0.35.3rc10 so the TestPyPI publish triggered by this dev push actually publishes (skip-existing would no-op at rc9). Bundles the AskUserQuestion multi-question crash fix (#488) and the server_tool_use / advisor_tool_result schema gap (#489). 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(claude-schema): allow tool_result.content to be a single dict (#501)

Claude Code emits tool_result and advisor_tool_result content blocks where the inner `content` field is a single object (e.g. `{"type": "text", "text": "..."}`) instead of the documented str / list[dict] / null shapes. msgspec's schema only allowed the documented shapes, so the line was rejected with ValidationError and silently dropped via `jsonl.msgspec.invalid` warning, losing tool tracking for that turn.

Add `dict[str, Any]` to the union on both StreamToolResultBlock and StreamAdvisorToolResultBlock. _normalize_tool_result already handles the dict shape, so no runner code change needed. Two new regression tests in test_claude_schema.py cover both block types with dict content.

Verified against staging logs (14 occurrences today) and live-tested against @untether_dev_bot: 0 msgspec errors after restart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runner): close proc.stderr after reader_done to unblock task group (#502)

When a Claude Code subprocess exits cleanly post-result event, the runner's task group can block forever waiting on `drain_stderr`. The cause: MCP server child processes inherit the parent's stderr fd and keep it open. `iter_bytes_lines` then never sees EOF, `drain_stderr` never returns, the task group never exits, and `proc.wait()` is never reached, leaving the watchdog as the only safety net (and the watchdog wrongly marks the session as cancelled/failed despite a clean rc=0).

Fix: close the parent's read end of stderr explicitly after `reader_done.set()`. `iter_bytes_lines` already catches the resulting ClosedResourceError and returns from drain_stderr, letting the task group complete and proc.wait() report rc.
Applied to both call sites:
- src/untether/runner.py (base runner, all engines)
- src/untether/runners/claude.py (Claude override has its own block)

Verified live on @untether_dev_bot:
- subprocess.exit pid=1394555 rc=0 fired immediately after result
- session.summary cancelled=False ok=True (was: cancelled=True ok=False in the #502 timeline)
- total elapsed 33s vs the 326.7s peak_idle in the bug report

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(settings): demote config.loaded INFO → DEBUG (#498)

`load_settings_if_exists()` is called per-helper (footer, watchdog, progress, auto_continue, preamble, budget) on every handle_message, so it fires 4–6 times per processed message by design (#269 hot-reload). INFO level floods structlog at ~80 events per session, triggering monitor `config_loaded_burst` alerts even though the underlying behaviour is correct.

Demote to DEBUG. The reload behaviour is preserved (config edits still apply on the next run without restart). The proper fix (caching settings within handle_message to do one parse instead of N) is deferred to v0.35.4 (#506) since it touches helper signatures and is out of bug-fix-rc11 scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(runner): break stdout read after CompletedEvent in base runner (#505)

Mirrors Claude's override (added during #502). Without the break, any non-Claude engine subprocess that emits its terminal event AND has a child inheriting the stdout fd (MCP server, backgrounded shell) blocks on iter_json_lines waiting for an EOF that never comes; proc.wait() is then never reached and the task group hangs.

Per-engine audit (codex/opencode/pi/gemini/amp) confirms each emits exactly one terminal event with no post-completion events, so the unconditional break is safe.

Closes #505.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(claude): re-emit ExitPlanMode plan body + dead-wakeup idle shortcut (#508, #507)

#508: Plan-mode research/audit runs no longer surface a short final Telegram message that just points to a plan file. Capture the ExitPlanMode plan body from tool_use.input.plan onto the new ClaudeStreamState.last_exitplanmode_plan field; the bridge re-emits it in the final answer when the post-approval result doesn't already contain it. Live impact: 5m30s scout-project research run on staging v0.35.3rc10 produced a 584-char brief acknowledgement instead of the substantive findings.

#507: ScheduleWakeup outside /loop dynamic mode no longer holds the session alive indefinitely. New parallel state.live_wakeups_arm_delay captures the original delaySeconds at arm time; _post_result_idle_watchdog cuts its effective timeout to min(timeout_s, max_armed_delay + 60s) when a wakeup is armed AND _loop_enabled_for_chat is False. Live impact: session 845cfcc3-… sat post-result idle for 58 minutes before manual /cancel.

Per CLAUDE.md (testing-conventions.md): 4 new tests in test_claude_runner.py: capture, ignore-empty (#508), dead-wakeup shortcut, /loop preserves default (#507).

Refs #507, #508. The bridge re-emit and preamble revisions for #508 ship in the next commit.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(bridge): preamble plan-mode clauses + ExitPlanMode plan-body re-emit (#508)

Layer A: _DEFAULT_PREAMBLE gains a Plan-mode requirements section:
- (A1) ExitPlanMode plan parameter MUST contain a 3-5 bullet substantive summary, never just a file path
- (A2) post-approval next assistant message MUST repeat the substantive findings (plan-body messages disappear after approval)
- (A3) ### Plan/Document Created bullet asks for inline key findings, not just a path pointer

Layer E: replace the dead-code _outline_prefix matcher in handle_message with the new _prepend_exitplanmode_plan helper that prepends the plan body (captured in state.last_exitplanmode_plan) with a 📋 Plan (approved): header + separator when the post-approval final answer doesn't already contain it. Substring-only gate (no length threshold; live repro had answer_len=584).

8 new tests in tests/test_preamble.py: A1/A2/A3 clauses present, plus 5 _prepend_exitplanmode_plan cases (short final, substring-skip, no-plan, empty, None final).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs(changelog): rc11 entries for #505, #507, #508

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore: staging 0.35.3rc11

Bumps version to 0.35.3rc11 for TestPyPI staging release. This rc bundles three monitor/bridge fixes already on this branch:
- #505: base runner _iter_jsonl_events breaks loop after CompletedEvent
- #507: dead ScheduleWakeup outside /loop no longer holds session
- #508: ExitPlanMode plan body re-emit + preamble plan-mode clauses

Local CI mirror: ruff format/check clean, 2644 tests passing, build + twine check PASSED on both sdist and wheel.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(tests): satisfy PERF401 in test_base_iter_jsonl_breaks_on_did_emit_completed

CI ruff check failed on the new #505 regression test: the local pre-flight only ran `ruff check src/` whereas CI runs the whole repo. Replaces the explicit append loop with an async-comprehension list initialiser, keeping `anyio.fail_after(2.0)` wrapping the iteration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
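The dead-wakeup shortcut from #507 reduces to one formula, min(timeout_s, max_armed_delay + 60s), applied only when a wakeup is armed and /loop is off. The sketch below shows that arithmetic in isolation. The function name, its parameters, and the `grace_s` default are illustrative assumptions; the real logic lives in `_post_result_idle_watchdog` and is not reproduced here.

```python
def effective_idle_timeout(
    timeout_s: float,
    armed_delays_s: list[float],
    loop_enabled: bool,
    grace_s: float = 60.0,
) -> float:
    """Return the post-result idle timeout after applying the shortcut.

    Hypothetical helper: only the min(...) formula comes from the commit
    message; the parameter names are assumptions for illustration.
    """
    # /loop sessions and sessions with no armed wakeup keep the default.
    if loop_enabled or not armed_delays_s:
        return timeout_s
    # Otherwise wait no longer than the longest armed delay plus a grace period.
    return min(timeout_s, max(armed_delays_s) + grace_s)
```

With a one-hour default and a single wakeup armed for 120 seconds outside /loop, the effective timeout under these assumptions drops to 180 seconds, while a wakeup armed for longer than the default leaves the default in force.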
…(rc12) (#511) Closes the cross-chat plan-body leak observed on staging v0.35.3rc11. Moves the #508 _prepend_exitplanmode_plan from the racy bridge read of runner.current_stream to the per-stream StreamResultMessage translation path in claude.py. Three new regression tests cover the per-stream prepend, concurrent-state isolation, and error-path skip. Live smoke on @untether_dev_bot confirmed #508 UX preserved.
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 4.35.3 to 4.35.4. - [Release notes](https://github.com/github/codeql-action/releases) - [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md) - [Commits](github/codeql-action@e46ed2c...68bde55) --- updated-dependencies: - dependency-name: github/codeql-action dependency-version: 4.35.4 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…516)

Closes the rc11/rc12 over-correction on #508 that produced 25k–42k char (~8–12 Telegram message) finals on staging plan-mode research/audit runs.

User report (Nathan, 2026-05-12): "I had a summary from Claude Code yesterday which was 11 Telegram messages long!! What I really want back is to have Claude Code provide summaries like we have here in command line - summaries of plans (not the entire plan), summaries of recommendations and/or findings and/or next steps (where relevant)."

Three stacked over-shoots in rc11/rc12:
1. A1 preamble: "expand the bullets into a substantive summary" for research/audit → plan body ballooned to 2–5k chars.
2. A2 preamble: "your next assistant message ... MUST repeat the substantive findings" → post-approval text ballooned to 0.5–2k chars AND was paraphrased rather than literal-copied.
3. Layer E: substring-skip rule (body in final_answer) failed on every paraphrased run, so the plan body was unconditionally concatenated in front of the post-approval text.

Evidence from `journalctl --user -u untether.service` (last 48h on staging @hetz_lba1_bot v0.35.3rc12): aushistory finals at 14k / 16k / 28k / 35k / 42k chars; scout finals at 26k / 27k chars. The 42k case matches the 11-message user repro. Telegram MCP `search_messages` for the literal "📋 Plan (approved):" returned hits on every recent plan-mode completion in both chats, confirming Layer E was the load-bearing over-firer.

rc13 retuning:
- A1 → "concise 3–5 bullet summary; plan is shown for approval, not as the final deliverable" (drops the substantive-expansion license).
- A2 → "brief CLI-style summary, 3–7 bullets or 1–2 short paragraphs, ~500–1500 chars, do NOT re-paste the full plan content".
- A3 (## Summary Plan/Document Created bullet) → "Path AND a 3–5 bullet headline summary, not a re-paste of the full content". Note: A3 affects the ## Summary block on ALL completed work, not just plan-mode runs; this is intentional and matches the user's stated goal.
- _prepend_exitplanmode_plan: substring check replaced with a length gate (`len(final_answer) < 600`). Substring check stays as a cheap belt-and-braces second skip. Plan body is capped at 1500 chars + truncation marker so a runaway body can't ship 30k chars even when Layer E does fire (preserves original #508 UX for genuinely empty post-approval results without re-introducing concatenation).

Live verification on @untether_dev_bot (test chat -5284581592):
- Primed test (with "keep it short" instruction): answer_len=882 chars (~1 Telegram message), no "📋 Plan (approved):" literal.
- Unprimed test (default research-task prompt): answer_len=1019 chars; the preamble is doing its job without user help. Layer E correctly skipped (1019 > 600). Quality verified: 3 substantive bullets + ## Summary block with Completed / Next Steps.

The original #508 fallback path (Claude exits with very short post-approval text → Layer E fires with capped plan body) is unit-tested only; not live-verified because the new preamble makes it almost impossible to repro intentionally.

Tests: 7 new/updated in tests/test_preamble.py (regression-locks the rc11 verbosity-driving phrases out of _DEFAULT_PREAMBLE, plus length-gate / body-cap / substring-skip cases) and 2 in tests/test_claude_runner.py (`test_translate_result_skips_prepend_when_answer_substantive`, `test_translate_result_caps_long_plan_body_when_prepending`).

Full suite: 2652 passed, 2 skipped, 82.38% coverage. ruff format + check clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Updates the requirements on [uv-build](https://github.com/astral-sh/uv) to permit the latest version. - [Release notes](https://github.com/astral-sh/uv/releases) - [Changelog](https://github.com/astral-sh/uv/blob/main/CHANGELOG.md) - [Commits](astral-sh/uv@0.9.18...0.11.13) --- updated-dependencies: - dependency-name: uv-build dependency-version: 0.11.13 dependency-type: direct:development ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Add "From the blog" section between Acknowledgements and Licence with links to two littlebearapps.com posts directly related to Untether (Coding from the park, Dogfooding bugs tests can't find). Uses raw HTML anchors with target="_blank" rel="noopener noreferrer" so links open in a new tab on GitHub (PyPI may strip target — acceptable). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Review skipped: auto reviews are disabled on base/target branches other than the default branch.
Summary
- Raw HTML `<a target="_blank" rel="noopener noreferrer">` anchors so links open in a new tab on GitHub (PyPI's readme_renderer may strip `target`; acceptable)
- Minor SEO/referral win for littlebearapps.com: README is mirrored on PyPI (dofollow there), drives referral traffic, and increases AI-citation surface.
Test plan
🤖 Generated with Claude Code