Initial feature set #1
Merged
Bring in the standalone TypeScript scaffold (three typed seams + skeleton router + CLI + Biome/tsconfig/CI) as the project baseline. `npm run check` and `npm run build` are green. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
- vitest + @vitest/coverage-v8 (globals, node env, v8 coverage)
- vitest.config.ts (offline unit tests) and vitest.live.config.ts (SBX_LIVE-style lane via *.live.test.ts, passWithNoTests, excluded from CI)
- tsconfig types += vitest/globals so test files type-check under tsc --noEmit
- scripts: test / test:watch / test:cov / test:live; check now runs biome --error-on-warnings + tsc --noEmit + vitest run
- trivial wiring test; build still excludes *.test.ts from dist/
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/sandbox-name.ts — name = "t-<channel>-<thread_ts>", a pure function of the thread id (spec §4.1) so the router finds/reverses sandboxes with no persisted mapping. Reconciles a spec bug: §4.1's sample replaces "."->"_", but sbx v0.31.1 `--name` rejects "_" (allows [A-Za-z0-9.+-]); periods are legal, so we keep the ".". A charset test guards against reintroducing the "_". Includes a charset/length hash fallback (t-<16 hex>) and parseSandboxName that namespaces our sandboxes away from utility ones (_login-tmp -> null). 15 tests: round-trip, injectivity, charset, hash fallback, namespacing. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/state-machine.ts — total reducer (state, event) -> {state,
effects} over idle | running | runningPending (spec §4.1). No payload:
Slack is the queue, so mid-turn mentions only flip pending and the
dispatcher recomputes the delta covering all of them.
Coalescing falls out: N mentions during a turn produce exactly ONE
follow-up dispatch. 26 tests cover every state x event cell, the
coalescing count, ack-before-dispatch ordering, and a no-double-dispatch
invariant over an interleaved sequence.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
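A minimal sketch of the total reducer described above, to make the coalescing argument concrete. The event names (`mention`, `turnFinished`) and effect names (`ack`, `dispatch`) are assumptions for illustration; the commit only states the three states and the `(state, event) -> {state, effects}` shape.

```typescript
// Hypothetical names: ThreadEvent/ThreadEffect are not taken from state-machine.ts.
type ThreadState = "idle" | "running" | "runningPending";
type ThreadEvent = { type: "mention" } | { type: "turnFinished" };
type ThreadEffect = "ack" | "dispatch";

interface Step {
  state: ThreadState;
  effects: ThreadEffect[];
}

// Total: every (state, event) cell returns a result, nothing throws.
function reduce(state: ThreadState, event: ThreadEvent): Step {
  switch (event.type) {
    case "mention":
      // Every mention is acked; only an idle thread starts a turn.
      // The ack is listed before the dispatch so ordering is preserved.
      if (state === "idle") {
        return { state: "running", effects: ["ack", "dispatch"] };
      }
      // Mid-turn mentions carry no payload: they only flip the pending flag.
      return { state: "runningPending", effects: ["ack"] };
    case "turnFinished":
      // However many mentions arrived mid-turn, exactly one follow-up runs.
      if (state === "runningPending") {
        return { state: "running", effects: ["dispatch"] };
      }
      return { state: "idle", effects: [] };
  }
}
```

Under these assumptions, one mention followed by three mid-turn mentions and two turn completions yields two dispatches in total: the original turn and a single coalesced follow-up.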
src/router/dedupe.ts — keyed by the trigger's Slack `ts`, sized to Slack's ~5-min retry window (§4.1). In-memory transient state; clock is injected so eviction is testable without sleeping. 7 tests: presence within window, exact-boundary eviction, no pre-TTL eviction, size(). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
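The injected-clock idea can be sketched as follows. This is an illustration, not the contents of dedupe.ts: the class name `TtlDedupe`, the method `firstSeen`, and the choice to evict at exactly the TTL boundary are assumptions.

```typescript
// Hypothetical sketch of a ts-keyed dedupe window with an injected clock.
class TtlDedupe {
  private readonly seen = new Map<string, number>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number, now: () => number) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // True the first time a ts is seen inside the window, false for a repeat.
  firstSeen(ts: string): boolean {
    this.evict();
    if (this.seen.has(ts)) return false;
    this.seen.set(ts, this.now());
    return true;
  }

  size(): number {
    this.evict();
    return this.seen.size;
  }

  // Assumption: an entry whose age equals the TTL is already expired.
  private evict(): void {
    const cutoff = this.now() - this.ttlMs;
    for (const [ts, at] of this.seen) {
      if (at <= cutoff) this.seen.delete(ts);
    }
  }
}
```

Because the clock is a plain function, a test can advance time by assigning to a variable instead of sleeping.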
src/router/agent-state.ts — pure layer of the per-thread durable state (spec §4.2/§4.6): highWaterMark (lexicographic Slack-ts ordering), computeDelta (after-hwm, minus bot posts), JSONL serialize/parse, and the ~/.agent-state path constants. Documents the §4.2 "minus the trigger" reconciliation: the previous trigger is already <= hwm (in the ledger) and the current trigger carries the request and must be fed (§4.1 "the delta covers all of them"), so the only author dropped is the bot itself. 13 tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
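A rough sketch of the two pure functions named above. The field names on `ThreadMessage` (`ts`, `user`, `text`) and the exact signatures are assumptions; only the behaviour (lexicographic ts ordering, after-hwm, minus bot posts) comes from the commit message.

```typescript
// Assumed message shape; the real ThreadMessage type may differ.
interface ThreadMessage {
  ts: string; // Slack timestamp, e.g. "1780269886.030339"
  user: string;
  text: string;
}

// Slack ts strings are fixed-width "secs.micros", so plain string comparison
// orders them chronologically (as long as the seconds part keeps 10 digits).
function highWaterMark(ledger: readonly ThreadMessage[]): string | null {
  let hwm: string | null = null;
  for (const m of ledger) {
    if (hwm === null || m.ts > hwm) hwm = m.ts;
  }
  return hwm;
}

// Everything strictly after the high-water mark, minus the bot's own posts.
// The current trigger is kept on purpose: it carries the request.
function computeDelta(
  thread: readonly ThreadMessage[],
  hwm: string | null,
  botUserId: string,
): ThreadMessage[] {
  return thread.filter((m) => (hwm === null || m.ts > hwm) && m.user !== botUserId);
}
```

With an empty ledger the high-water mark is `null` and the delta is the whole thread minus bot posts, which matches the whole-thread primer case.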
src/router/log.ts — format-agnostic boundary logging (spec §4.9): a common envelope with in/out/router directions and turnId correlation; pure builders (turn.in, turn.out.chunk, turn.exit, router.*); summary vs verbose levels that cap/hash large prompts and argv so whole-thread primers and tool-output bursts can't flood the log. Sinks are best-effort and never throw — a full disk or slow shipper must never stall a turn: createJsonlSink (injected writer), createRecordingSink (tests), nullSink, and a safeWrite wrapper. 17 tests, incl. a guard that no credential-shaped field is ever emitted (§4.7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/sandbox/sbx-argv.ts — argv construction for the sbx CLI (verified vs v0.31.1, spec §4.4/§4.5), isolated from the I/O shell so it's testable without the daemon. Bakes in the load-bearing rules: --clone on create (never the host tree); exec wrapped in `bash -c` with `--` and NO `-i` (§4.5/§9.15); a PTY wrap option for codex's headless empty-output regression (§9.14). Includes ls --json parsing (shape confirmed live), cp/stop/rm, and secret ls/set/rm for onboarding. shellQuote/shellJoin use POSIX single-quoting; a test round-trips a hostile argv table (newlines, $(), backticks, quotes, emoji) through real bash. 18 tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
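For readers unfamiliar with POSIX single-quoting, a minimal sketch of the technique follows. It assumes every argument is quoted unconditionally; the real shellQuote/shellJoin in sbx-argv.ts may skip quoting for safe arguments or differ in other details.

```typescript
// Inside single quotes the shell treats every byte literally, so the only
// character that needs handling is the single quote itself: close the quote,
// emit an escaped quote, reopen the quote ('\'').
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Join an argv into one string suitable for `bash -c`.
function shellJoin(argv: readonly string[]): string {
  return argv.map(shellQuote).join(" ");
}
```

Newlines, `$()`, and backticks pass through untouched because nothing is expanded inside single quotes, which is why the hostile-argv table can round-trip.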
src/backend/codex.ts — turnArgs/parseResult/parseSessionId/captureSessionId for codex 0.135.0 (spec §4.5/§4.9). All pure off captured bytes except captureSessionId, whose one shell call is injected.
- turnArgs sets sandbox_mode + approval_policy via `-c` (uniform across `exec` and `exec resume`, since resume lacks `-s`); no --ephemeral, no -i; `--` guards the prompt.
- parseResult tolerates several --json event shapes + non-JSON lines, and treats empty output as a failed turn (the §9.14 headless regression).
- parseSessionId reads the id straight from the turn-1 stream; captureSessionId is the rollout-file fallback (newest by mtime, no shell pipeline; $HOME expands in the raw command).
Extends CodingBackend with optional parseSessionId. Fixtures are best-guess shapes, replaced with real bytes in the live pass (C8.5). 32 tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/agent-state-store.ts — reads/writes the ~/.agent-state ledger, session id, and thread reverse-map inside each thread's sandbox (spec §4.6). The router keeps no state of its own. Depends on a narrow SandboxFsLike (homeDir/readFile/writeFile), not the full provider, so it fakes trivially; appendDelta is write-after-success read-modify-write with a single writer (§4.1/§4.2). 9 tests over an in-memory fake fs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/sandbox/sbx-provider.ts — spawns `sbx <argv>` and drains; all argv
lives in sbx-argv.ts. exec drains stdout+stderr to exit, closes stdin
(§9.15), and streams chunks to an onChunk callback for boundary logging
(§4.9). Adds execShell (raw shell), list (parses the {sandboxes:[…]}
wrapper confirmed live), getFile/putFile via cp+host-tempfile, and
homeDir/readFile/writeFile that structurally satisfy SandboxFsLike (no
upward imports).
Extends the SandboxProvider interface with execShell + per-call streaming
options. 14 tests with an injected fake spawn, incl. a virtual sandbox FS
that round-trips get/putFile through real temp files.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/dispatcher.ts — ties the three seams together for a single turn (spec §4.1/§4.5/§4.9): resolve-or-create sandbox by deterministic name, turn-1-vs-resume by the session file (§4.5), whole-thread primer vs post-hwm delta (§4.2), exec with a stdout+stderr-draining reader that logs at the boundary (§4.9), parseResult, then write-after-success — session persisted BEFORE the reply, transcript appended AFTER it (§4.6) — and a 👀→✅/❌ swap. dispatchTurn never rejects so the router's coalescing always advances. This is the logical stand-in for the un-runnable E2E: 13 tests over fakes for all seams cover turn-1 ordering, sandbox reuse, session-id capture + fallback, follow-up resume + delta boundary, failure (no persist, re-feed), empty-delta skip, abandon-on-throw, and verbose chunk logging. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/router.ts — the plumbing (spec §4.1): ts-dedupe, the 👀 ack on every mention, the idle/running/pending FSM, and a per-thread mutex so the state read-modify-write is atomic. Dispatch is fire-and-forget so the turn's reader outlives it and the router is free for other threads; on completion the FSM advances, coalescing mid-turn mentions into one follow-up. All maps are memory-only (reset on restart by design). Replaces the skeleton with a barrel re-exporting Router + Dispatcher. 8 tests with a controllable fake dispatcher: dedupe, pending-no-dispatch, 👀-every-mention, coalescing to one follow-up, finish-boundary atomicity, restart. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/platform/slack-map.ts (pure, tested) — event→Mention and message→ThreadMessage: a root mention uses its own ts as the thread id, a reply uses the parent thread_ts, and bot/own posts never trigger (§4.1). src/platform/slack.ts (I/O shell) — PlatformAdapter over @slack/socket-mode + @slack/web-api: ACK the envelope immediately on receipt and run the handler async, drop retries (retry_num>0) to avoid double-dispatch (§4.1); fetchThread paginates conversations.replies with oldest=sinceTs for the exact delta; reactions are idempotent (ignore already_reacted/no_reaction). 10 mapping tests; the shell is covered by typecheck + build and verified live (C17). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/cli/onboarding.ts — router-guided onboarding (§4.8): pure detectMissingCredential/credentialService/onboardingInstructions + ensureOnboarded over an injected `sbx secret ls`. Defaults Codex to the OAuth/browser path (the chosen flow). src/cli/args.ts — pure parseArgs + USAGE (env documented). src/cli/index.ts — wires SbxProvider + CodexBackend + SlackAdapter + Dispatcher + Router behind the onboarding gate, resolves the bot user id (whoAmI) before constructing the dispatcher so bot posts are filtered, and opens Socket Mode. Boundary logs go to stdout + a JSONL file in the data dir. Adds SbxProvider.secretLs. Verified live against real sbx: `-h` prints usage; with no credential the gate runs `sbx secret ls`, detects the missing openai secret, and prints the OAuth steps. 10 tests for args + onboarding. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
AGENTS.md — the rules agents follow in this repo: router-is-plumbing (not an LLM), the build/check gate, ESM import rules, the three-seams invariants (transport≠interpretation, no central store, pure-core vs I/O-shell), the sandbox-naming no-underscore rule, boundary-logging discipline, the router lifecycle (immediate ACK, reactions-not-replies, coalescing, reader outlives dispatch, write-after-success), Codex/exec gotchas, test/commit conventions, and the §6 "don't relitigate" + §7 out-of-scope lists. CLAUDE.md points to AGENTS.md. README rewritten for the implemented system (architecture, prerequisites, run instructions, env, commands, layout). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/router/idle-sweep.ts — periodic sweep (spec §4.6): sbx ls running sandboxes, reverse each name to its thread (deterministic name, else the ~/.agent-state/thread reverse-map), read last-message time from Slack, and sbx stop the idle ones. Pure selectIdle/tsToMs/parseThreadRef + best-effort I/O; unknown activity is never evicted and foreign sandboxes (_login-tmp) are never touched. The router never auto-rm's. Wired into the CLI to sweep hourly (24h idle window). CI already runs the full Vitest suite via `npm run check`, so no workflow change is needed. 7 tests incl. hash-name reverse lookup. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
src/sandbox/sbx-provider.live.test.ts — SBX_LIVE-gated, excluded from CI;
creates a credential-free `shell` sandbox and exercises the real
SbxProvider: exec via agent-argv + bash -c (§9.9), homeDir ($HOME),
writeFile/readFile (+ missing→null), getFile/putFile cp round-trip, and
lossless stop→exec auto-restart (§4.6). 6/6 pass on sbx v0.31.1 (~25s).
Also confirmed live: sbx ls --json shape is {"sandboxes":[…]} (parseLsJson
already handles it); default network policy must be `balanced` (§9.3); in-VM
$HOME is /home/agent. `npm run test:live` now sets SBX_LIVE=1.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Live verification against codex 0.130.0 in an sbx microVM surfaced two findings, now fixed and committed:
1. §9.15 stdin hang (would have hung EVERY turn): `sbx exec` keeps the VM process's stdin open even when the host closes its end, so `codex exec` blocks on "Reading additional input from stdin..." forever. Closing host stdin is necessary but not sufficient — execArgv now redirects the in-VM command's stdin from /dev/null (`<cmd> < /dev/null`), default on, off under pty. This is exactly the failure boundary logging is meant to catch.
2. Real --json schema confirmed: thread.started carries thread_id (=session id; parseSessionId verified live), plus turn.started/error/turn.failed. parseResult now extracts the codex error message so the dispatcher relays it (e.g. a usage limit) instead of a generic failure. codex-error.jsonl is now REAL captured bytes.
Also documented (AGENTS.md/codex.ts): bash -c PATH works (§9.9); the --clone workspace is empty under exec (repo is RO at /run/sandbox/source) so provisioning must clone it (§4.3); subscription hit its usage cap (§9.7), so the success-path fixture + session-resume capture await an API key / reset. 163 tests pass.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
First real Slack mention drove the full pipeline (mention.received → 👀 →
dispatch) but the turn was abandoned by a bug the boundary log caught:
`sbx ls --json` prepends a human-readable "Starting sandboxd daemon..."
line to stdout when it auto-starts the daemon, so JSON.parse threw
"Unexpected token 'S'". parseLsJson now parses from the first {/[ and never
throws (-> [] on no JSON). homeDir defensively takes the last stdout line
too. 165 tests (+2). Verified live: mention reached the router and 👀'd.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
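The tolerant parse described above might look roughly like this. It is a sketch under assumptions: the return type is left as `unknown[]` because the per-sandbox fields are not listed in this thread, and the function name carries a `Sketch` suffix to make clear it is not the committed parseLsJson.

```typescript
// Skip any human-readable preamble (e.g. "Starting sandboxd daemon...") by
// parsing from the first '{' or '['. Never throws: any failure yields [].
function parseLsJsonSketch(stdout: string): unknown[] {
  const start = stdout.search(/[{\[]/);
  if (start < 0) return [];
  try {
    const parsed: unknown = JSON.parse(stdout.slice(start));
    if (Array.isArray(parsed)) return parsed;
    if (typeof parsed === "object" && parsed !== null) {
      // The wrapper shape confirmed live is {"sandboxes":[...]}.
      const list = (parsed as { sandboxes?: unknown }).sandboxes;
      return Array.isArray(list) ? list : [];
    }
    return [];
  } catch {
    return [];
  }
}
```

One limitation of this sketch: a preamble line that itself contains `{` or `[` would still defeat the parse and return `[]`.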
Second mention got past parseLsJson to sandbox.create, which then failed:
sbx uses the name as the container HOSTNAME, and "t-...-1780269886.030339"
fails Docker's hostname validation ("value must be a valid hostname").
Isolated live: the DOT is rejected (the case is fine). So neither '.' (spec
§4.1's intent) nor '_' (its sample) is usable.
Encode thread_ts's '.' as '-': name = t-<channel>-<secs>-<micros>, charset
[A-Za-z0-9-], still injective + reversible (channel has no '-'; ts is two
numeric groups). isCharsetSafe now rejects both '_' and '.'. Verified live:
`sbx create --name t-C0B77337UGM-1780269886-030339` succeeds. 165 tests.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
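The final naming scheme can be illustrated with a short sketch. It covers only the direct encoding and its inverse; the hash fallback (`t-<16 hex>`) is omitted, and the exact signatures in sandbox-name.ts are assumptions.

```typescript
// Charset that survives hostname validation per the finding above.
const SAFE_NAME = /^[A-Za-z0-9-]+$/;

// name = t-<channel>-<secs>-<micros>: the '.' in thread_ts becomes '-'.
function sandboxName(channel: string, threadTs: string): string {
  return `t-${channel}-${threadTs.replace(".", "-")}`;
}

// Inverse of sandboxName. Unambiguous because the channel id contains no '-'
// and the ts is exactly two numeric groups. Foreign names return null.
function parseSandboxName(
  sandbox: string,
): { channel: string; threadTs: string } | null {
  const m = /^t-([A-Za-z0-9]+)-(\d+)-(\d+)$/.exec(sandbox);
  if (m === null) return null;
  const [, channel = "", secs = "", micros = ""] = m;
  return { channel, threadTs: `${secs}.${micros}` };
}
```

For the thread in this commit, `sandboxName("C0B77337UGM", "1780269886.030339")` gives `t-C0B77337UGM-1780269886-030339`, and `parseSandboxName("_login-tmp")` gives `null`.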
Third live mention created the sandbox + ran codex, which exited "Not inside a trusted directory" — codex ran in the empty /home/agent/workspace (the --clone source is a read-only mount at /run/sandbox/source). Fix: the dispatcher now clones the source into $HOME/repo (idempotent), seeds a non-base branch (slack/work), and runs the agent there via a new exec cwd (-w). turnArgs adds --skip-git-repo-check so codex never blocks headlessly. The §9.15 stdin fix is confirmed working (codex no longer hangs). 167 tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Restructure around how to use the bot — what you can do (mention → code → reply in-thread, iterate, run in parallel, resume after idle, ship a PR, grab a file), what the 👀/✅/❌ reactions mean, and a step-by-step setup (sbx + credentials + Slack app + run). The development section (architecture, commands, layout, link to AGENTS.md) is kept below it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…d Slack setup
- Remove the "Grab a file" bullet: artifact upload is Phase 2 (the dispatcher does not handle file requests yet), so it was overpromising.
- Rewrite the Slack-app setup as numbered, followable steps with the in-app navigation, for people new to Slack apps. Keeps our env-var names (APP_SLACK_APP_TOKEN / APP_SLACK_BOT_TOKEN) and the scopes/events the bot actually uses — including reactions:write and only the app_mention event subscription (the adapter ignores other events).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Reviewer flagged the Docker Desktop dependency. Verified empirically with Docker Desktop fully shut down (CLI unreachable, process gone): `sbx create` and `sbx exec` both work — sbx runs its own bundled container runtime (~/.sbx/run/d/docker.sock) and drives the microVM via the macOS hypervisor directly. Updated the README requirements + sbx setup, and the test:live notes in README/AGENTS.md, to drop the Docker Desktop dependency. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Lift the SandboxShellExecutor interface out of codex.ts into backend/index.ts so a second backend (Claude Code, next) can depend on it without importing codex.ts — keeping the two backend modules decoupled. Pure move: no behavior change. codex.ts drops the now-unused ExecResult import; codex.test.ts imports the type from ./index.js. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Hand-authored sample bytes for the upcoming Claude backend parser, mirroring the codex fixtures: a successful stream-json turn, an errored turn, the 0-byte empty-output case, and sample session-file find/filename output. Documented placeholders to be recaptured live against a real sbx `claude` sandbox. Inert for now (no parser references them yet); the fixtures README gains a per-backend structure. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…ionId)
Implements CodingBackend for Anthropic's Claude Code CLI, mirroring codex.ts:
pure parsers off the captured `claude -p --output-format stream-json --verbose`
bytes plus one injected shell call for the session-id fallback. Not wired into
the CLI yet (next commit), so the app still runs codex-only.
- claudeTurnArgs: `claude -p --output-format stream-json --verbose
--dangerously-skip-permissions [--resume ID] -- PROMPT`. --verbose is required
with stream-json in print mode; the prompt is a positional (stdin is /dev/null'd,
§9.15) guarded by `--`; --dangerously-skip-permissions is the non-blocking analog
of codex's approval_policy=never (the microVM is the isolation boundary).
- parseClaudeResult: the authoritative `{"type":"result",result,is_error}` event,
falling back to the last assistant text; empty output -> failed turn (codex §9.14
parity); is_error relays the error text/subtype with ok=false.
- parseClaudeSessionId: the session_id every stream line carries (init is first).
- captureSessionId fallback: newest ~/.claude/projects/**/<session-id>.jsonl.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
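A simplified sketch of the result-event handling listed above. It only handles the `{"type":"result",result,is_error}` event and the empty-output case; the fallback to the last assistant text is left out because the assistant event shape is not given in this thread. The `TurnResult` type and the failure messages are assumptions.

```typescript
// Hypothetical result type; the real CodingBackend result may carry more.
interface TurnResult {
  ok: boolean;
  text: string;
}

function parseClaudeResultSketch(stdout: string): TurnResult {
  let last: TurnResult | null = null;
  for (const line of stdout.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    let ev: unknown;
    try {
      ev = JSON.parse(trimmed);
    } catch {
      continue; // tolerate non-JSON lines in the stream
    }
    if (typeof ev !== "object" || ev === null) continue;
    const rec = ev as Record<string, unknown>;
    if (rec["type"] !== "result") continue;
    const result = rec["result"];
    // Key on is_error only: subtype can read "success" on a failed turn.
    last = {
      ok: rec["is_error"] !== true,
      text: typeof result === "string" ? result : "",
    };
  }
  if (last !== null) return last;
  // Empty output (or no result event in this simplified sketch) is a failure.
  return { ok: false, text: stdout.trim() === "" ? "empty output" : "no result event" };
}
```

Keying on `is_error` rather than `subtype` is the point of the design: a turn is reported as failed whenever the CLI says so, regardless of what the subtype string claims.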
The composition root already computes `agent` from SCA_AGENT and threads it into sbx create + the onboarding gate; the only remaining hardcode was CodexBackend. Construct ClaudeBackend vs CodexBackend off `agent`. The provider structurally satisfies SandboxShellExecutor for both. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Verified against Claude Code 2.1.141 in an sbx `claude` microVM (sbx v0.31.1, image docker/sandbox-templates:claude-code-docker):
- the backend argv parses (`-- PROMPT` is the prompt; the run reaches the API); `--output-format stream-json` errors without `--verbose`;
- `--resume` reuses the session id unless `--fork-session` (capture-once holds);
- `claude` resolves on the non-login `bash -c` PATH the provider uses;
- `is_error:true` can ship with `subtype:"success"` — parseClaudeResult keys on is_error, confirmed against the real bytes;
- captureSessionId's find command returns the real ~/.claude/projects/<slug>/<session-id>.jsonl transcript.
claude-error.jsonl is now REAL captured bytes (an auth-failure turn). The error test asserts the relayed message + the session id from those bytes. The success fixture stays a placeholder pending an `anthropic` credential to capture live.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The bot already prints a setup guide and exits when the agent's credential is missing; this makes the Claude guide a single command. `sbx run claude` auto-creates a login sandbox and attaches (verified: `sbx run` "creates the sandbox if it does not already exist") — drop the separate `sbx create _login-tmp` + `sbx run` + `sbx rm` dance. The credential is captured host-globally, so the throwaway sandbox can be removed afterward. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…-end
End-to-end against Claude Code 2.1.141 in a --clone sbx claude sandbox with a provisioned repo (the dispatcher's exact flow):
- turn 1 (real backend argv) created a file via the Write tool with --dangerously-skip-permissions and NO approval prompt; result is_error:false, result:"done", non-empty stdout headless;
- turn 2 `claude -p --resume <id>` recalled a planted codeword from session history — the SAME session id across both turns (no fork), confirming the capture-once/resume design live.
claude-stream.jsonl is now REAL captured bytes (init → rate_limit_event → assistant thinking → tool_use → tool_result → assistant text → result), so the parser is verified against a real success AND error stream (the success stream exercises the thinking/tool_use skip). Aligned the session-file fixtures to the captured id; tests updated.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Rebrand from "Slack Coding Agent" to Mishu — a coding agent that lives in your team's chat (Slack today). Renames the package + bin name, the CLI usage banner, the [mishu] log prefix, and the SCA_* env vars → MISHU_* (SCA literally stood for "Slack Coding Agent"). The README/AGENTS intros are reframed to the platform-agnostic positioning and now mention "Codex or Claude Code". Genuinely Slack-specific names (SlackAdapter, the APP_SLACK_* tokens, the Slack app setup) and the design-spec attachment path are intentionally unchanged. BREAKING: runtime env vars are now MISHU_REPO / MISHU_AGENT / MISHU_LOG_LEVEL / MISHU_BOT_USER (were SCA_*). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove the `.context/attachments/.../slack-coding-agent-spec.md` path from README and AGENTS (a gitignored attachment carrying the old name). AGENTS keeps a one-line note that § refs point to the design spec; README's Status section keeps the roadmap without the path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
One command checks the selected agent's sbx credential and, if it's missing, runs the interactive auth flow with the terminal attached — browser OAuth for Codex (`sbx secret set -g openai --oauth`), in-sandbox `/login` for Claude (`sbx run claude`) — then confirms. Idempotent. The pure decision lives in onboarding.ts (`setupCommand`); src/cli/setup.ts is a thin I/O shell reusing SbxProvider.secretLs + detectMissingCredential. The bot's onboarding gate and the README now point to `npm run setup`. Runs via tsx, so no build needed. Verified live: the idempotent "already configured" path for both codex and claude. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`npm run setup` now resolves the agent per spec §4.8: MISHU_AGENT if set, else an interactive [codex/claude] prompt on a TTY, else codex (so piped/CI runs never block on a prompt). Adds a pure parseAgent() helper, shared with the composition root (replacing its inline ternary) + a unit test. Also guards the interactive auth flow: when the credential is missing and stdin isn't a TTY, print a clear "run it in a terminal" message and exit instead of launching the browser / `sbx run claude` flow headlessly (which hangs). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
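The resolution order (env var, then TTY prompt, then codex) can be sketched as a pure decision. `parseAgent` is named in the commit, but its signature here is a guess; `chooseAgent` and `AgentChoice` are hypothetical, and treating an unrecognised MISHU_AGENT value as unset is also an assumption.

```typescript
type Agent = "codex" | "claude";

// Assumed behaviour: normalise the raw value, null when it names no agent.
function parseAgent(raw: string | undefined): Agent | null {
  const v = (raw ?? "").trim().toLowerCase();
  return v === "codex" || v === "claude" ? v : null;
}

// Hypothetical helper: decide without doing any I/O, so it stays testable.
type AgentChoice = { kind: "use"; agent: Agent } | { kind: "prompt" };

function chooseAgent(envValue: string | undefined, stdinIsTty: boolean): AgentChoice {
  const fromEnv = parseAgent(envValue);
  if (fromEnv !== null) return { kind: "use", agent: fromEnv };
  // Only ask interactively when a human can answer.
  if (stdinIsTty) return { kind: "prompt" };
  // Piped/CI runs never block on a prompt: default to codex.
  return { kind: "use", agent: "codex" };
}
```

A caller would pass `process.env.MISHU_AGENT` and `process.stdin.isTTY === true`, and run the interactive prompt only when the result is `{ kind: "prompt" }`.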
Claude auth logs in via a sandbox; name it deterministically (`sbx run --name mishu-login claude`) so setup can `sbx rm --force` it once the credential is captured — no more stray login sandbox. Best-effort: a failed cleanup just prints the manual command, never fails setup. The manual onboarding instructions use the same named sandbox + an explicit cleanup step. Adds loginSandbox() + tests. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The Claude backend duplicated, byte-for-byte, four generic helpers that already lived in codex.ts: the `isRecord`/`asString` JSON type guards, the JSONL-stream event parser, and the "newest path by mtime from `find -printf` output" selector. None carry agent-format knowledge, so hoist them into a new pure `src/backend/parsing.ts` (with colocated tests) and have both backends import them — finishing the sharing the diff already began for `SandboxShellExecutor`. Each backend keeps all of its agent-specific format knowledge (argv shapes, result/event interpretation, the session-id field + transcript/rollout filename regexes, and the `find` path+glob constants), preserving the §3 invariant that agent CLI/output format lives only in the backend file. Also tighten both filename-regex guards to `match?.[1] ?? null`. Net −130 lines across the two backends + their tests, replaced by one shared 54-line module. `npm run check` green (190 tests). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
`hashSandboxName` separated the channel from the thread_ts with a literal
NUL (0x00) instead of a space — invisible in editors, and enough to make git
classify the whole `.ts` as binary (no line diffs, blame, or textual merges).
Replace it with a real space, matching the `${channel} ${threadTs}` separator
the idle-sweep/dispatcher reverse-map already uses.
No behavior tests pin a hash value; determinism, charset-safety, and the
cross-boundary collision guard hold for any separator outside the channel
charset, so the suite is unchanged (190 tests green).
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
$HOME is invariant for a sandbox's lifetime, but `AgentStateStore.abs()` re-resolved it via a full `sbx exec` round-trip on every state read/write — about five per turn (readSessionId, the dispatcher's direct call, writeSessionId, and appendDelta's two). Memoize it per sandbox in the provider, the single place that pays the exec, so every caller benefits with no signature or behavior change; the first lookup pays the round-trip and the rest are Map hits. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
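A sketch of the per-sandbox memoization, assuming the provider delegates to an injected async resolver. `HomeDirCache` and its constructor argument are illustrative names, and dropping a failed lookup from the cache is a design choice of this sketch, not something the commit states.

```typescript
// Hypothetical wrapper: caches the in-flight promise, so concurrent callers
// for the same sandbox also share a single exec round-trip.
class HomeDirCache {
  private readonly cache = new Map<string, Promise<string>>();
  private readonly resolveHome: (sandbox: string) => Promise<string>;

  constructor(resolveHome: (sandbox: string) => Promise<string>) {
    this.resolveHome = resolveHome;
  }

  homeDir(sandbox: string): Promise<string> {
    let hit = this.cache.get(sandbox);
    if (hit === undefined) {
      hit = this.resolveHome(sandbox).catch((err: unknown) => {
        // Assumption: do not memoize failures, so a later call can retry.
        this.cache.delete(sandbox);
        throw err;
      });
      this.cache.set(sandbox, hit);
    }
    return hit;
  }
}
```

The first lookup for a sandbox pays the round-trip; later lookups return the same promise from the Map.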
Add a version-controlled pre-commit hook wired via core.hooksPath (set by the `prepare` script on `npm install`, so no extra dependency). It runs the single gate — Biome + tsc + Vitest — before every commit; bypass with `--no-verify`. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Soften the "outer bot" guidance to out-of-scope-by-default, open to a strong use case (was framed as a prototype mistake the design removed).
- Remove the design-spec § section references (the spec isn't committed).
- Drop the "Don't relitigate / out of scope for v0" section.
- Document the new pre-commit hook under Build / check / test.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The design spec won't be committed, so strip its § section citations from comments and test names across the backend, router, sandbox, platform, and cli modules. Comment/string-only — no logic change; `npm run check` stays green. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Add a driving-philosophy note (inspired by pi-mom and Shopify's River; cites Tobi Lütke's post on working in the open) and remove the trailing Status section. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Tighten router and sandbox failure handling, simplify docs and comments, add OSS metadata, and clean up the test suite with more meaningful coverage. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Wire MISHU_SANDBOX_TEMPLATE into sbx create options, normalize empty values, and document sandbox governance as operator-managed. Co-Authored-By: Codex <codex@openai.com>
Add the README image asset and include assets in package files so the referenced image is available when published. Co-Authored-By: Codex <codex@openai.com>
No description provided.