Problem

When running `/run-prompt --worktree --loop`, subagents can build code but can't run the full stack to verify changes end-to-end. The `--loop` verification is limited to `make build && make test && make lint`, none of which exercises a running app. There's no way for a subagent to:

- Spin up the app (backend + frontend + database + dependent services)
- Run Playwright or integration tests against the running stack
- Tear it down when done

This means verification in the loop catches compile errors and unit test failures but misses UI regressions, API integration issues, and anything that requires a running app.

Architecture: Three Layers

The clean shape is "projects define verification, executors define isolation, runners define agent entrypoints." This gives real end-to-end loop verification without locking the feature to one agent or one container layout.
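One way to picture the split between executors and runners is as two small interfaces that compose. The sketch below is illustrative only: the class and function names (`Runner`, `Executor`, `build_command`, and so on) are hypothetical and not part of daplug, and the command-line flags are taken from the runner descriptions in this issue rather than verified against each CLI.

```python
from typing import Protocol


class Runner(Protocol):
    """Agent entrypoint: turns a prompt into the agent's command line."""

    def argv(self, prompt: str) -> list[str]: ...


class Executor(Protocol):
    """Isolation layer: decides where and how a command line is executed."""

    def wrap(self, argv: list[str], worktree: str) -> list[str]: ...


class OpenCodeRunner:
    def argv(self, prompt: str) -> list[str]:
        # `opencode run` is the programmatic entrypoint named in this issue.
        return ["opencode", "run", prompt]


class ClaudeCodeRunner:
    def argv(self, prompt: str) -> list[str]:
        # Assumption: `--bare` is the stateless mode described in this issue,
        # combined with print mode (`-p`) for non-interactive use.
        return ["claude", "--bare", "-p", prompt]


class HostExecutor:
    def wrap(self, argv: list[str], worktree: str) -> list[str]:
        # No isolation: the command runs as-is on the host.
        return list(argv)


def build_command(executor: Executor, runner: Runner, prompt: str, worktree: str) -> list[str]:
    """Compose the two layers; neither needs to know about the other."""
    return executor.wrap(runner.argv(prompt), worktree)
```

With this shape, adding an agent means adding a runner, and adding an isolation mode means adding an executor, without touching the `--loop` logic itself.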
Layer 1: Verification Config (Phase 1, implement first)

A per-project `.claude/verify.json` that maps changed paths to build/test/lint/integration steps. This immediately improves `--loop` even on the host executor.

```json
{
  "version": 1,
  "components": {
    "backend": {
      "paths": ["backend/"],
      "steps": {
        "build": { "cmd": "cd backend && make build", "required": true },
        "test": { "cmd": "cd backend && make test", "required": true },
        "lint": { "cmd": "cd backend && make lint", "required": true }
      }
    },
    "frontend": {
      "paths": ["frontend/"],
      "steps": {
        "build": { "cmd": "cd frontend && npm run build", "required": true },
        "check": { "cmd": "cd frontend && npm run check", "required": true },
        "lint": { "cmd": "cd frontend && npm run lint", "required": true },
        "test": { "cmd": "cd frontend && npm run test:unit", "required": false }
      }
    }
  },
  "integration": {
    "compose_file": "docker-compose.dev.yml",
    "steps": {
      "up": { "cmd": "docker compose -f docker-compose.dev.yml up -d --wait", "required": true },
      "test": { "cmd": "cd frontend && npx playwright test", "required": true },
      "down": { "cmd": "docker compose -f docker-compose.dev.yml down", "always_run": true }
    }
  }
}
```

Key behaviors:

- `git diff --name-only` determines which components changed, only those get verified
- `required` vs optional: flaky or service-dependent tests can be optional
- [...] (`always_run`)

Layer 2: Executor Abstraction (Phase 2)

Three executor backends, selectable via `--executor`:

- `host`
- `sandbox`
- `devcontainer`

The devcontainer executor uses the Dev Container CLI, which is scriptable and doesn't require VS Code. It supports both simple container setups and Docker Compose scenarios.
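As a rough sketch of how a host executor might consume the verification config from Layer 1, the code below selects components by changed path and runs steps in order. The function names are hypothetical, and the failure semantics are an assumption rather than something this issue specifies: a failing `required` step fails the run and skips later steps, except those marked `always_run`.

```python
import subprocess
from typing import Callable


def git_changed_files(base: str = "HEAD") -> list[str]:
    """List changed paths via `git diff --name-only` (the base ref is an assumption)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def changed_components(config: dict, changed_files: list[str]) -> list[str]:
    """Return the components whose path prefixes match at least one changed file."""
    selected = []
    for name, component in config.get("components", {}).items():
        prefixes = component.get("paths", [])
        if any(path.startswith(prefix) for path in changed_files for prefix in prefixes):
            selected.append(name)
    return selected


def run_shell(cmd: str) -> bool:
    """Run one step on the host; True means exit code 0."""
    return subprocess.run(cmd, shell=True).returncode == 0


def run_steps(steps: dict, run: Callable[[str], bool] = run_shell) -> bool:
    """Run steps in declaration order and report overall success.

    Assumed semantics: once a required step fails, later steps are skipped
    unless they are marked always_run (for example, teardown).
    """
    ok = True
    for step in steps.values():
        if not ok and not step.get("always_run", False):
            continue  # skip remaining work after a required failure
        passed = run(step["cmd"])
        if not passed and step.get("required", False):
            ok = False
    return ok
```

A caller would load `.claude/verify.json` with `json.load`, pass `git_changed_files()` to `changed_components`, run each selected component's `steps`, and then run the `integration` steps. Injecting `run` keeps the step logic testable without spawning processes.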
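The `--loop` flow with devcontainer becomes:

1. Create worktree
2. `devcontainer up --workspace-folder <worktree>`
3. Execute subagent inside the container (via `devcontainer exec`)
4. Subagent has full stack running, can build + serve + Playwright test

Security Model

Modeled after Anthropic's reference devcontainer:

- Persist shell history and agent state in named volumes
- Add only `NET_ADMIN` and `NET_RAW` capabilities
- Init firewall with default-drop policy, allowlist required domains
- No Docker socket exposure to the agent
- Rootless Docker where feasible

Important caveat: Anthropic explicitly warns that even their hardened devcontainer does not stop a malicious repo from exfiltrating anything reachable in the container. This approach is recommended only for trusted repositories. "Inside Docker" is not the entire threat model: Docker's daemon runs as root unless using Rootless mode, and bind mounts are writable by default.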
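The devcontainer steps of the `--loop` flow could be driven from Python roughly as follows. This is a minimal sketch: the helper names are hypothetical, and wrapping the command in `bash -lc` assumes the container image ships bash. Only the `devcontainer up` and `devcontainer exec` subcommands with `--workspace-folder` are used.

```python
import subprocess


def devcontainer_up_argv(worktree: str) -> list[str]:
    """Command line that starts (or reuses) the dev container for a worktree."""
    return ["devcontainer", "up", "--workspace-folder", worktree]


def devcontainer_exec_argv(worktree: str, command: str) -> list[str]:
    """Command line that runs a shell command inside that worktree's container."""
    # Assumption: bash is available in the container image.
    return [
        "devcontainer", "exec", "--workspace-folder", worktree,
        "bash", "-lc", command,
    ]


def run_in_devcontainer(worktree: str, command: str) -> int:
    """Bring the container up, run one command inside it, and return its exit code."""
    subprocess.run(devcontainer_up_argv(worktree), check=True)
    return subprocess.run(devcontainer_exec_argv(worktree, command)).returncode
```

Building the argument lists separately from running them makes it easy to log or dry-run the exact commands before executing anything in the container.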
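Layer 3: Agent Runners (Phase 2)

Claude Code and OpenCode plug into the same executor model as runner adapters, rather than being special-cased in `--loop`.

Claude Code Runner

| Concern | Approach |
| --- | --- |
| Required mount | Repo/worktree. Claude reads project instructions, settings, skills, subagents from project tree and `~/.claude` |
| Persistent state | `~/.claude/` for settings/skills/subagents; optionally `~/.claude.json` for OAuth, MCP config, trust state |
| Stateless mode | `--bare` + `ANTHROPIC_API_KEY`. Skips OAuth/keychain reads; recommended for scripted/SDK calls |
| Write restrictions | Even in `bypassPermissions`, writes to `.git`, `.claude`, `.vscode`, `.idea` still prompt (except `.claude/commands`, `.claude/agents`, `.claude/skills`). Unattended loops must avoid protected targets |

OpenCode Runner

| Concern | Approach |
| --- | --- |
| Required mount | Repo/worktree + config + credentials |
| Entrypoints | `opencode run` (programmatic), `opencode serve` (headless), `opencode run --attach` (reuse backend) |
| Config paths | Global: `~/.config/opencode/opencode.json`; Project: `opencode.json` at repo root; Override: `OPENCODE_CONFIG` env var |
| Credentials | `~/.local/share/opencode/auth.json` via `opencode auth login`, or env vars, or project `.env` |
| Shared paths | OpenCode discovers `AGENTS.md`, `CLAUDE.md`, `.opencode/skills`, `.claude/skills` and their global equivalents, which enables shared repo guidance across agents |
| Permissions | Config-driven via `permission` block. For unattended runs, set project-local permission policy explicitly |

Benefits

- Subagents catch real bugs, not just compile errors
- No more "it built but the page is broken" merges from worktree prompts
- Projects define their own verification, so daplug doesn't need to know app-specific details
- Devcontainer approach gives full isolation per worktree (no port conflicts between parallel runs)
- Architecture supports multiple agents without special-casing each one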
Implementation Plan
Phase 1: Verification Config + Integration Lifecycle

- `.claude/verify.json` schema
- `executor.py` (read verify.json, fall back to current static checks)
- [...] `git diff --name-only`
- [...] `always_run` teardown)
- `required` vs optional step handling

Phase 2: Executor Abstraction + Agent Runners

- `--executor` flag to run-prompt
- [...] `--bare` mode)
- [...] `opencode run`)

Open Questions

- [...] `make verify`)?
- [...] `--sandbox` (feat(run-prompt): add --sandbox bubblewrap for Linux prompt execution #9)? Current thinking: sandbox = lightweight isolation, devcontainer = full-stack isolation. Complementary.

Context

Came up while working on youtube_summaries, which has 3 components (Go backend, Go processor, SvelteKit frontend), each with its own Dockerfile and Makefile but no unified dev compose or devcontainer.