feat(run-prompt): devcontainer-based verification environment for --loop subagents

## Problem

When running `/run-prompt --worktree --loop`, subagents can build code but can't **run the full stack** to verify changes end-to-end. The `--loop` verification is limited to `make build && make test && make lint`: build, unit-test, and lint checks that never start the app. There's no way for a subagent to:

1. Spin up the app (backend + frontend + database + dependent services)
2. Run Playwright or integration tests against the running stack
3. Tear it down when done

This means verification in the loop catches compile errors and unit test failures but misses UI regressions, API integration issues, and anything that requires a running app.

---

## Architecture: Three Layers

The clean shape is **"projects define verification, executors define isolation, runners define agent entrypoints."** This gives real end-to-end loop verification without locking the feature to one agent or one container layout.

### Layer 1: Verification Config (Phase 1 — implement first)

A per-project `.claude/verify.json` that maps changed paths to build/test/lint/integration steps. This **immediately improves `--loop`** even on the host executor.

```json
{
  "version": 1,
  "components": {
    "backend": {
      "paths": ["backend/"],
      "steps": {
        "build": { "cmd": "cd backend && make build", "required": true },
        "test": { "cmd": "cd backend && make test", "required": true },
        "lint": { "cmd": "cd backend && make lint", "required": true }
      }
    },
    "frontend": {
      "paths": ["frontend/"],
      "steps": {
        "build": { "cmd": "cd frontend && npm run build", "required": true },
        "check": { "cmd": "cd frontend && npm run check", "required": true },
        "lint": { "cmd": "cd frontend && npm run lint", "required": true },
        "test": { "cmd": "cd frontend && npm run test:unit", "required": false }
      }
    }
  },
  "integration": {
    "compose_file": "docker-compose.dev.yml",
    "steps": {
      "up": { "cmd": "docker compose -f docker-compose.dev.yml up -d --wait", "required": true },
      "test": { "cmd": "cd frontend && npx playwright test", "required": true },
      "down": { "cmd": "docker compose -f docker-compose.dev.yml down", "always_run": true }
    }
  }
}
```

Key behaviors:
- **Path-based auto-detection**: `git diff --name-only` determines which components changed, and only those components are verified
- **`required` vs optional**: flaky or service-dependent tests can be optional
- **Integration section**: spins up full stack via compose, runs Playwright, tears down — even on failure (`always_run`)
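
The path-based detection above could be sketched roughly as follows. This is a minimal illustration, not the daplug implementation: the function names `changed_files` and `affected_components` are hypothetical, matching is a plain prefix test against each component's `paths`, and diffing against `HEAD` is an assumption (a real loop might diff against the worktree's merge base instead).

```python
import subprocess


def changed_files(base: str = "HEAD") -> list[str]:
    """Return paths reported by `git diff --name-only <base>`.

    ASSUMPTION: `base` defaults to HEAD; the right base for a worktree loop
    is a design decision this sketch does not settle.
    """
    out = subprocess.run(
        ["git", "diff", "--name-only", base],
        check=True, capture_output=True, text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def affected_components(config: dict, files: list[str]) -> list[str]:
    """Names of components with at least one changed file under their `paths`."""
    hits = []
    for name, comp in config.get("components", {}).items():
        prefixes = tuple(comp.get("paths", []))
        # str.startswith accepts a tuple of prefixes
        if prefixes and any(f.startswith(prefixes) for f in files):
            hits.append(name)
    return hits
```

With the example config, a diff that touches only `backend/main.go` and `README.md` would select the `backend` component and skip `frontend`.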

### Layer 2: Executor Abstraction (Phase 2)

Three executor backends, selectable via `--executor`:

| Executor | Isolation | Use Case |
|----------|-----------|----------|
| `host` | None (current behavior) | Trusted repos, simple projects |
| `sandbox` | Bubblewrap (see #9) | Lightweight Linux sandboxing |
| `devcontainer` | Full container + optional firewall | Full-stack verification, stronger isolation (trusted repos only; see Security Model) |
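
One possible shape for the executor interface is sketched below, only to make the table concrete. The names `Executor`, `Result`, `HostExecutor`, `EXECUTORS`, and `get_executor` are all hypothetical; only the `host` backend is filled in, and the other two would register in the same mapping.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Result:
    returncode: int
    output: str


class Executor(Protocol):
    """Anything that can run a command for a subagent (hypothetical interface)."""

    def run(self, argv: list[str], cwd: Optional[str] = None) -> Result: ...


class HostExecutor:
    """No isolation: runs the command directly on the host (current behavior)."""

    def run(self, argv: list[str], cwd: Optional[str] = None) -> Result:
        proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
        return Result(proc.returncode, proc.stdout + proc.stderr)


# "sandbox" and "devcontainer" backends would be added to this mapping.
EXECUTORS = {"host": HostExecutor}


def get_executor(name: str) -> Executor:
    """Resolve the value of a `--executor` flag to an executor instance."""
    try:
        return EXECUTORS[name]()
    except KeyError:
        raise ValueError(
            f"unknown executor {name!r}; choose from {sorted(EXECUTORS)}"
        ) from None


if __name__ == "__main__":
    print(get_executor("host").run([sys.executable, "--version"]).output)
```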

The **devcontainer executor** uses the [Dev Container CLI](https://github.com/devcontainers/cli) which is scriptable and doesn't require VS Code. It supports both simple container setups and Docker Compose scenarios.

The `--loop` flow with devcontainer becomes:
1. Create worktree
2. `devcontainer up --workspace-folder <worktree>`
3. Execute subagent inside the container (via `devcontainer exec`)
4. The subagent has the full stack available and can build, serve, and run Playwright tests against it
5. Tear down container on completion
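
Steps 2, 3, and 5 of this flow might look like the sketch below. `devcontainer up` and `devcontainer exec` with `--workspace-folder` are Dev Container CLI commands; the teardown is an assumption, because this sketch removes containers through `docker` by filtering on the `devcontainer.local_folder` label that the CLI is assumed to attach. Compose-based devcontainers would likely need `docker compose down` instead. The function names and the injectable `run` parameter are hypothetical.

```python
import subprocess


def up_argv(worktree: str) -> list[str]:
    return ["devcontainer", "up", "--workspace-folder", worktree]


def exec_argv(worktree: str, command: list[str]) -> list[str]:
    return ["devcontainer", "exec", "--workspace-folder", worktree, *command]


def list_argv(worktree: str) -> list[str]:
    # ASSUMPTION: containers created by `devcontainer up` carry this label.
    return ["docker", "ps", "-aq", "--filter",
            f"label=devcontainer.local_folder={worktree}"]


def run_in_devcontainer(worktree: str, command: list[str], run=subprocess.run) -> int:
    """Bring the container up, run `command` inside it, and always tear down.

    Returns the exit code of `command`. `run` is injectable so the flow can
    be exercised without Docker.
    """
    try:
        run(up_argv(worktree), check=True)
        return run(exec_argv(worktree, command)).returncode
    finally:
        # Teardown runs even when `up` or the command fails.
        ids = run(list_argv(worktree), capture_output=True, text=True).stdout.split()
        if ids:
            run(["docker", "rm", "-f", *ids])
```

Creating the worktree (step 1) and starting the subagent itself are left out; `command` stands in for whatever the runner adapter launches.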

#### Security Model

Modeled after [Anthropic's reference devcontainer](https://docs.anthropic.com/en/docs/claude-code/tutorials#devcontainer-reference-implementation):

- Run as non-root user, bind-mount workspace only
- Persist shell history and agent state in named volumes
- Add only `NET_ADMIN` and `NET_RAW` capabilities
- Init firewall with default-drop policy, allowlist required domains
- No Docker socket exposure to the agent
- Rootless Docker where feasible

**Important caveat**: Anthropic explicitly warns that even their hardened devcontainer does not stop a malicious repo from exfiltrating anything reachable in the container. This approach is recommended **only for trusted repositories**. "Inside Docker" is not the entire threat model — Docker's daemon runs as root unless using Rootless mode, and bind mounts are writable by default.
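
Some of the rules above can be checked mechanically before a container is started. The sketch below is a hypothetical pre-flight audit of a parsed `devcontainer.json`; it covers only the non-root user, capability, privileged-mode, and Docker socket rules, and it is not a substitute for the firewall or for reviewing the image. The function name `audit_devcontainer` is an assumption.

```python
ALLOWED_CAPS = {"NET_ADMIN", "NET_RAW"}


def audit_devcontainer(config: dict) -> list[str]:
    """Return a list of policy violations found in a devcontainer.json dict."""
    problems = []
    run_args = config.get("runArgs", [])

    user = config.get("remoteUser") or config.get("containerUser")
    if user in (None, "root"):
        problems.append("container should run as a non-root user")

    caps = set(config.get("capAdd", []))
    for arg in run_args:
        # Only the `--cap-add=NAME` spelling is recognized in this sketch.
        if arg.startswith("--cap-add="):
            caps.add(arg.split("=", 1)[1])
    extra = caps - ALLOWED_CAPS
    if extra:
        problems.append(f"unexpected capabilities: {sorted(extra)}")

    if config.get("privileged") or "--privileged" in run_args:
        problems.append("privileged mode is not allowed")

    # Mounts may be strings or objects; str() lets one substring test cover both.
    if any("docker.sock" in str(m) for m in config.get("mounts", [])):
        problems.append("Docker socket must not be mounted")

    return problems
```

An empty list means none of the checked rules were violated; it says nothing about what the container can reach over the network.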

### Layer 3: Agent Runners (Phase 2)

Claude Code and OpenCode plug into the same executor model as **runner adapters**, rather than being special-cased in `--loop`.

#### Claude Code Runner

| Concern | Approach |
|---------|----------|
| **Required mount** | Repo/worktree. Claude reads project instructions, settings, skills, subagents from project tree and `~/.claude` |
| **Persistent state** | `~/.claude/` for settings/skills/subagents; optionally `~/.claude.json` for OAuth, MCP config, trust state |
| **Stateless mode** | `--bare` + `ANTHROPIC_API_KEY`. Skips OAuth/keychain reads; recommended for scripted/SDK calls |
| **Write restrictions** | Even in `bypassPermissions`, writes to `.git`, `.claude`, `.vscode`, `.idea` still prompt (except `.claude/commands`, `.claude/agents`, `.claude/skills`). Unattended loops must avoid protected targets |
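
An unattended loop could screen planned writes against the protected targets in the last row before starting. The helper below is a hypothetical sketch of that rule as stated in the table. Treating a protected directory name at any depth as protected is an assumption, not documented Claude Code behavior.

```python
from pathlib import PurePosixPath

PROTECTED_DIRS = {".git", ".claude", ".vscode", ".idea"}
EXEMPT = {(".claude", "commands"), (".claude", "agents"), (".claude", "skills")}


def is_protected(rel_path: str) -> bool:
    """True if a write to `rel_path` (relative to the repo root) would still prompt.

    ASSUMPTION: a protected directory name anywhere in the path counts,
    not only at the repo root.
    """
    parts = PurePosixPath(rel_path).parts
    for i, part in enumerate(parts):
        if part in PROTECTED_DIRS:
            # Exempt subdirectories are identified by the (dir, child) pair.
            return tuple(parts[i:i + 2]) not in EXEMPT
    return False
```

For example, `.claude/settings.json` is flagged while `.claude/skills/review/SKILL.md` and `.gitignore` are not.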

#### OpenCode Runner

| Concern | Approach |
|---------|----------|
| **Required mount** | Repo/worktree + config + credentials |
| **Entrypoints** | `opencode run` (programmatic), `opencode serve` (headless), `opencode run --attach` (reuse backend) |
| **Config paths** | Global: `~/.config/opencode/opencode.json`; Project: `opencode.json` at repo root; Override: `OPENCODE_CONFIG` env var |
| **Credentials** | `~/.local/share/opencode/auth.json` via `opencode auth login`, or env vars, or project `.env` |
| **Shared paths** | OpenCode discovers `AGENTS.md`, `CLAUDE.md`, `.opencode/skills`, `.claude/skills` and their global equivalents — enables shared repo guidance across agents |
| **Permissions** | Config-driven via `permission` block. For unattended runs, set project-local permission policy explicitly |
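
To decide which files to mount into the container, a runner adapter could enumerate the config locations from the table. The sketch below only lists candidate paths; it makes no claim about how OpenCode merges or prioritizes them, and the function name and labels are hypothetical.

```python
import os
from pathlib import Path


def opencode_config_candidates(worktree, home=None, env=None):
    """List (label, path) pairs for the config locations named in the table.

    The returned order carries no precedence meaning. Callers would typically
    keep only the paths that exist and bind-mount those.
    """
    env = os.environ if env is None else env
    home = Path(home) if home else Path.home()
    candidates = [
        ("global", home / ".config" / "opencode" / "opencode.json"),
        ("project", Path(worktree) / "opencode.json"),
    ]
    override = env.get("OPENCODE_CONFIG")
    if override:
        candidates.append(("override", Path(override)))
    return candidates
```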

---

## Benefits

- Subagents catch real bugs, not just compile errors
- No more "it built but the page is broken" merges from worktree prompts
- Projects define their own verification — daplug doesn't need to know app-specific details
- Devcontainer approach gives full isolation per worktree (no port conflicts inside the containers between parallel runs; ports published to the host still need an allocation strategy)
- Architecture supports multiple agents without special-casing each one

## Implementation Plan

### Phase 1: Verification Config + Integration Lifecycle
- [ ] Define `.claude/verify.json` schema
- [ ] Add config discovery to `executor.py` (read verify.json, fall back to the current `make build && make test && make lint` checks)
- [ ] Implement path-based component detection via `git diff --name-only`
- [ ] Implement integration lifecycle (up → wait → test → down with `always_run` teardown)
- [ ] Add `required` vs optional step handling
- [ ] Create verify.json for youtube_summaries as reference implementation
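
The `required` and `always_run` handling in this phase could work along the lines of the sketch below. It is one possible reading of the semantics, under stated assumptions: a failed required step stops the remaining steps, a failed optional step only warns, and `always_run` steps are deferred to the end and executed in a `finally` block. The names `run_steps` and `_shell` are hypothetical.

```python
import subprocess


def _shell(cmd: str) -> int:
    """Run a command string through the shell and return its exit code."""
    return subprocess.run(cmd, shell=True).returncode


def run_steps(steps: dict, run=_shell) -> bool:
    """Run steps in order; return True when every required step passed.

    ASSUMPTION: steps marked `always_run` are cleanup steps and run last,
    regardless of where they appear in the mapping.
    """
    ok = True
    cleanup = [(n, s) for n, s in steps.items() if s.get("always_run")]
    try:
        for name, step in steps.items():
            if step.get("always_run"):
                continue
            if run(step["cmd"]) != 0:
                if step.get("required", False):
                    ok = False
                    break  # skip remaining steps, then fall through to cleanup
                print(f"warning: optional step {name!r} failed")
    finally:
        # Teardown executes on failure and on unexpected exceptions alike.
        for name, step in cleanup:
            if run(step["cmd"]) != 0 and step.get("required", False):
                ok = False
    return ok
```

Applied to the `integration.steps` block of the example config, a failing `test` step would still be followed by `down`.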

### Phase 2: Executor Abstraction + Agent Runners
- [ ] Define executor interface (host, sandbox, devcontainer)
- [ ] Add `--executor` flag to run-prompt
- [ ] Implement devcontainer executor using Dev Container CLI
- [ ] Implement Claude Code runner adapter (mount profiles, `--bare` mode)
- [ ] Implement OpenCode runner adapter (config paths, `opencode run`)
- [ ] Pre-built image support and warm reuse (startup cost is the main product risk)
- [ ] Port allocation strategy for parallel worktree devcontainers
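
For the port allocation item, one candidate approach is to ask the OS for free ports by binding to port 0. The sketch below is illustrative only: the function name `allocate_ports` is hypothetical, and there is an inherent race because another process could claim a port between this call returning and the container publishing it.

```python
import socket


def allocate_ports(services: list[str], host: str = "127.0.0.1") -> dict[str, int]:
    """Pick one currently free host port per service name.

    All sockets stay open until every port is chosen so the OS cannot hand
    out the same port twice within a single call.
    """
    socks = []
    try:
        ports = {}
        for name in services:
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            socks.append(s)
            s.bind((host, 0))  # port 0 asks the OS for any free port
            ports[name] = s.getsockname()[1]
        return ports
    finally:
        for s in socks:
            s.close()
```

The resulting mapping could be passed to each worktree's container as environment variables or port overrides; how that wiring works is left open here.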

## Open Questions

- Should the verify config be JSON, YAML, or just a Makefile convention (`make verify`)?
- Can devcontainer startup be fast enough for iterative loops? Pre-built images and warm reuse should be part of the design from day one.
- How should projects be handled when they need external services (APIs, GPUs, etc.) that can't run in a container?
- What port allocation strategy should be used when running multiple worktree devcontainers in parallel?
- Does this overlap with or complement `--sandbox` (#9)? Current thinking: sandbox = lightweight isolation, devcontainer = full-stack isolation. Complementary.

## Context

This came up while working on [youtube_summaries](https://github.com/cruzanstx/youtube_summaries), which has three components (Go backend, Go processor, SvelteKit frontend), each with its own Dockerfile and Makefile, but no unified dev compose file or devcontainer.

### References

- [Anthropic reference devcontainer](https://docs.anthropic.com/en/docs/claude-code/tutorials#devcontainer-reference-implementation)
- [Dev Container CLI](https://github.com/devcontainers/cli)
- [OpenCode documentation](https://opencode.ai)
- Issue #9 (bubblewrap sandbox)
