Karpathy-style autonomous experiment loop. An agent makes a change, a metric script scores it, the loop keeps or discards via git, logs to a journal, and repeats.
# 1. Install the experiment skill
poe-code experiment install
# 2. Create an experiment doc using /poe-code-experiment-plan
# e.g. "create experiment to optimize test duration"
# 3. Run the loop
poe-code experiment run
# 4. Check results
poe-code experiment journal

An experiment doc is a markdown file with YAML frontmatter. The body is the agent's research brief.
---
agent: claude-code
metric:
  name: tests
  direction: maximize
  baseline: null
status:
  state: open
  experiment: 0
  kept: 0
---
# Make the test suite faster
Reduce test execution time without removing coverage.
Focus on parallelization and removing unnecessary setup/teardown.

Agents cycle round-robin across experiments:
agent:
- claude-code
- codex

From the CLI:
poe-code experiment run --agent claude-code,codex
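
The round-robin cycling can be pictured with a short sketch. It is an illustration only, not poe-code's implementation: the function name pickAgent is hypothetical, and it assumes experiment numbers start at 0.

```javascript
// Hypothetical helper (not part of poe-code): pick the agent for a given
// experiment number by cycling through the configured list in order.
function pickAgent(agents, experimentIndex) {
  if (agents.length === 0) {
    throw new Error("at least one agent is required");
  }
  return agents[experimentIndex % agents.length];
}

// Print which agent would handle each of the first four experiments.
const agents = ["claude-code", "codex"];
for (let i = 0; i < 4; i++) {
  console.log(`experiment ${i}: ${pickAgent(agents, i)}`);
}
```

With two agents, even-numbered experiments go to the first agent and odd-numbered ones to the second.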
To select a specific model, use agent:provider/model notation:
agent: claude-code:anthropic/claude-opus-4.7

Metric scripts decide what "better" means. Each must exit 0 on success and print a single number to stdout.
Register them as metric:* npm scripts:
{
  "scripts": {
    "metric:tests": "node scripts/metric-test-count.mjs",
    "metric:test_duration": "node scripts/metric-test-duration.mjs"
  }
}

Each metric has a direction:
- maximize: higher is better (test count, coverage)
- minimize: lower is better (duration, bundle size)
- stable: must not change (test count during optimization)
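
The metric script contract (exit code 0, a single number on stdout) can be made concrete with a small check. This is a sketch under assumptions, not how poe-code parses metric output: parseMetricOutput is a hypothetical name, and the tolerance for surrounding whitespace is a guess.

```javascript
// Hypothetical check of the metric-script contract (not poe-code's parser):
// the script must exit 0 and print exactly one number to stdout.
function parseMetricOutput(exitCode, stdout) {
  if (exitCode !== 0) {
    throw new Error(`metric script failed with exit code ${exitCode}`);
  }
  const text = stdout.trim();
  // Number("") evaluates to 0, so empty output is rejected explicitly,
  // as is anything containing more than one whitespace-separated token.
  if (text === "" || /\s/.test(text)) {
    throw new Error("expected a single number on stdout");
  }
  const value = Number(text);
  if (!Number.isFinite(value)) {
    throw new Error(`not a finite number: ${text}`);
  }
  return value;
}

// A trailing newline, as console.log would emit, is accepted.
console.log(parseMetricOutput(0, "128\n"));
```

Under this reading, a metric script should write its diagnostics to stderr and reserve stdout for the score alone.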
An experiment can track several metrics. All of them must pass, and their scores are tracked independently:
metric:
  - name: tests
    direction: maximize
  - name: test_duration
    direction: minimize

measure baseline -> loop:
  agent makes a change -> commit -> run metrics -> keep or discard -> journal -> repeat
The agent learns from past attempts through the journal — it sees what worked and what didn't.
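
The comparison behind the keep-or-discard step can be sketched per metric direction. This is one plausible reading of the maximize, minimize, and stable directions, not poe-code's actual logic: compareScore is a hypothetical name, and how the per-metric results are combined into a final keep or discard decision is not specified here.

```javascript
// Hypothetical comparison of one metric's new score against its previous
// score. Returns "better", "worse", or "same".
function compareScore(direction, previous, current) {
  switch (direction) {
    case "maximize":
      if (current > previous) return "better";
      return current < previous ? "worse" : "same";
    case "minimize":
      if (current < previous) return "better";
      return current > previous ? "worse" : "same";
    case "stable":
      // A stable metric has no notion of "better": any change counts against it.
      return current === previous ? "same" : "worse";
    default:
      throw new Error(`unknown direction: ${direction}`);
  }
}

// Example scores are made up for illustration.
console.log(compareScore("minimize", 12.4, 9.8));
console.log(compareScore("stable", 240, 238));
```

In this sketch a faster test run reads as "better" for a minimize metric, while a drop in a stable test count reads as "worse" even though the number moved only slightly.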
By default, experiment docs are discovered in .poe-code/experiments/. To use a different directory:
# Set plan directory in project config (.poe-code/config.json)
# { "experiment": { "plan_directory": "docs/experiments" } }
# Or via env
POE_EXPERIMENT_PLAN_DIRECTORY=docs/experiments poe-code experiment run
# Or point to a specific doc directly
poe-code experiment run docs/experiments/optimize-tests.md

Experiment runs can use the live terminal dashboard (TUI).
# One-off flags
poe-code experiment run --tui
poe-code experiment run --no-tui
# Config default (.poe-code/config.json)
# { "experiment": { "tui": true } }
# Env override
POE_EXPERIMENT_TUI=true poe-code experiment run

poe-code experiment run [doc] [--agent <name>] [--max-experiments <n>] [--tui|--no-tui]
poe-code experiment validate [doc]
poe-code experiment journal [doc]
poe-code experiment install