This is a record of driving the full R-P-E-R-S CLI pipeline on a real multi-file task in the RePERS repo itself, using real parallel Agent worker dispatch. Every command shown was executed verbatim; outputs are quoted.
Add a triple-quoted module docstring to each of three META scripts in
.repers/scripts/that lack one:dag_engine.py,state_report.py,snapshot_freshness.py.
Why this task: it fits the harness's stated sweet spot (≥3 files, distinct concerns per lane, genuinely independent — the router would route this to R-P-E-R-S) but is small enough to demo in one walkthrough.
$ python3 .repers/scripts/repers.py init --task e2e-add-docstrings
[+] Created task directory: /home/.../repers/repers_tasks/e2e_add_docstrings
-> Copied standard template: research.md
-> Copied standard template: plan.md
-> Copied standard template: review.md
-> Copied standard template: shipping.mdGotcha #1:
--task e2e-add-docstrings(dashes) becomes directorye2e_add_docstrings(underscores). Worth knowing if you script around it.
$ python3 .repers/scripts/repers.py preflight \
--query "add module docstring stdlib python" --refresh --json
recommendation: extend
capability_hits: 4
top 3 results: source=workspace ...For this task preflight returned "extend" with 4 workspace matches — none of them an existing docstring-addition capability. That's a true result: this is a one-off doc improvement, not a new capability to register.
The init step ships a template. You (or a planner agent) fills it in with steps in the markdown format the parser expects:
## 🛠️ Step-by-Step Execution Roadmap
1. **Step 1: Add module docstring to dag_engine.py**
* **Action**: Insert a triple-quoted module-level docstring ...
* **Target File**: .repers/scripts/dag_engine.py
* **Expected Outcome**: ast.get_docstring(...) returns non-empty.
* **Depends On**: none
* **Status**: Pending
2. **Step 2: ...** (target_file: state_report.py)
3. **Step 3: ...** (target_file: snapshot_freshness.py)
4. **Step 4: Verify all three have docstrings**
* **Verification Command**: python3 -c "... all(ast.get_docstring(...))"
* **Depends On**: 1, 2, 3
* **Status**: PendingDesign discovery:
infer_modeswitches each step's mode by whether it has averification_command. With one →local(supervisor runs it). Without one →subagent(dispatch to external worker). So steps 1-3 are deliberately written WITHOUT a verification_command (each worker does the edit), and step 4 has the verification_command (supervisor verifies all 3 after they finish).
$ python3 .repers/scripts/repers.py plan --task e2e-add-docstrings --json
steps: 4
[1] mode=subagent deps=[] target=['.repers/scripts/dag_engine.py']
[2] mode=subagent deps=[] target=['.repers/scripts/state_report.py']
[3] mode=subagent deps=[] target=['.repers/scripts/snapshot_freshness.py']
[4] mode=local deps=['1', '2', '3'] target=['repers_tasks/.../verify.txt']Cycle detection + missing-dep detection happen here; both passed.
$ python3 .repers/scripts/repers.py dispatch \
--task e2e-add-docstrings --max-workers 3 --json
ready_count: 3 / batches: 1
worker=batch1-step1 step=1 target=['.repers/scripts/dag_engine.py']
worker=batch1-step2 step=2 target=['.repers/scripts/state_report.py']
worker=batch1-step3 step=3 target=['.repers/scripts/snapshot_freshness.py']Three subagent steps, no overlapping target_files → one batch of three. The
manifest lands at repers_tasks/.../dispatch/manifest.json. This is the
contract you hand to your external agent runtime (LangChain, CrewAI, raw
Claude/Codex/Gemini SDK calls, your own dispatcher) to actually execute.
I dispatched 3 general-purpose Agent calls in a single message — 3
genuine parallel processes. Each got:
- Path to the dispatch manifest entry for its step
- The Action text + target file
- Instructions to write a
step_result_v1artifact toresults/step-N.json - A verification command to self-check before declaring
completed
All 3 completed in ~30s wall-clock each (max parallel wait ≈ 35s). Each
wrote repers_tasks/.../results/step-N.json with the contract schema.
$ python3 .repers/scripts/repers.py review \
--task e2e-add-docstrings --update-status --json
{
"schema": "repers.review.v1",
"result_count": 3,
"ok_count": 3,
"failed_count": 0,
"ok": true
}All 3 result files passed schema validation
(schema == repers.step_result.v1, required keys present,
status/returncode consistent). plan.md statuses updated to Completed
for steps 1-3.
$ # IMPORTANT: re-run `plan --json` first so plan.json picks up the
$ # newly-Completed statuses from step 7; --use-existing-plan reads
$ # plan.json verbatim.
$ python3 .repers/scripts/repers.py plan --task e2e-add-docstrings --json > /dev/null
$ python3 .repers/scripts/repers.py run --task e2e-add-docstrings \
--action local --use-existing-plan --json
completed: ['4'] | failed: []
step 4 status=completed rc=0
stdout: okGotcha #2:
--use-existing-planreads plan.json verbatim. Afterreview --update-statusrewrites plan.md statuses, you must re-runplan --jsononce before the nextrun/dispatchto refresh plan.json.
$ python3 .repers/scripts/repers.py shipping --task e2e-add-docstrings --json
{
"shipping": {
"schema": "repers.shipping.v1",
"summary": {
"ok": false,
"errors": ["doctor check is not ok"],
"warnings": [
"git working tree is dirty",
"RePERS hook is not installed for this workspace",
"optional backends unavailable: openai-agents, langgraph, mcp"
]
},
"artifacts": { "markdown": [...], "json": [...] }
}
}shipping ran and produced the evidence document. The ok: false here
reflects unrelated pre-existing items: doctor's python probe fails on
Linux systems that only have python3 on PATH (tracked in issue #1),
the working tree is intentionally dirty mid-task, and the pre-commit hook
isn't installed in this workspace by design.
$ python3 .repers/scripts/repers.py verify-install --json
ok: False | changed: [3 files with new size/sha256]Design discovery (and important UX note): in-place edits to files under
.repers/scripts/breakverify-installbecause the install manifest pins file hashes. Runrefresh-manifestto update:
$ python3 .repers/scripts/repers.py refresh-manifest --json
refresh ok: True | files: 42
$ python3 .repers/scripts/repers.py verify-install --json
ok: True | changed: []
$ python3 .repers/scripts/repers.py capabilities --action validate --json
ok: True | entries: 24🟢 GREEN — bug fix landed, no regression.
| Stage | Command | Time | Result |
|---|---|---|---|
| 1. init | init --task |
<1 s | ✅ workspace created |
| 2. preflight | preflight --query --refresh --json |
<1 s | ✅ recommendation=extend |
| 3. plan write | (human/planner fills plan.md) | varies | n/a |
| 4. plan parse | plan --json |
<1 s | ✅ 4 steps, DAG valid |
| 5. dispatch | dispatch --max-workers 3 --json |
<1 s | ✅ 3 workers, 1 batch |
| 6. workers (parallel) | 3× Agent tool calls | ~35 s wall | ✅ 3 result artifacts |
| 7. review | review --update-status --json |
<1 s | ✅ 3/3 schema-ok |
| 8. local verify | run --action local --use-existing-plan |
<2 s | ✅ rc=0, stdout=ok |
| 9. shipping | shipping --json |
<1 s | ✅ artifact, expected warnings |
| 10. regression | refresh-manifest + verify-install |
<1 s | ✅ ok=true |
Total wall time: ~45 s (dominated by parallel worker execution). CLI commands actually run: 10 distinct subcommands across the pipeline. Manual steps: just step 3 (fill in plan.md) and step 6 (have an agent runtime pick up the dispatch manifest — here, my Agent tool).
What worked first try:
init→preflight→plan→dispatch→ workers →review→run(local for the supervisor step) →shippingend-to-end produced the documented artifacts at every stage with no surprises.- The
infer_modeheuristic (verification_commandpresent → local, else → subagent) is a clever way to make plan.md serve both modes without extra syntax. It worked exactly as designed once I understood it. - The dispatch manifest schema (
repers.dispatch.v1) is exactly the contract you need to hand to any agent runtime — no hidden assumptions about which LLM picks it up. step_result_v1artifact schema is well-specified enough that a worker can produce one without reading framework source.
Gotchas / sharp edges:
- Task name normalization (
e2e-add-docstrings→e2e_add_docstrings) is silent. Worth a one-line note ininitoutput. --use-existing-planreads stale plan.json afterreview --update-statusrewrites plan.md. The fix is a single intermediateplan --jsoncall. Either the docs should make this explicit, orrunshould auto-detect that plan.md is newer than plan.json.refresh-manifestis required after any edit under.repers/scripts/orverify-installfails. This is by design (tamper detection) but not surfaced in the daily-workflow docs.shippingreportsok: falsefor normal mid-task state — dirty tree + missing hook + missing optional backends. These should arguably be warnings, not errors, until the task is officially "released".
None of these are blockers; all are documentable / fixable in v0.2.
- The R-P-E-R-S contract is end-to-end runnable today using only the shipped CLI (no extra tooling, no SDK).
- Parallel multi-agent dispatch via the manifest works with any agent
runtime that can read a JSON file and write a
step_result_v1file — here demonstrated with Claude's Agent tool; the same contract would drive Codex / Gemini CLI / a homemade dispatcher equally well. - The harness's stated separation of concerns (planner vs. dispatcher vs. workers vs. reviewer vs. supervisor) maps cleanly onto distinct CLI subcommands; each can be re-invoked independently for retries and updates.