Plan-then-build AI coding for Claude Code & Codex CLI.
/map-plan decomposes your task into small, reviewable subtasks you approve before any code is written; /map-efficient implements the approved plan, subtask by subtask. The AI still writes the code; you keep the architecture, scope, and review.
Most AI agents rush straight to code before they understand the task — so you get fast wrong answers and silent rework. MAP inserts a plan you approve first, then implements it in reviewable steps:
idea -> prompt -> code -> hope # without MAP
SPEC -> PLAN -> TEST -> CODE -> REVIEW -> LEARN # with MAP
You drive the whole loop with two core commands plus three gates:
/map-plan "add rate limiting to the public API" # 1. PLAN — decompose the task; you approve before any code
/map-efficient # 2. BUILD — implement the approved plan, subtask by subtask
/map-check # 3. CHECK — quality gates against the plan
/map-review # 4. REVIEW — semantic review vs spec, tests, and diff
/map-learn # 5. LEARN — save the gotchas for next session
- Start with /map-plan for anything non-trivial: it clarifies behavior and splits the work into contract-sized subtasks.
- Already scoped? Go straight to /map-efficient.
- Tiny edit? /map-plan off-ramps you to a direct edit or /map-fast instead of forcing full planning.
Codex CLI users invoke the same skills with a $ prefix: $map-plan, $map-efficient, $map-check. See the Usage Guide.
1. Install
uv tool install mapify-cli
# or with pip
pip install mapify-cli
2. Initialize your project
Claude Code is the default provider:
cd your-project
mapify init
claude
Codex CLI is also supported:
cd your-project
mapify init . --provider codex
codex
Then enable the Codex hook manually: run /hooks, select PreToolUse, press t to toggle it on, then press Esc. If your Codex version does not support the hooks feature key yet, start it with codex --enable codex_hooks or upgrade Codex first (upgrading is recommended).
3. Run the loop
/map-plan define the behavior and split the task
/map-efficient implement the approved plan
/map-check
/map-review
/map-learn
That's the whole golden path. Everything below explains why it works and when to reach for it.
Ad-hoc prompting feels fast on simple tasks. On complex systems it creates a different problem: code appears quickly, but the engineering process disappears.
Common failure modes:
- AI silently makes architecture decisions you did not approve.
- One prompt produces a large diff that is hard to review.
- Tests are written around the generated implementation, including its mistakes.
- The output compiles, but you cannot explain why the design is correct.
- The next session forgets the gotchas you already paid to discover.
MAP moves engineering judgment earlier: write down the behavior, split the work into small contracts, verify each stage, review against the spec, and save lessons for the next run.
| Good fits | Poor fits |
|---|---|
| Complex backend features | Typos and tiny edits |
| Kubernetes controllers and operators | Small one-off scripts |
| Internal platform tooling | Product ideas where the desired behavior is still unknown |
| API, CRD, or domain-model changes with invariants | Broad rewrites without clear boundaries |
| Refactoring with a meaningful test harness | Tasks cheaper to do directly than to plan |
After a good first workflow, you should see:
- a written plan or spec before implementation starts;
- small implementation contracts instead of one giant AI diff;
- verification and review artifacts under .map/<branch>/;
- review comments focused on correctness and semantics, not formatting noise;
- /map-learn preserving project rules, gotchas, and handoffs for future sessions.
MAP review is useful, but it is not a replacement for engineering judgment. Serious changes still need human review. The goal is to make that review smaller, earlier, and better grounded.
The DevOpsConf 2026 case study applies this process to a production Kubernetes Project Operator, not a toy CRUD app:
- human estimate: 90 days;
- MAP-style delivery: 7 days;
- workflow: SPEC -> PLAN -> TEST -> CODE -> REVIEW -> LEARN;
- small reviewable PRs instead of one giant generated diff;
- tests before implementation for critical pieces;
- semantic bugs caught in review before merge.
| Command | Use For |
|---|---|
| /map-plan | Start here for non-trivial work; clarify behavior and decompose tasks |
| /map-efficient | Implement an approved plan or already-scoped task |
| /map-fast | Small, low-risk changes where full planning would be overhead |
| /map-check | Quality gates, verification, and artifact checks |
| /map-review | Pre-commit semantic review against the plan, tests, and diff |
| /map-learn | Capture project memory and reusable lessons |
| /map-debug | Bug fixes and debugging |
| /map-task | Execute a single subtask from an existing plan |
| /map-tdd | Test-first implementation workflow |
| /map-release | Package release workflow |
| /map-resume | Resume interrupted workflows |
Canonical MAP flows:
- Standard: /map-plan -> /map-efficient -> /map-check -> /map-review -> /map-learn
- Full TDD: /map-plan -> /map-tdd -> /map-check -> /map-review -> /map-learn
- Targeted subtask TDD: /map-plan -> /map-tdd ST-001 -> /map-task ST-001 -> ... -> /map-check -> /map-review -> /map-learn
These flows maintain branch-scoped artifacts under .map/<branch>/ — blueprint.json (subtask size/concern contracts), code-review-001.md, verification-summary.md, pr-draft.md, run dossiers, and more — so research, review lineage, and verification survive context resets.
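As a rough illustration of how a script might inspect these artifacts, the sketch below reports which of the files named above exist for one branch. The helper `list_map_artifacts` is hypothetical and not part of mapify. It also assumes the directory under .map/ is named exactly after the branch, which may not hold for branch names that contain slashes.

```python
from pathlib import Path

# Artifact names taken from the paragraph above; the real set is larger.
KNOWN_ARTIFACTS = (
    "blueprint.json",
    "code-review-001.md",
    "verification-summary.md",
    "pr-draft.md",
)


def list_map_artifacts(repo_root: str, branch: str) -> dict[str, bool]:
    """Return {artifact name: whether the file exists} for one branch.

    Hypothetical helper: assumes artifacts live at <repo_root>/.map/<branch>/.
    """
    branch_dir = Path(repo_root) / ".map" / branch
    return {name: (branch_dir / name).is_file() for name in KNOWN_ARTIFACTS}
```

Calling `list_map_artifacts(".", "my-branch")` after a workflow run would show at a glance whether the plan, review, and verification files survived to the end of the session.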
- Daily-driver speed: optimized for repeated use, not occasional demo workflows. Structured enough to prevent chaos, lightweight enough to keep token and time cost under control.
- Reviewable diffs: /map-plan and /map-efficient require per-subtask size, concern, and constraint metadata, then validate blueprint.json before implementation, so oversized or mixed-concern plans fail early instead of surprising reviewers later.
- Gates that check the plan, not vibes: /map-check and /map-review validate against the spec, tests, and diff instead of asking whether code "looks fine".
- Clean-room review: /map-review auto-bundles spec, plan, tests, verification, and coverage evidence into a single durable input (.map/<branch>/review-bundle.json); --detached opens a read-only worktree for inspection without touching your branch.
- Project memory: /map-learn turns hard-won fixes and gotchas into reusable context, so the next session doesn't relearn them.
More under the hood (calibrated effort, mutation boundaries, token budgets, retry quarantine, run-health diagnostics, skill IR audit)
- Calibrated workflow effort: each shipped slash skill declares a thinking_policy and parallel_tool_policy, so lightweight commands stay direct while planning, review, and release workflows reserve deeper reasoning and parallel fan-out for the stages that benefit. Non-release prompts use targeted guardrails instead of blanket all-caps prohibition blocks, reducing over-triggered agents and tool calls while keeping true hard stops explicit.
- Mutation boundary constraints: write-capable Claude and Codex surfaces tell agents not to edit unrelated files, add or upgrade dependencies, or refactor neighboring code unless the current subtask requires it. Broader scope is reported as a blocker or tradeoff instead of silently expanding the diff.
- Context-first prompt envelopes: high-context /map-plan, /map-efficient, /map-debug, and /map-review prompts wrap branch artifacts in XML-style <documents>, then state the <task> and <expected_output>, so specs, diffs, logs, and schemas stay separated for the model.
- Contract-sized subtasks: blueprints require expected_diff_size, concern_type, one_logical_step, hard_constraints, soft_constraints, and coverage_map. Hard constraints must be owned in coverage_map and cited in the owning subtask; soft constraints can be traded off only with explicit tradeoff_rationale.
- Token budget report: Actor and review prompt builders append active-path budget decisions to .map/<branch>/token_budget.json (before/after estimated tokens, clipped sections, source artifacts). It serves as the operator's breadcrumb for diagnosing missing context.
- Clean retry quarantine: after repeated Monitor rejection, write-capable workflows switch the next attempt into clean-retry mode using .map/<branch>/retry_quarantine.json (constraints, required evidence, do-not-repeat feedback) instead of raw failed-session context.
- Run health report: workflows write .map/<branch>/run_health_report.json during closeout, recording terminal status, step progress, retry counters, artifact presence, and hook-injection status. CI can fail inconsistent closeouts with python3 .map/scripts/map_step_runner.py validate_run_health_report.
- Compact recovery surface: /map-resume keeps the active recovery flow short and moves low-frequency notes to resume-reference.md, so recovery after /clear or context exhaustion gives the next checkpoint action without loading the whole appendix.
- Skill IR audit: release checks lower shipped Claude and Codex SKILL.md files into a typed SkillIR, verify content hashes, catch unsupported frontmatter, reject missing supporting-file links, and block injection-like instructions before mapify init copies surfaces into user repos.
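To make the contract-sized subtasks rule above more concrete, here is a minimal sketch of the ownership check it describes. The real blueprint.json schema is not shown in this README, so the shape used here is an assumption for illustration: top-level `hard_constraints`, a `coverage_map` from constraint id to subtask id, and a `subtasks` list. The function `check_blueprint` and the ids `HC-1` and `ST-001` are hypothetical, and MAP's own validator may work differently.

```python
REQUIRED_SUBTASK_FIELDS = ("expected_diff_size", "concern_type", "one_logical_step")


def check_blueprint(blueprint: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found.

    ASSUMED shape (not the real schema):
      {"hard_constraints": ["HC-1"],
       "coverage_map": {"HC-1": "ST-001"},
       "subtasks": [{"id": "ST-001", "hard_constraints": ["HC-1"], ...}]}
    """
    problems = []
    subtasks = {st["id"]: st for st in blueprint.get("subtasks", [])}

    # Every subtask must carry its size/concern contract fields.
    for st_id, st in subtasks.items():
        for field in REQUIRED_SUBTASK_FIELDS:
            if field not in st:
                problems.append(f"{st_id}: missing {field}")

    # Every hard constraint needs an owner, and the owner must cite it.
    coverage = blueprint.get("coverage_map", {})
    for constraint in blueprint.get("hard_constraints", []):
        owner = coverage.get(constraint)
        if owner not in subtasks:
            problems.append(f"{constraint}: no owning subtask in coverage_map")
        elif constraint not in subtasks[owner].get("hard_constraints", []):
            problems.append(f"{constraint}: not cited in owning subtask {owner}")
    return problems
```

The sketch collects a list of problems instead of raising on the first one, so a plan author would see every gap in a single pass before implementation starts.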
MAP orchestrates specialized roles through slash commands and skills:
TaskDecomposer -> breaks goals into subtasks
Actor -> implements scoped tasks
Monitor -> validates quality and blocks invalid output
Predictor -> analyzes impact for risky changes
Learner -> captures reusable project memory
For Claude Code, MAP slash surfaces live in .claude/skills/map-*/SKILL.md files created by mapify init. For Codex CLI, mapify init . --provider codex creates .agents/skills/, .codex/agents/, .codex/config.toml, hooks, and shared .map/scripts/.
MAP is inspired by the MAP cognitive architecture (Nature Communications, 2025), which reported a 74% improvement on planning tasks. The CLI turns that idea into a practical software-development workflow.
Context-compression policy, Stack Overflow for Agents (SOFA), and other init flags
Context-compression policy (controls the /compact nudge; default never — opt-in):
mapify init . --compression never # default — no nudge
mapify init . --compression auto # nudge at threshold
mapify init . --compression aggressive # nudge at 0.4 x threshold
mapify init . --compression-threshold 250000 # Opus 1M / 50+ subtask plans
Actor and reviewer prompts always carry the full bundled context; context-block truncation was removed. If the conversation grows beyond your model's window, opt into /compact via --compression auto or trigger it manually. See docs/USAGE.md#context-budget-policy.
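The three policy values above can be read as a small lookup from policy to the point where the /compact nudge would fire. The sketch below only restates that arithmetic. The function `nudge_point` is hypothetical and is not mapify's implementation. Treating the threshold as a token count and rounding the 0.4 factor down are both assumptions.

```python
from typing import Optional


def nudge_point(policy: str, threshold: int) -> Optional[int]:
    """Return the size at which a /compact nudge would fire, or None for no nudge.

    Hypothetical helper mirroring the flag comments above.
    """
    if policy == "never":
        return None  # default: no nudge
    if policy == "auto":
        return threshold  # nudge at threshold
    if policy == "aggressive":
        # 0.4 x threshold, in integer arithmetic (rounding down is an assumption)
        return (threshold * 4) // 10
    raise ValueError(f"unknown compression policy: {policy!r}")
```

With the 250000 threshold from the example flag, `nudge_point("aggressive", 250000)` gives 100000, so the aggressive policy would nudge well before the auto policy does.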
Stack Overflow for Agents (SOFA) read-only prior-art search — off by default, no network or credentials unless you enable it:
mapify init . --sofa # opt-in: enable the map-so-search skill
This writes sofa.enabled: true to .map/config.yaml and adds .sofa/ to your .gitignore. Without the flag, no SOFA code path runs. See the SOFA usage guide.
| Guide | Description |
|---|---|
| Installation | All install methods, PATH setup, troubleshooting |
| Usage Guide | Workflows, examples, cost optimization, playbook |
| Architecture | Agents, MCP integration, customization |
| Platform Spec | Platform refactor roadmap, codebase analysis |
- Command not found -> Run mapify init in your project first.
- Agent errors -> Check .claude/agents/ has all shipped agent .md files, or run mapify doctor.
- Poor output on a complex task -> Start with /map-plan and feed /map-efficient the approved plan instead of asking it to infer the architecture.
- More help ->
Improvements welcome: prompts for specific languages, new agents, provider integrations, and CI/CD workflow support.
MIT
Start with /map-plan. Keep the model inside your engineering process, not the other way around.
