Claude Forge

A Claude Code plugin that reads your project and creates a team of specialized agents for it.

It looks at the codebase, studies how PR reviews work in the project, and generates agents that understand the project's actual conventions, not generic ones.

What It Does

You get a small team of agents, each focused on one job, built from the project's real code and review patterns.

graph TD
    subgraph input [" What you provide "]
        A["Your task description"]
        B["Project codebase"]
        C["PR review access"]
    end

    subgraph forge [" Forge analyzes everything "]
        D["Code patterns\nand conventions"]
        E["Reviewer comments\nand priorities"]
        F["Test patterns\nand quality bar"]
    end

    subgraph agents [" Generated agent team "]
        G["Researcher"]
        H["Builder"]
        I["Reviewer"]
        J["Challenger"]
        K["Submitter"]
    end

    subgraph pipeline [" Orchestrated pipeline "]
        L["Research"] --> M["Build"] --> N["Review\nand fix"] --> O["Challenge\nand fix"] --> P["Submit"]
    end

    A --> D
    B --> D
    B --> F
    C --> E

    D --> G & H
    E --> J
    F --> I

    G -.-> L
    H -.-> M
    I -.-> N
    J -.-> O
    K -.-> P

    style input fill:#f0f4f8,stroke:#c9d6e3,color:#2d3748
    style forge fill:#ebf5ff,stroke:#90cdf4,color:#2b6cb0
    style agents fill:#f0fff4,stroke:#9ae6b4,color:#276749
    style pipeline fill:#fffbeb,stroke:#fbd38d,color:#975a16

Setup

Prerequisites

Make sure you have these installed:

Tool	Install	Purpose
Claude Code	See docs	Runs the plugin and agents
GitHub CLI	`brew install gh`	PR review analysis and submission

After installing the GitHub CLI, authenticate:

gh auth login

Installation

This plugin is not yet on the official Claude Code marketplace. For now, install using one of the options below.

Option A: As a plugin (recommended)

git clone https://github.com/pablocaeg/claude-army.git
claude plugin install ./claude-army --scope user

Option B: Standalone agent (no plugin system needed)

git clone https://github.com/pablocaeg/claude-army.git
cp claude-army/forge.md ~/.claude/agents/

Verify

Open Claude Code in any project and run:

/claude-army:analyze

If it responds, you're set. If using Option B, use @forge instead.

Usage

Step 1: Analyze the Project

/claude-army:analyze

Reads the codebase and PR reviews. Extracts the lead reviewer's actual comments and ranks them by how often they come up. Saves the analysis to .context/forge-analysis.md.

You can optionally prepare a .context/ folder with additional information:

your-project/
├── .context/
│   ├── task.md              # What you want to accomplish
│   ├── research/            # Background docs, specs, references
│   └── examples/            # Examples of desired output
└── src/

Step 2: Create the Agent Team

/claude-army:create-team

Uses the analysis to generate agents in ~/.claude/agents/. They live outside the plugin so they can spawn each other during orchestration.

Step 3: Run the Pipeline

@[project]-orchestrator [describe your task]

Runs each agent in order with human checkpoints at key decisions.

Step 4: Pre-check Before Submitting (optional)

/claude-army:challenge

Simulates the lead reviewer's feedback on your current changes before you submit.

What Gets Created

Every team is different. The forge reads the actual codebase and builds agents around its patterns.

Agent	What it does	What makes it project-specific
Researcher	Gathers and verifies information	Knows what the builder needs, verifies against project requirements
Builder	Writes code	Uses templates from real project files, follows naming and structure
Reviewer	Checks against standards	Uses the project's linter config, test conventions, PR checklist
Challenger	Simulates the lead reviewer	Built from their actual comments, ranked by frequency
Submitter	Creates PRs	Matches the format from the project's best accepted PRs
Orchestrator	Runs the pipeline	Chains agents with test runs between phases and human approval gates
Expert	Answers codebase questions	Architecture map and entry points from the actual project
Test Writer	Writes tests	Uses the project's assertion library, fixture patterns, coverage target

Interesting Parts

Reviewer Modeling

One thing I found useful while building this: reading through a project's PR reviews teaches you more than reading the source code. The forge captures that by extracting the lead reviewer's actual comments and building an agent that raises the same concerns before you submit.

graph LR
    subgraph collect [" Collect "]
        A["Read 20 to 40+\nPR reviews"]
    end

    subgraph extract [" Extract "]
        B["Lead reviewer\ncomments"]
    end

    subgraph rank [" Rank by frequency "]
        C["Tier 1: raised on\n50%+ of PRs"]
        D["Tier 2: raised on\n25 to 50%"]
        E["Tier 3: raised on\n10 to 25%"]
    end

    subgraph result [" Challenger agent "]
        F["Simulates review\nwith real quotes"]
        G["Routes each issue\nto the right agent"]
    end

    A --> B --> C & D & E --> F --> G

    style collect fill:#f0f4f8,stroke:#c9d6e3,color:#2d3748
    style extract fill:#ebf5ff,stroke:#90cdf4,color:#2b6cb0
    style rank fill:#fffbeb,stroke:#fbd38d,color:#975a16
    style result fill:#f0fff4,stroke:#9ae6b4,color:#276749

The idea is to catch likely feedback before submitting, which can save a few review rounds.

Self-Verifying Research

I kept running into a problem where the research agent would confidently return wrong data, and that wrong data would end up in the code. So the research agent now verifies its own findings before passing them on.

What it verifies	How
Algorithms	Computes step by step against known valid data
Sources	Fetches every URL to confirm it loads and has the right content
Facts	Cross-references from 2+ independent official sources
Status	Tags every finding: verified, partial, or unverified

Orchestrated Execution

Agents run in a managed pipeline, not independently. Each phase depends on the previous one. The orchestrator runs tests between phases and pauses for human approval at key moments.

sequenceDiagram
    actor You
    participant O as Orchestrator
    participant R as Researcher
    participant B as Builder
    participant T as Test Writer
    participant V as Reviewer
    participant C as Challenger
    participant S as Submitter

    rect rgb(240, 244, 248)
        Note over O,R: Phase 1: Research
        O->>R: Gather and verify information
        R->>R: Cross-reference sources
        R->>R: Verify URLs
        R->>R: Prove algorithms step by step
        R-->>O: Findings with verification status
    end

    O-->>You: Research complete. Confirm?
    You->>O: Confirmed

    rect rgb(240, 248, 240)
        Note over O,T: Phase 2: Build and Test
        O->>B: Create code from findings
        B-->>O: Code ready
        O->>T: Write tests for new code
        T-->>O: Tests written
        O->>O: Run test suite
    end

    rect rgb(248, 245, 240)
        Note over O,V: Phase 3: Review and Fix
        O->>V: Check all changes against standards
        V-->>O: 2 blockers, 1 warning
        O->>B: Fix blocker in validation
        B-->>O: Fixed
        O->>T: Add missing test case
        T-->>O: Test added
        O->>O: Re-run tests
    end

    rect rgb(248, 240, 240)
        Note over O,C: Phase 4: Challenge and Fix
        O->>C: Simulate lead reviewer
        C-->>O: 3 questions with fix routing
        C-->>O: Q1 needs research, Q2 needs code fix
        O->>R: Verify Q1 claim
        R-->>O: Confirmed, no change needed
        O->>B: Fix Q2
        B-->>O: Fixed
        O->>O: Re-run tests
    end

    O-->>You: All checks pass. Submit?
    You->>O: Go ahead

    rect rgb(240, 240, 248)
        Note over O,S: Phase 5: Submit
        O->>S: Create PR
        S-->>You: PR URL
    end

Things I Learned Building This

Lesson	Details
Project-specific prompts matter	Generic agents produce generic output. Grounding them in the actual codebase makes a big difference.
PR reviews are the best teacher	Reading 20+ reviews taught the agents more than reading every source file.
Research needs verification	If the research agent gets something wrong, everything downstream is wrong too.
Restrict tools per agent	Read-only agents should not be able to write files. Fewer tools, fewer mistakes.
Humans should approve key decisions	The pipeline pauses after research and before submission. Full autonomy is tempting but risky.
Say what NOT to do	Telling agents what to avoid prevents mistakes better than only telling them what to do.
First version is always wrong	Test agents on real work, find what breaks, fix them. Repeat.

Project Structure

claude-army/
├── .claude-plugin/
│   └── plugin.json            # Plugin manifest
├── skills/
│   ├── analyze/
│   │   └── SKILL.md           # /claude-army:analyze
│   ├── create-team/
│   │   └── SKILL.md           # /claude-army:create-team
│   └── challenge/
│       └── SKILL.md           # /claude-army:challenge
├── templates/                 # Agent archetypes used by create-team
│   ├── researcher.md
│   ├── builder.md
│   ├── reviewer.md
│   ├── challenger.md
│   ├── submitter.md
│   ├── orchestrator.md
│   ├── expert.md
│   └── test-writer.md
├── forge.md                   # Standalone agent (alternative to plugin)
├── docs/
│   ├── context-guide.md
│   ├── methodology.md
│   └── customization.md
├── LICENSE
└── README.md

Requirements

Requirement	Purpose
Claude Code	Runs the plugin and agents
GitHub CLI (`gh`)	PR review analysis and submission
A project to work on	The forge needs a real codebase to analyze

Limitations

This is an experiment, not a production tool. Some honest caveats:

Works with Claude Code only. Other AI coding tools are not supported.
Tested thoroughly on one project so far. The methodology works, but more testing would help.
The forge produces a solid first draft of agents, but you will want to refine them after testing on real work.
Not fully autonomous. You still review, approve, and guide the pipeline.

Contributing

This is a work in progress. If you try it on a project and find ways to improve it, issues and PRs are welcome.

MIT License

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Claude Forge

What It Does

Setup

Prerequisites

Installation

Verify

Usage

Step 1: Analyze the Project

Step 2: Create the Agent Team

Step 3: Run the Pipeline

Step 4: Pre-check Before Submitting (optional)

What Gets Created

Interesting Parts

Reviewer Modeling

Self-Verifying Research

Orchestrated Execution

Things I Learned Building This

Project Structure

Requirements

Limitations

Contributing

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
.claude-plugin		.claude-plugin
docs		docs
skills		skills
templates		templates
LICENSE		LICENSE
README.md		README.md
forge.md		forge.md

Folders and files

Latest commit

History

Repository files navigation

Claude Forge

What It Does

Setup

Prerequisites

Installation

Verify

Usage

Step 1: Analyze the Project

Step 2: Create the Agent Team

Step 3: Run the Pipeline

Step 4: Pre-check Before Submitting (optional)

What Gets Created

Interesting Parts

Reviewer Modeling

Self-Verifying Research

Orchestrated Execution

Things I Learned Building This

Project Structure

Requirements

Limitations

Contributing

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages