autorize

autorize is a generic iterative-improvement harness. You point it at a project, a scoring command, and an agent CLI, and it runs the agent in sandboxed git worktrees against the score — keeping improvements, discarding regressions — until a deadline fires.

It generalizes the autoresearch pattern into a small Rust CLI you can point at any repo.

How it works

For each iteration, autorize:

1. Creates a fresh git worktree off the autorize/<name> tracking branch.
2. Builds a prompt from your program.md, the boundary rules, any operator guidance (autorize tell), and the last 10 iteration records (with a per-outcome reason and, if enabled, a model-written summary of each attempt).
3. Spawns your agent (any CLI: Claude Code, a shell script, anything) inside the worktree with a hard wall-clock budget. On timeout the whole process group gets SIGTERM, then SIGKILL after 5 s.
4. Stages the agent's changes and rejects the iteration if its diff touches a deny_paths glob.
5. Runs your scoring command (raw float, regex capture, or JSONPath) and compares against the best score seen so far.
6. Better? Commits onto autorize/<name> and advances the tracking branch. Worse / no-op / denied / invalid? Discards the worktree.
7. Appends an IterationRecord to iterations.jsonl and rewrites state.json atomically so you can Ctrl-C (or crash) at any point and autorize resume picks up cleanly.
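
The keep-or-discard decision in the steps above can be sketched in a few lines of shell. This is an illustration only, not autorize's actual code: `is_improvement` is a hypothetical helper, and treating an equal score as "not better" is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of autorize).
# Usage: is_improvement <candidate> <best> <min|max>
# Exit status 0 means "keep"; 1 means "discard"; 2 means bad direction.
is_improvement() {
  local candidate="$1" best="$2" direction="$3"
  # awk handles the floating-point comparison; "+ 0" forces numeric context.
  awk -v c="$candidate" -v b="$best" -v d="$direction" 'BEGIN {
    if (d == "min") exit !(c + 0 < b + 0)
    if (d == "max") exit !(c + 0 > b + 0)
    exit 2
  }'
}

# A lower score wins when direction is "min".
if is_improvement 0.42 0.50 min; then echo keep; else echo discard; fi
# An equal score is assumed not to count as an improvement.
if is_improvement 0.50 0.50 min; then echo keep; else echo discard; fi
```

The first call prints keep and the second prints discard.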

The loop exits when the total deadline fires, max_iterations is hit, or max_consecutive_noops is reached.
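
As a rough sketch of those three exit conditions, assuming times are plain epoch seconds and that the checks run in the order listed (the real order is not documented here). `should_stop` is a hypothetical name; the only behavior taken from this README is that max_iterations = 0 means unbounded.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of autorize).
# Usage: should_stop <now> <deadline> <iters_done> <max_iterations> \
#                    <noop_streak> <max_consecutive_noops>
# Prints the stop reason and returns 0, or returns 1 to keep looping.
should_stop() {
  local now="$1" deadline="$2" iters="$3" max_iters="$4" noops="$5" max_noops="$6"
  if [ "$now" -ge "$deadline" ]; then
    echo deadline; return 0
  fi
  # max_iterations = 0 means unbounded, so only a positive cap can fire.
  if [ "$max_iters" -gt 0 ] && [ "$iters" -ge "$max_iters" ]; then
    echo max_iterations; return 0
  fi
  if [ "$noops" -ge "$max_noops" ]; then
    echo max_consecutive_noops; return 0
  fi
  return 1
}

now=$(date +%s)
if reason=$(should_stop "$now" "$((now + 3600))" 3 0 1 5); then
  echo "stop: $reason"
else
  echo "continue"
fi
```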

Install

Supported platforms: Linux (x86_64-unknown-linux-gnu) and macOS (aarch64-apple-darwin, Apple Silicon).

From crates.io:

cargo install autorize

Prebuilt binary (from the latest GitHub Release):

# Pick your target:
TARGET=x86_64-unknown-linux-gnu       # or: aarch64-apple-darwin

# Resolve the latest tag, then download + extract:
TAG=$(curl -fsSL -o /dev/null -w '%{url_effective}' \
  https://github.com/wbbradley/autorize/releases/latest | sed 's#.*/tag/##')
curl -fsSL "https://github.com/wbbradley/autorize/releases/download/${TAG}/autorize-${TAG}-${TARGET}.tar.gz" \
  | tar -xz
./autorize --version

Or browse https://github.com/wbbradley/autorize/releases/latest and grab the archive for your target by hand.

From source:

cargo install --path .

Quickstart

# 1. Scaffold an experiment under .autorize/<name>/
autorize init myexp

# 2. Edit .autorize/myexp/config.toml and .autorize/myexp/program.md
#    - point `objective.command` at your scoring script
#    - point `agent.command` at your agent CLI
#    - set a deadline (`total_budget = "4h"` or `deadline = "..."`)

# 3. Commit your repo (autorize refuses dirty trees by default), then run:
autorize run myexp

# 4. Check progress from another shell:
autorize status myexp

# 5. If the loop dies, restart it:
autorize resume myexp

Use with Claude Code

This repo ships a Claude Code skill at skills/autorize/ that walks you through scaffolding an experiment — it asks about your objective, scoring command, agent CLI, and schedule, then drafts .autorize/<name>/config.toml, program.md, and any helper scoring script for your review before writing.

Install once (user-global, applies to every repo you open):

mkdir -p ~/.claude/skills
cp -r skills/autorize ~/.claude/skills/

Or per-project (only this repo):

mkdir -p .claude/skills
cp -r skills/autorize .claude/skills/

Then, from a Claude Code session in any repo with autorize on PATH, invoke /autorize. The skill prints autorize llms for context, interviews you, and stops at "ready to autorize run <name>" — it never starts the loop.

Subcommands

| Command | What it does |
| --- | --- |
| `autorize init <name>` | Scaffold `.autorize/<name>/{config.toml,program.md}`. |
| `autorize run <name>` | Run the loop until deadline / cap / noop streak. `--fresh` starts another run building on the prior best. |
| `autorize status <name>` | One-shot summary from `state.json` + `iterations.jsonl`. |
| `autorize list <name>` | Dump every iteration as markdown (oldest-first, one section per iteration with its summary); colorized on a TTY, plain markdown when piped. `--color <auto\|always\|never>` overrides detection. |
| `autorize tell <name> <message>` | Append operator guidance; the running loop injects it into the next iteration's prompt (see below). |
| `autorize resume <name>` | Recover after a crash; any in-progress iter is recorded as `killed` and the loop continues. |
| `autorize clean <name>` | Tidy a finished/abandoned experiment: detach any worktree still holding the tracking branch checked out (the branch ref is preserved), drop stale staged indexes, prune dead worktree registrations (`--remove-worktrees` also deletes kept `wt/` checkouts). Leaves the log and records intact. |
| `autorize llms` | Print an exhaustive agent-targeted markdown reference (config schema, on-disk layout, `IterationRecord`, state machine). |

autorize run accepts --allow-dirty if you need to start with uncommitted changes outside .autorize/, and --fresh to start another run on a finished experiment (see below).

Starting another run

When a run finishes (deadline, max_iterations, or the consecutive-noop streak), re-running autorize run <name> is a no-op — it reloads the saved state and re-hits the same stop condition. To do another batch of work that builds on what you already have, pass --fresh:

autorize run myexp --fresh

--fresh recomputes the deadline from schedule, resets the per-run max_iterations budget and the consecutive-noop streak, and refreshes started_at, while preserving the prior best_score/best_iter, the autorize/<name> branch and its tip, and the full iterations.jsonl history. New iterations keep comparing against the prior best and keep numbering upward. On a never-run experiment the flag has no effect, and it is refused (use autorize resume) if an iteration is mid-flight. If schedule.deadline is an absolute time that has already passed, the run errors instead of looping; switch to total_budget or edit the deadline first.

Steering a run

To redirect a run in flight — without stopping it — append operator guidance:

autorize tell myexp "stop tuning the series — try a spigot algorithm instead"

tell appends a structured entry to .autorize/myexp/guidance.jsonl. The running loop re-reads that file at the top of every iteration and injects all entries into a prominent ## Operator guidance section of the agent's prompt, framed as authoritative direction. A new message therefore shows up in the next iteration, and in v1 all guidance persists and is shown in every iteration after that. The file is also safe to hand-edit; a missing or empty file simply renders no section.

Config (`.autorize/<name>/config.toml`)

[experiment]
name = "myexp"
description = "..."

[objective]
command   = "bash score.sh"        # prints the score to stdout
direction = "min"                  # "min" | "max"
parse     = { kind = "float" }     # or { kind = "regex", pattern = "score=([0-9.]+)" }
                                   # or { kind = "jq",    path = ".metrics.loss" }
timeout   = "60s"
fail_mode = "invalid"              # "invalid" | "worst" | "abort"

[boundaries]
allow_paths = ["src/**/*.py"]      # prompt-only in v1
deny_paths  = [".autorize/**"]     # ENFORCED via diff

[setup]
command = ""
timeout = "5m"

[teardown]
command = ""
timeout = "1m"

[iteration]
budget                = "5m"
max_iterations        = 0          # 0 = unbounded
keep_worktrees        = false
max_consecutive_noops = 5

[schedule]
total_budget = "4h"                # OR (exactly one):
# deadline   = "2026-05-21T09:00:00-07:00"

[agent]
command     = "claude --print {prompt_file}"   # {prompt_file}, {workdir}, {iter}
workdir_var = "AUTORIZE_WORKDIR"
stdin       = "none"                            # "none" | "prompt"

[agent.env]
ANTHROPIC_API_KEY = "$ANTHROPIC_API_KEY"

[summarize]                                     # on by default; recap each
enabled = true                                  #   iteration with a cheap model
command = 'claude --model haiku --print --tools "" --system-prompt "You are a terse summarizer. Output exactly 1-2 sentences naming the change and why it moved the score. No preamble, no markdown, no questions, no offer of further help."'
timeout = "60s"
stdin   = "prompt"                              # prompt piped on stdin (default)

When [summarize] is enabled, each iteration's recap is surfaced to the agent in later prompts under ## Recent attempt summaries (so it can learn from discarded attempts). At the top of every autorize run / resume, autorize also backfills summaries for any records still missing one (those written before you enabled [summarize], or whose summarize step failed) by replaying the persisted iter-NNNN/ artifacts (changes.diff, agent.stdout, agent.stderr). The backfill is best-effort and skips noops and records whose artifacts are gone; the first run after enabling summaries may therefore fire several one-time model calls.

program.md lives next to config.toml and is freeform instructions for the agent — included verbatim at the top of every prompt.
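
To see what the regex form of objective.parse is getting at, here is a small shell approximation of pulling capture group 1 out of scorer output. It is a sketch under assumptions: `parse_score` is a hypothetical helper, it uses sed's extended regex dialect (which may differ from the one autorize uses), it takes the first matching line, and the pattern must not contain a `/`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of autorize).
# Usage: <scorer output on stdin> | parse_score '<pattern with one capture group>'
# Prints capture group 1 from the first matching line; returns 1 if nothing matches.
parse_score() {
  local pattern="$1" match
  # -n plus the trailing "p" prints only lines where the substitution happened.
  match=$(sed -n -E "s/.*${pattern}.*/\\1/p" | head -n 1)
  [ -n "$match" ] || return 1
  printf '%s\n' "$match"
}

# Simulated scorer output with noise around the score line.
printf 'epoch 3 done\nscore=0.8125 elapsed=12s\n' | parse_score 'score=([0-9.]+)'
```

This prints 0.8125. If your scorer can print exactly one number, `{ kind = "float" }` avoids the pattern entirely.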

On-disk layout

<repo>/
  logs/autorize.log        # central append-only run log (narrative + teed child stdio)
  .autorize/<name>/
    config.toml
    program.md
    state.json             # atomic checkpoint of loop state
    iterations.jsonl       # durable append-only log
    guidance.jsonl         # operator guidance from `autorize tell` (hand-editable)
    iter-0001/
      prompt.md            # what the agent saw
      changes.diff         # captured diff
      agent.stdout
      agent.stderr
    iter-0002/
    ...
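
state.json is described above as an atomic checkpoint. A common way to get that property is to write a temp file in the same directory and rename it over the target; whether autorize does exactly this is an assumption. In the sketch below, `write_atomic` and the JSON content are hypothetical, and a fully durable version would also fsync before the rename.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of autorize).
# Usage: <new content on stdin> | write_atomic <path>
write_atomic() {
  local target="$1" tmp
  # Temp file lives next to the target so the rename stays on one filesystem.
  tmp=$(mktemp "$(dirname "$target")/.state.XXXXXX") || return 1
  if cat > "$tmp" && mv -f "$tmp" "$target"; then
    return 0
  fi
  rm -f "$tmp"
  return 1
}

dir=$(mktemp -d)
# Placeholder content; the real state.json schema is not shown in this README.
printf '{"iter": 1}\n' | write_atomic "$dir/state.json"
printf '{"iter": 2}\n' | write_atomic "$dir/state.json"
cat "$dir/state.json"
```

A reader that opens state.json at any moment sees either the old file or the new one, never a half-written mix.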

logs/ is created on startup (gitignore it). RUST_LOG tunes verbosity (default info). At info the log is a forensic audit trail — every git call, subprocess spawn, and filesystem mutation is recorded (dozens of lines per iteration; agent.env secrets are never logged). Use RUST_LOG=warn to quiet it (also hides the run narrative).

The tracking branch autorize/<name> records every merged iteration as a single commit, so git log autorize/<name> is your improvement history and git diff main..autorize/<name> is the cumulative change.

Example

See examples/pi-digits/ for an end-to-end demo where a mock agent nudges a number in value.txt toward π:

cp -r examples/pi-digits/. /tmp/pi-demo
cd /tmp/pi-demo
git init -b main
git -c user.email=a@b -c user.name=a add .
git -c user.email=a@b -c user.name=a commit -m init
autorize run pi
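
For intuition about what such a demo involves, here is a standalone sketch of a "distance from π" scorer and a mock agent step. It is not the code shipped in examples/pi-digits/: `score` and `nudge` are hypothetical helpers, and only the value.txt file name comes from the description above.

```shell
#!/usr/bin/env bash
PI=3.141592653589793

# Hypothetical scorer: absolute distance from pi, lower is better (direction = "min").
score() {
  awk -v pi="$PI" '{ d = $1 - pi; if (d < 0) d = -d; printf "%.6f\n", d; exit }' "$1"
}

# Hypothetical mock agent: move the stored value halfway toward pi.
nudge() {
  local tmp
  tmp=$(mktemp) || return 1
  awk -v pi="$PI" '{ printf "%.6f\n", $1 + (pi - $1) / 2; exit }' "$1" > "$tmp" \
    && mv "$tmp" "$1"
}

work=$(mktemp -d)
echo 3.0 > "$work/value.txt"
score "$work/value.txt"    # score before the mock agent runs
nudge "$work/value.txt"
score "$work/value.txt"    # score after one nudge
```

The second score printed is smaller than the first, so with direction = "min" that iteration would count as an improvement.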

Status

v1 is feature-complete on Linux and macOS (Apple Silicon). Out of scope for v1: parallel iterations, Pareto scoring, web/TUI, token accounting, retry/backoff, remote storage, allow-path enforcement (allow_paths is prompt-only).

License

AGPL-3.0-or-later.
