Skip to content

Improve AI-agent friendliness of the codebase: tracking issue #4832

Description

@envestcc

Background

AI coding agents (Claude Code, Copilot, Cursor, etc.) are increasingly being used to develop and maintain this codebase. Unlike humans, they have two structural constraints:

  1. Bounded context — they cannot hold the whole repo in mind at once.
  2. Feedback-loop dependent — their output quality is capped by the quality of the signals they receive (tests, lints, types).

A short audit of the current state shows that the engineering foundations (CI, linting, tests, commit hygiene) are solid, but the "agent-facing" context layer (machine-readable conventions, invariants, decision rationale) is largely absent. This issue tracks proposed improvements to close that gap, so AI agents can contribute reliably without amplifying bugs or drifting from project conventions.

This is a tracking issue. Each checkbox below can be picked up as its own PR.

Audit summary

Dimension Current state Gap
Verification golangci-lint (23 linters), 365 test files, full PR CI, gosec, codecov No fuzz tests, no coverage threshold, pre-commit only scans secrets, no flaky-test detection
Context README, PR/issue templates No `CLAUDE.md` / `AGENTS.md`, no `docs/adr/`, no package-level godoc on core packages
Structure 191 packages, ~142K LOC, conventional commits 499 `interface{}` occurrences, 144 `panic()` in non-test code, 186 `init()` functions
Tooling Makefile, release-drafter, gosec workflow No `CONTRIBUTING.md`, no agent config (`.claude/`, `AGENTS.md`), no dependabot

P0 — Foundational (highest ROI)

  • Root `CLAUDE.md` / `AGENTS.md` — a concise constraints file at repo root. Should include: hard rules (e.g., hardfork-gated changes must update `genesis.yaml` tests; receipt-log field order is frozen post-hardfork; no `time.Now()` in consensus/state paths), frozen directories, encoding red lines, and a pointer to per-package guidance.

    • Acceptance: file exists, ≤ 200 lines, reviewed by a maintainer for accuracy.
  • Per-package `CLAUDE.md` for hot spots — one file per critical package documenting "why this is the way it is" and "what changes here ripple elsewhere". Minimum set:

    • `action/protocol/staking/`
    • `action/protocol/rewarding/`
    • `blockchain/`
    • `state/`
    • `consensus/`
    • Acceptance: each file ≤ 100 lines, focused on invariants and cross-cutting impact, not tutorial content.
  • Fuzz tests for encode/decode paths — native Go `func FuzzXxx` for:

    • `crypto/` (signature, hashing roundtrips)
    • `action/` (action serialization)
    • `state/` (state codec)
    • Acceptance: fuzz corpus seeded, CI runs in `-fuzztime=30s` mode on PR, longer in nightly.
  • `CONTRIBUTING.md` — dev setup, local test workflow, hardfork-aware change checklist, and an explicit "AI-agent workflow" section linking to `CLAUDE.md`.

    • Acceptance: a new contributor (human or agent) can go from clone → green local test in < 15 minutes following only this doc.

P1 — High-impact

  • Upgrade pre-commit hooks — extend `.pre-commit-config.yaml` to run `golangci-lint --fast`, `gofmt`, `go vet`, and a fast unit-test subset on changed packages. Currently only secrets/private keys are scanned.

  • Coverage threshold gate — configure codecov to fail PRs that drop coverage by more than 1% from the base. Start permissive, tighten over time.

  • `docs/adr/` directory — start with retroactive ADRs for major past decisions (e.g., staking v2/v3 migration, rewarding fund split, hardfork naming convention, MPT migration plan). Template included.

  • Package-level godoc on core interfaces — package comments and contract docstrings (preconditions, postconditions, concurrency, ordering guarantees) for `Protocol`, `StateManager`, `ActionHandler`, and similar central interfaces.

  • Repo-level `.claude/` config — add `settings.json` with sensible allowlists for read-only commands, plus a starter skill or two for repeated workflows (e.g., "add a new staking protocol method", "verify hardfork wiring").

  • Benchmark regression gate — extend the existing crypto benchmarks to cover hot paths in `state/` and `action/protocol/staking/`, with benchstat-based regression detection in CI.

P2 — Long-term improvements

  • Reduce `init()` count — 186 init functions is high. Migrate global-state setup to explicit constructors / DI where practical. Track count over time.

  • `interface{}` → generics — 499 occurrences. Type-strong APIs are guardrails for agents. Migrate package-by-package.

  • Domain glossary `docs/glossary.md` — define epoch, bucket, endorsement, probation, NFT bucket, staking v2 vs v3, etc. AI agents conflate these terms and introduce semantic bugs.

  • Dependabot / Renovate — automated dependency updates, especially for security patches.

Out of scope

  • Large-scale refactors that change behavior. All improvements above should be additive or risk-free.
  • Adopting a specific AI vendor's tooling exclusively — files like `CLAUDE.md` and `AGENTS.md` are conventions read by multiple agents; we should support both.

How to contribute

Each unchecked box is a candidate for its own PR. Comment on this issue to claim a task. Reference this issue in your PR (e.g., `Refs #`).

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions