
Stage 0 discovery picks up nested test-fixture packages (e.g. e2e-cli/fixtures/code) #55

@RichardHightower


Symptom

When designdoc 1.2.1-pre ran against the agent-brain monorepo, the generated SYSTEM_DESIGN.md described a code package as:

code — Lightweight arithmetic utility library providing four free functions (add, subtract, multiply, divide) and a Calculator class that delegates to them while maintaining an audit-trail history.

There is no "arithmetic utility library" in agent-brain. The source is a test-fixture file at:

/Users/richardhightower/clients/spillwave/src/agent-brain/e2e-cli/fixtures/code/calculator.py

…which is a synthetic Python file used by agent-brain's end-to-end CLI tests to validate the agent-brain indexer's behavior on Python code. Stage 0 (src/designdoc/stages/s0_discover.py) walked into the fixtures dir, Stage 1 extracted signatures, and Stage 3 generated class docs for the fixture as if it were production code.

Why DEFAULT_EXCLUDES doesn't catch this

The current DEFAULT_EXCLUDES (after #42 / PR #43) covers .claude/, .opencode/, .devcontainer/, .idea/, .vscode/. It does not cover fixture / test-data conventions, which vary too widely across projects to enumerate.

Proposed fix (one of)

A. Manifest-rooted discovery. Restrict discovery to paths under a directory that has a pyproject.toml / package.json / Cargo.toml / etc. Test fixtures typically live outside any manifest scope.
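A minimal sketch of what option A could look like, assuming discovery works on filesystem paths. The names `MANIFEST_NAMES`, `find_manifest_roots`, and `is_in_manifest_scope` are hypothetical and not part of designdoc today; the manifest list covers only the three files named above.

```python
from pathlib import Path

# Hypothetical list: only the manifests named in this issue ("etc." is left open).
MANIFEST_NAMES = ("pyproject.toml", "package.json", "Cargo.toml")


def find_manifest_roots(repo_root: Path) -> list[Path]:
    """Return every directory under repo_root that directly contains a manifest."""
    roots = {
        manifest.parent
        for name in MANIFEST_NAMES
        for manifest in repo_root.rglob(name)
    }
    return sorted(roots)


def is_in_manifest_scope(path: Path, roots: list[Path]) -> bool:
    """True if path is a manifest root or sits somewhere below one."""
    return any(root == path or root in path.parents for root in roots)
```

One limitation of this sketch: if the repository root itself holds a manifest, every path is in scope, so fixtures would still be picked up. It only helps when manifests sit in subpackages.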

B. User-configurable excludes via .designdoc.toml. Add an excludes = ["e2e-cli/fixtures/**", ...] field to the project config. Users opt in per repo.

  • Pros: simple, explicit.
  • Cons: every user has to know to do this; first run produces noisy docs.

C. Heuristic excludes. Add patterns like */fixtures/*, */test_data/*, */__fixtures__/*, */golden/* to DEFAULT_EXCLUDES.

  • Pros: catches the common cases without user effort.
  • Cons: false positives if someone has a real package named "fixtures".
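To make the trade-off in option C concrete, here is a small sketch that applies the four proposed patterns with `fnmatch`. `HEURISTIC_EXCLUDES` and `matches_heuristic` are hypothetical names, and the actual matching semantics of DEFAULT_EXCLUDES are an assumption here.

```python
from fnmatch import fnmatchcase

# The four patterns proposed in option C, matched against repo-relative paths.
HEURISTIC_EXCLUDES = (
    "*/fixtures/*",
    "*/test_data/*",
    "*/__fixtures__/*",
    "*/golden/*",
)


def matches_heuristic(rel_path: str) -> bool:
    """True if the path falls under any of the heuristic fixture patterns."""
    return any(fnmatchcase(rel_path, pattern) for pattern in HEURISTIC_EXCLUDES)
```

Under these assumptions, `e2e-cli/fixtures/code/calculator.py` matches `*/fixtures/*` and would be skipped. A production module in a package that happens to be called `fixtures` (for example a path like `src/fixtures/__init__.py`) matches the same pattern, which is the false positive listed under Cons.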

Recommend B as the immediate fix (no false positives, opt-in) and A as the longer-term solution (related to #51 monorepo manifest detection).

Evidence

  • Generated docs/gen/agent-brain/SYSTEM_DESIGN.md lists code between real packages (commands, config, contract).
  • The fixture: agent-brain/e2e-cli/fixtures/code/calculator.py has class Calculator + add / subtract / multiply / divide free functions — exactly what the docs describe.

Acceptance criteria

  • A run against agent-brain (or any repo with a similar fixture layout) does not describe e2e-cli/fixtures/** as a production package.
  • For option B: the excludes setting in .designdoc.toml (the excludes field proposed above, or an [exclude] table if that shape is chosen) is respected by Stage 0 discovery, with a regression test.
  • For option C: the new default-exclude patterns are documented and have a regression test.
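A regression test for the first criterion could look roughly like the sketch below. `discover` is a hypothetical stand-in for Stage 0 and not the real entry point in s0_discover.py; the file `commands/__init__.py` is also invented for the test layout.

```python
import tempfile
from fnmatch import fnmatchcase
from pathlib import Path


def discover(repo_root: Path, excludes: list[str]) -> list[str]:
    """Stand-in for Stage 0: list .py files, skipping excluded relative paths."""
    found = []
    for path in sorted(repo_root.rglob("*.py")):
        rel = path.relative_to(repo_root).as_posix()
        if not any(fnmatchcase(rel, pattern) for pattern in excludes):
            found.append(rel)
    return found


def test_fixture_dir_is_not_discovered() -> None:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Recreate the layout from this issue next to one real package.
        fixture = root / "e2e-cli" / "fixtures" / "code" / "calculator.py"
        real = root / "commands" / "__init__.py"
        for file in (fixture, real):
            file.parent.mkdir(parents=True)
            file.write_text("")

        result = discover(root, ["e2e-cli/fixtures/**"])

        assert result == ["commands/__init__.py"]
```

The same layout can be reused for option C by swapping the explicit exclude for the default patterns.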

Notes

Discovered during the May 2026 trust eval. Low priority compared to #54 (transport retry) — this only affects output quality, not pipeline reliability.


Labels: bug
