Preventive spend-governance layer + PACER-native MCP server #6

Open: johnzfitch wants to merge 6 commits into master from claude/pacer-spend-governance-50JNs

Conversation

@johnzfitch
Owner

What & why

pacer-cli could bill PACER and prompt before charges, but it had no preventive, cumulative spend cap, so there was nothing safe to put in front of an autonomous agent. PACER itself only bills retrospectively (quarterly PAA) and offers a free-text client code; it never stops spend in real time. This PR adds a real-time, fail-closed cap that an agent can't lift, logs every billable call, and makes all of it reachable through the first PACER-native MCP server.

Built to the agreed lean spec: no new runtime store — caps are fields on PacerConfig, cumulative spend is summed from the audit log, and one gate lives in cli.py. The only net-new runtime module is the (optional) MCP server.

Two corrections found during implementation

  1. The AuditLogger was dead code: log_request/log_download/get_audit_logger had zero call sites, so "reuse the audit log" required actually wiring the logger into the billable sites (still no separate ledger.py).
  2. The audit timestamp was local time despite a Z suffix (no time.gmtime converter) — fixed to true UTC so spend_today() buckets correctly across midnight.
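The UTC correction can be shown with the standard library alone. This is a minimal sketch, assuming the audit logger formats its lines through a `logging.Formatter`; the format string, the sample message, and the helper name `make_utc_formatter` are illustrative and not taken from security.py.

```python
import logging
import time


def make_utc_formatter() -> logging.Formatter:
    """Build a formatter whose timestamps are real UTC, so the Z suffix is truthful."""
    fmt = logging.Formatter(
        fmt="%(asctime)s | %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%SZ",
    )
    # Formatter.converter defaults to time.localtime; without this override
    # the Z suffix would be stamped onto local time.
    fmt.converter = time.gmtime
    return fmt


# Hypothetical audit record, pinned to the Unix epoch so the result is deterministic.
record = logging.LogRecord(
    name="audit", level=logging.INFO, pathname="example.py", lineno=0,
    msg="REQUEST docket | cost=$0.10", args=None, exc_info=None,
)
record.created = 0.0
line = make_utc_formatter().format(record)
print(line)
```

With the converter set, a line written at 23:30 local time in a UTC-5 zone is stamped with the next UTC day, which is the bucket a UTC-based `spend_today()` would look in.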

Changes

Governance

  • config.py: per_op_cap_usd / daily_cap_usd / require_client_code on PacerConfig, overlaid from a lawyer-editable ~/.pacer/config/policy.csv (apply_policy_csv). Fail-closed: missing→conservative defaults, blank cell→safe default (never "unlimited"), unparseable→refuse billable ops and name the row (read-only still runs).
  • security.py: wire the audit logger in + UTC fix; spend_today(); a ctx-free check_spend() (raising BudgetError/MatterRequired) shared by the CLI and MCP server.
  • cli.py: enforce_spend()/_deny() gate on the 5 billable sites (docket, document, batch, pcl cases, pcl parties); records actual spend after each call; --agent (JSON errors, exit 3), --matter/--client-code override → X-CLIENT-CODE. Login stays human-in-the-loop (agent mode never prompts for / auto-unlocks the vault).
  • errors.py: budget_exceeded / matter_required / policy_invalid panels.
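The fail-closed rules for policy.csv (missing, blank, unparseable) can be sketched as follows. This is an illustration under assumptions, not the PR's `apply_policy_csv`: the two-column `key,value` layout, the default amounts, and the names `Caps` and `apply_policy` are hypothetical.

```python
import csv
import io
import math
from dataclasses import dataclass
from typing import Optional


class PolicyError(ValueError):
    """policy.csv could not be parsed; callers should refuse billable ops."""


@dataclass
class Caps:
    # Hypothetical conservative defaults, chosen only for this example.
    per_op_cap_usd: float = 3.00
    daily_cap_usd: float = 10.00


def apply_policy(text: Optional[str]) -> Caps:
    """Overlay `key,value` rows onto the defaults, failing closed."""
    caps = Caps()
    if text is None:  # missing file: keep the conservative defaults
        return caps
    for row_no, row in enumerate(csv.reader(io.StringIO(text)), start=1):
        if not row or not row[0].strip():
            continue  # skip empty lines
        key = row[0].strip()
        raw = row[1].strip() if len(row) > 1 else ""
        if key not in ("per_op_cap_usd", "daily_cap_usd"):
            raise PolicyError(f"row {row_no}: unknown key {key!r}")
        if raw == "":
            continue  # blank cell: the safe default stays, never "unlimited"
        try:
            value = float(raw)
        except ValueError:
            raise PolicyError(f"row {row_no}: cannot parse {raw!r}") from None
        if not math.isfinite(value) or value < 0:
            raise PolicyError(f"row {row_no}: cap must be finite and >= 0")
        setattr(caps, key, value)
    return caps
```

Raising a dedicated exception that names the row lets the caller refuse only billable operations while read-only commands continue, which is the split the description calls for.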

Court scoping

  • Slimmed bundled court-lookup.json to the 4 fields the code reads (1.67 MB → 47 KB).
  • pacer courts enable/disable/invert/... writes courts.csv (human-edited, agent-read-only), wired into PCL searches (explicit --court still wins; all-enabled/no-file = nationwide).

MCP server (optional [mcp] extra, pacer-mcp stdio entry point)

  • mcp_server.py: search_cases, search_parties, get_docket, get_document, plus read-only spend_status — every billable tool routes through the same check_spend and writes the same audit line; credentials human-provisioned only.

Cleanup + creep-guard

  • Removed dead imports/locals/F541 f-strings and the unused DocumentSorter stub class; commented the benign swallowed excepts. Added tools/audit.py (flags stubs + silent except: pass) and run it in CI.
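For readers unfamiliar with this kind of creep-guard, here is a rough sketch of an AST pass that flags stubbed functions and silent `except: pass` handlers. It assumes nothing about the real tools/audit.py beyond what is described above; `find_dangling`, `_is_placeholder`, and the finding strings are hypothetical.

```python
import ast


def _is_placeholder(stmt: ast.stmt) -> bool:
    """True for statements that do no work: pass, ..., a docstring, raise NotImplementedError."""
    if isinstance(stmt, ast.Pass):
        return True
    if isinstance(stmt, ast.Expr) and isinstance(stmt.value, ast.Constant):
        return stmt.value.value is Ellipsis or isinstance(stmt.value.value, str)
    if isinstance(stmt, ast.Raise) and stmt.exc is not None:
        target = stmt.exc.func if isinstance(stmt.exc, ast.Call) else stmt.exc
        return isinstance(target, ast.Name) and target.id == "NotImplementedError"
    return False


def find_dangling(source: str) -> list:
    """Return human-readable findings for stubs and silent exception handlers."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            if all(_is_placeholder(s) for s in node.body):
                findings.append(f"stub: {node.name} (line {node.lineno})")
        elif isinstance(node, ast.ExceptHandler):
            if len(node.body) == 1 and isinstance(node.body[0], ast.Pass):
                findings.append(f"silent except (line {node.lineno})")
    return findings


SAMPLE = (
    "def todo():\n"
    "    ...\n"
    "\n"
    "def real():\n"
    "    try:\n"
    "        return 1\n"
    "    except Exception:\n"
    "        pass\n"
)
findings = find_dangling(SAMPLE)
print(findings)
```

A check this simple would also flag abstract methods and Protocol members, so a real tool likely needs an allow-list or a marker comment for intentional placeholders.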

Behavior at a glance

$ pacer --agent pcl cases -n 1:20-cv-1
{"error": "BUDGET_EXCEEDED", "operation": "Search cases", "estimated": 0.1, "spent_today": 9.95, "daily_cap": 10.0}   # exit 3

Garbage policy.csv → billable ops refuse (POLICY_INVALID, names the row); pacer courts and other read-only commands still run.
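The refusal shown above can be reproduced with a small gate. This is a sketch under assumptions: the signature of `check_spend`, the `fields` attribute on `BudgetError`, and the helper `agent_refusal` are hypothetical stand-ins for the PR's code; only the JSON shape and exit code 3 come from the description.

```python
import json


class BudgetError(Exception):
    """Raised when an operation would exceed a cap; carries the JSON fields."""

    def __init__(self, **fields):
        super().__init__("budget exceeded")
        self.fields = fields


def check_spend(operation, estimated, spent_today, per_op_cap, daily_cap):
    """Raise before any billable call that would break either cap."""
    if estimated > per_op_cap or spent_today + estimated > daily_cap:
        raise BudgetError(
            operation=operation, estimated=estimated,
            spent_today=spent_today, daily_cap=daily_cap,
        )


def agent_refusal(exc):
    """Return (JSON line, exit code) in the shape --agent mode prints."""
    return json.dumps({"error": "BUDGET_EXCEEDED", **exc.fields}), 3


try:
    check_spend("Search cases", 0.10, 9.95, per_op_cap=3.00, daily_cap=10.00)
except BudgetError as exc:
    payload, exit_code = agent_refusal(exc)
    print(payload, exit_code)
```

The sketch uses floats to mirror the JSON in the example; an implementation that sums many small charges might prefer integer cents or `Decimal` to avoid rounding drift near the cap.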

Out of scope (phase 2)

Signed chain-of-custody receipts; hosted MCP w/ OAuth; reconcile; sealed-document detection. The --matter → X-CLIENT-CODE → invoice attribution is marked TODO(unverified) and nothing depends on it.

Verification

  • 207 tests pass (160 existing + 47 new: policy, ledger/spend, gate, courts scope, MCP).
  • pyflakes src/pacer_cli clean; python tools/audit.py clean.
  • Manual: cap trips with JSON + exit 3; audit log gains UTC cost=$… client=… lines; loosening policy.csv requires a human edit (no agent write path).

https://claude.ai/code/session_01NNvhsYRVWhjfcdgaSmU5bt


Generated by Claude Code

Make pacer-cli safe to put in front of an autonomous agent: a real-time,
fail-closed spend cap an agent cannot lift, plus a PACER-native MCP server
that obeys the same cap.

Governance (no new runtime store — reuses the audit log):
- config.py: per_op_cap_usd / daily_cap_usd / require_client_code on
  PacerConfig, overlaid from a human-edited ~/.pacer/config/policy.csv
  (apply_policy_csv, fail-closed: blank=safe default, garbage=refuse+name row).
- security.py: wire the previously-dead AuditLogger in, fix its timestamp to
  true UTC (time.gmtime) so spend_today() buckets correctly; add spend_today()
  and a ctx-free check_spend() shared by the CLI and MCP server.
- cli.py: enforce_spend()/_deny() gate on the 5 billable sites (docket,
  document, batch, pcl cases, pcl parties); record actual spend after each
  call; --agent (JSON errors, exit 3), --matter/--client-code override.
- errors.py: budget_exceeded / matter_required / policy_invalid panels.

Court scoping:
- Slim bundled court-lookup.json to the 4 fields the code reads (1.67MB -> 47KB).
- courts.csv enable/disable/invert scope (pacer courts ...), wired into PCL
  searches; human-edited, agent-read-only.

MCP server (optional [mcp] extra):
- mcp_server.py: search_cases/search_parties/get_docket/get_document +
  read-only spend_status, all routed through check_spend with the same audit
  trail; credentials human-provisioned (never prompts/auto-unlocks the vault).

Cleanup + creep-guard:
- Remove dead imports/locals/f-strings and the unused DocumentSorter stub class;
  comment the benign swallowed excepts. Add tools/audit.py and run it in CI.

Tests: +47 (policy, ledger/spend, gate, courts scope, MCP). 207 passing;
pyflakes + audit clean.
Copilot AI review requested due to automatic review settings June 2, 2026 09:18

Copilot AI left a comment

Pull request overview

This PR adds a preventive spend-governance layer (per-op + daily caps enforced before billable PACER operations) and introduces an optional PACER-native MCP stdio server that routes all billable tools through the same governance gate and audit log.

Changes:

  • Add fail-closed spend governance: policy.csv overlay on PacerConfig, cumulative spend derived from audit logs, and a centralized gate used by both CLI and MCP.
  • Add court search scoping via courts.csv and wire it into PCL searches, plus new pacer courts ... subcommands to manage the scope.
  • Add optional MCP server (pacer-mcp) and new CI audit tool to catch stubs and silent except: pass.

Reviewed changes

Copilot reviewed 23 out of 24 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tools/audit.py | New AST-based audit tool to fail CI on stubbed functions and warn on silent exception swallowing. |
| .github/workflows/ci.yml | Run the dangling-code audit in CI. |
| pyproject.toml | Add optional mcp extra and pacer-mcp entry point; add pyflakes to dev deps. |
| src/pacer_cli/config.py | Add spend-cap fields to config and implement fail-closed policy.csv overlay. |
| src/pacer_cli/security.py | Wire audit logger (UTC timestamps), add spend_today() and check_spend() primitives. |
| src/pacer_cli/cli.py | Add spend gate (enforce_spend), audit spend recording, --agent, --matter, and wire governance into billable commands + courts subcommands. |
| src/pacer_cli/mcp_server.py | New optional MCP server module exposing governed PACER tools and spend status. |
| src/pacer_cli/courts.py | Add courts.csv scope read/write and enabled-court computation for PCL scoping. |
| src/pacer_cli/errors.py | Add new error catalog entries for budget/matter/policy failures. |
| src/pacer_cli/downloader.py | Minor cleanup and clarified swallowed-exception behavior. |
| src/pacer_cli/selection.py | Remove unused prompt import. |
| src/pacer_cli/parser.py | Remove unused imports. |
| src/pacer_cli/reader.py | Remove unused imports and delete unimplemented DocumentSorter stub. |
| src/pacer_cli/docket_types.py | Minor cleanup of f-string usage. |
| README.md | Document spend governance, agent mode, court scoping, and MCP server usage. |
| examples/README.md | Add example configuration documentation for policy.csv and courts.csv. |
| examples/policy.csv | Add example policy.csv. |
| tests/conftest.py | Redirect policy/audit log locations for test isolation. |
| tests/test_policy.py | New tests for policy.csv overlay and fail-closed parsing. |
| tests/test_ledger.py | New tests for audit-derived spend summing and cap checks. |
| tests/test_governance.py | New tests for CLI agent refusals (JSON + exit 3) and cumulative spend behavior. |
| tests/test_courts_scope.py | New tests for courts.csv scope behavior and courts subcommands. |
| tests/test_mcp_server.py | New tests for MCP tool governance + error payload shaping without network/SDK. |

Comment thread src/pacer_cli/cli.py
Comment on lines +242 to +246
if pages:
    logger.log_download(
        url, Path(operation), size_bytes=0, pages=pages,
        cost=cost, client_code=config.client_code,
    )
Comment thread src/pacer_cli/cli.py Outdated
Comment on lines 1968 to 1970
# Explicit --court wins; otherwise apply the courts.csv scope (None=nationwide).
scoped_courts = list(court) if court else enabled_court_ids()

Comment thread src/pacer_cli/cli.py Outdated
Comment on lines +2267 to +2269
# Explicit --court wins; otherwise apply the courts.csv scope (None=nationwide).
scoped_courts = list(court) if court else enabled_court_ids()

Comment on lines +105 to +107
cfg = _load_config(client_code)
scoped = court or enabled_court_ids()
criteria = CaseSearchCriteria(caseNumberFull=case_number, caseTitle=title, courtId=scoped)
Comment on lines +132 to +135
cfg = _load_config(client_code)
scoped = court or enabled_court_ids()
case_criteria = CaseSearchCriteria(courtId=scoped) if scoped else None
criteria = PartySearchCriteria(
Comment on lines +68 to +75
def error_payload(operation: str, exc: Exception) -> dict[str, Any]:
    """Map a refusal to the same JSON shape the CLI's ``_deny`` emits."""
    if isinstance(exc, GovernanceError):
        return {"error": exc.error_key.upper(), "operation": operation, **exc.fields}
    if isinstance(exc, ValueError):  # unparseable policy.csv -> fail-closed
        return {"error": "POLICY_INVALID", "operation": operation, "reason": str(exc)}
    return {"error": "INTERNAL", "operation": operation, "reason": str(exc)}

Comment thread src/pacer_cli/security.py Outdated
Comment on lines +341 to +345
for line in log_file.read_text(encoding="utf-8").splitlines():
    if not line.startswith(today) or "cost=$" not in line:
        continue
    if client_code is not None and f"client={client_code}" not in line:
        continue
claude added 3 commits June 2, 2026 09:30
…ng; add fail-closed invariant tests

Review fixes (Copilot on PR #6):
- Empty court scope no longer fails open. When courts.csv disables every court,
  enabled_court_ids() returns [] which to_api_dict() dropped, silently widening
  the search to nationwide. Now both CLI searches and both MCP searches refuse
  (ScopeError / scope_empty, exit 3) instead.
- record_spend no longer swaps url/filepath into log_download — the audit line
  now reads "DOWNLOAD <operation> -> <saved path>".
- error_payload distinguishes policy.csv parse failures (new PolicyError ->
  POLICY_INVALID) from bad tool arguments (ValueError -> INVALID_ARGUMENT)
  instead of mislabeling everything POLICY_INVALID.
- spend_today(client_code=...) matches the client= field exactly on the
  " | "-delimited log, so M-1 no longer matches client=M-10.
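The exact-match fix for the client field can be sketched like this. It is an assumption-laden illustration: the log layout (an ISO date prefix, then " | "-separated `key=value` fields such as `cost=$0.10` and `client=M-1`) is inferred from the snippets in this PR, and this `spend_today` takes its lines and date as parameters purely for the example.

```python
from decimal import Decimal
from typing import Iterable, Optional


def spend_today(lines: Iterable[str], today: str,
                client_code: Optional[str] = None) -> Decimal:
    """Sum cost= fields for lines dated `today`, optionally for one client."""
    total = Decimal("0")
    for line in lines:
        if not line.startswith(today):
            continue
        # Split into fields first, so client=M-1 can never match client=M-10.
        fields = dict(
            part.split("=", 1)
            for part in line.strip().split(" | ")
            if "=" in part
        )
        cost = fields.get("cost", "")
        if not cost.startswith("$"):
            continue
        if client_code is not None and fields.get("client") != client_code:
            continue
        total += Decimal(cost[1:])
    return total


# Hypothetical audit lines for illustration.
LOG = [
    "2026-06-02T09:31:00Z | DOWNLOAD docket | cost=$0.10 | client=M-1",
    "2026-06-02T09:32:00Z | DOWNLOAD docket | cost=$0.20 | client=M-10",
    "2026-06-01T23:59:00Z | DOWNLOAD docket | cost=$3.00 | client=M-1",
]
print(spend_today(LOG, "2026-06-02", "M-1"))
```

Comparing the parsed field for equality is what removes the prefix collision; the earlier substring test `f"client={client_code}" in line` would have counted the M-10 line toward M-1.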

Testing: the empty-scope bug passed every single-purpose test because none
asserted the underlying safety property. Add tests/test_failclosed_invariant.py
— a "network tripwire" that makes constructing any PACER client/downloader an
error, then drives every billable surface (3 CLI commands x 4 refusals, 4 MCP
tools x 4 refusals) and asserts the network is never reached, plus positive
controls proving the path isn't vacuously blocked. Verified it goes red when the
guard is removed. 236 passed, 3 skipped; pyflakes + audit clean.
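The tripwire idea can be illustrated in a few lines. This sketch is simplified relative to the description above: it injects the client factory as a parameter rather than patching the real constructors, and every name in it (`PacerClient`, `get_docket`, `tripwire`, `DOCKET_ESTIMATE`) is hypothetical.

```python
class NetworkReached(AssertionError):
    """Raised by the tripwire if code tries to build a PACER client."""


class PacerClient:
    """Stand-in for a real network client."""

    def fetch(self, case: str) -> str:
        return f"docket for {case}"


def tripwire(*args, **kwargs):
    raise NetworkReached("a billable path tried to construct a client")


DOCKET_ESTIMATE = 3.00  # illustrative cost estimate in USD


def get_docket(case, *, spent_today, daily_cap, client_factory=PacerClient):
    """Gate first, network second: a refusal must never build a client."""
    if spent_today + DOCKET_ESTIMATE > daily_cap:
        return {"error": "BUDGET_EXCEEDED", "operation": "docket"}
    return {"docket": client_factory().fetch(case)}


# Refusal path: the tripwire is armed but must not fire.
refused = get_docket("1:20-cv-1", spent_today=9.00, daily_cap=10.00,
                     client_factory=tripwire)

# Positive control: under the cap, the same call must hit the tripwire,
# proving the test is not vacuously passing.
try:
    get_docket("1:20-cv-1", spent_today=0.00, daily_cap=10.00,
               client_factory=tripwire)
    control_fired = False
except NetworkReached:
    control_fired = True

print(refused, control_fired)
```

The positive control matters as much as the refusal check: without it, a tripwire that is never on the call path would make every refusal test pass for the wrong reason.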

Per review: instead of repeating the empty-scope guard at four call sites,
close it at the source. Add courts.resolve_court_scope() as the one open/off
switch — returns None (nationwide), a non-empty list (scoped), or raises
ScopeError when courts.csv disables every court. CLI searches route it through
a single resolve_scope() adapter to _deny; MCP searches let it propagate to
error_payload. The four inline `if scoped == []` guards are gone; the rule
lives in one place. The fail-closed invariant suite confirms every billable
surface still refuses an empty scope.
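The three outcomes of the scope switch can be sketched as a single function. The real `courts.resolve_court_scope()` signature is not shown in this PR text, so the two parameters here (`explicit` for `--court` values, `enabled` for what courts.csv yields) are assumptions made for illustration.

```python
from typing import List, Optional, Sequence


class ScopeError(Exception):
    """courts.csv disables every court; searches must refuse."""


def resolve_court_scope(explicit: Optional[Sequence[str]],
                        enabled: Optional[Sequence[str]]) -> Optional[List[str]]:
    """Return None (nationwide), a non-empty list (scoped), or raise ScopeError.

    explicit: courts passed on the command line, if any.
    enabled:  None when there is no courts.csv or all courts are enabled,
              otherwise the list of enabled court ids (possibly empty).
    """
    if explicit:
        return list(explicit)  # an explicit --court still wins
    if enabled is None:
        return None  # no file / all enabled: nationwide
    if not enabled:
        # An empty list must not be passed on, because a dropped empty
        # filter would silently widen the search to nationwide.
        raise ScopeError("courts.csv disables every court")
    return list(enabled)


print(resolve_court_scope(None, ["cand", "nysd"]))
```

Distinguishing `None` from `[]` is the point of the fix: both are falsy, so a check like `if scoped:` treats them the same and fails open.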

Also: ci.yml now triggers on [master, main] — it previously watched only `main`
while the repo's default branch is `master`, so CI never ran.

236 passed, 3 skipped; pyflakes + audit clean.

The ruff lint had never run (workflow triggered on main while the repo is on
master), so enabling CI surfaced ~356 pre-existing, repo-wide style findings
(dominated by Optional -> X|None). Keep ruff advisory (continue-on-error) so it
doesn't gate unrelated work; the dangling-code audit and tests stay blocking. A
dedicated ruff cleanup is tracked separately.

Presentation fixes, no behavior change:
- court-lookup.json was pretty-printed, which showed as +1485 lines even though
  the file shrank 1.67MB -> 36KB. Re-minify to a single line (matching the
  original layout) so the diff honestly reflects a shrink (+1/-1), not a build.
- Replace em-dash/en-dash/ellipsis I introduced in comments/docstrings/strings
  with ASCII, so the touched files stay ASCII-clean. The intentional alias-table
  arrows in cli.py help are left as-is. 236 passed, pyflakes + audit clean.
@johnzfitch
Owner Author

@copilot search for optimizations and long chains


Copilot AI commented Jun 2, 2026

@copilot search for optimizations and long chains

Implemented a focused optimization pass in fbe8090: replaced the long any([...]) chain allocation in pcl_parties with a tuple-based check, streamlined spend_today() cost parsing to avoid chained splits, and added a regression test ensuring client-code prefix collisions (e.g., M-1 vs M-10) are not miscounted. Verified with targeted tests plus full suite (237 passed, 3 skipped).


4 participants