feat(copilot): Enterprise test kit — doctor + live-proxy harness + outbound capture by chopratejas · Pull Request #1136 · chopratejas/headroom

chopratejas · 2026-06-18T17:27:14Z

Description

We cannot validate Headroom's GitHub Copilot support beyond gpt-4o without an entitled seat (Copilot Business requires 10+ employees). Investigation traced the recurring "only a subset of models" reports to an upstream GitHub entitlement/policy gate, not Headroom: the /models catalog and inference are both keyed to the token's entitlement, and premium models return 400 "The requested model is not supported" even when hitting api.githubcopilot.com directly with no Headroom in the path (verified locally). Headroom does not filter the catalog — it forwards the token and GitHub decides.

This PR adds a self-contained, secret-free test kit so a community member with a Copilot Business/Enterprise seat can run two commands and tell us, unambiguously, whether premium models flow through Headroom — and if not, whether it is a Headroom bug or their org policy.

Related: #488, #635, #644, #972, #1039 — this is a diagnostic kit to confirm the root cause, not a fix for them.

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Performance improvement
Code refactoring (no functional changes)

Changes Made

tools/copilot-test/copilot_doctor.py: read-only diagnostic covering where the credential lives (keychain/file/env, cross-platform), the resolved API host, the exact outbound request Headroom forwards, the /models catalog, inference on /chat/completions + /responses (the wire-API split from #644, "[BUG] headroom wrap copilot --subscription cannot reach gpt-5.x, forces /chat/completions and blocks --wire-api override", and #647, "fix(copilot): use responses API for subscription reasoning models"), and a pass-through contract check. Output is secret-free: tokens are reduced to their non-secret type prefix + length only.
tools/copilot-test/enterprise_proxy_test.py — starts a real Headroom proxy at a given host and runs premium models through it; prints a PASS/FAIL matrix. Host-agnostic (GITHUB_COPILOT_API_URL).
tools/copilot-test/ENTERPRISE-COPILOT-TEST.md — provisioning + run + decision-matrix runbook (Copilot Business).
headroom/copilot_auth.py — optional, env-gated outbound capture (HEADROOM_COPILOT_DEBUG_OUTBOUND). No-op unless enabled. Records only host + URL + fixed credential labels (scheme + token type prefix); no token bytes and no request headers are written or logged. (Integration-id / editor-version are surfaced by the doctor's reconstruction.)
tests/test_copilot_auth.py — adds two tests: the capture never emits token bytes, and it is a no-op when the env var is unset.
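
To make the capture contract above concrete, here is a minimal sketch of an env-gated, no-op-by-default hook. It is not the code in headroom/copilot_auth.py: the function names, the JSON-lines sink, the label table, and the choice to treat any non-empty value of HEADROOM_COPILOT_DEBUG_OUTBOUND as "enabled" are all assumptions made for illustration.

```python
import json
import os
from urllib.parse import urlparse

DEBUG_ENV = "HEADROOM_COPILOT_DEBUG_OUTBOUND"  # env var named in this PR

# Hypothetical label table: values are constant strings, so nothing derived
# from the token's own bytes is ever written.
_TOKEN_LABELS = {"gho_": "gho_***", "tid_": "tid_***"}


def credential_label(token: str) -> str:
    """Pick a fixed label for the token type; never returns a token slice."""
    for prefix, label in _TOKEN_LABELS.items():
        if token.startswith(prefix):
            return label
    return "unknown"


def capture_outbound(url: str, scheme: str, token: str, sink_path: str) -> bool:
    """Append one record to sink_path if capture is enabled, else do nothing.

    Returns True when a record was written. Assumes any non-empty value of
    the env var enables capture (the real gate may differ).
    """
    if not os.environ.get(DEBUG_ENV):
        return False
    record = {
        "host": urlparse(url).hostname,
        "url": url,
        # Scheme is also reduced to a constant label.
        "scheme": "Bearer" if scheme.lower() == "bearer" else "other",
        "token_type": credential_label(token),
    }
    with open(sink_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return True
```

The two properties the new tests check map directly onto this shape: with the variable unset the function returns before touching the sink, and with it set the record contains only the host, the URL, and constant labels.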

Testing

Unit tests pass (pytest) — touched Copilot paths
Linting passes (ruff check)
Type checking passes (mypy headroom/copilot_auth.py)
New tests added for new functionality
Manual testing performed

Test Output

$ ruff check headroom/copilot_auth.py tools/copilot-test/ tests/test_copilot_auth.py
All checks passed!

$ mypy headroom/copilot_auth.py
Success: no issues found in 1 source file

$ pytest tests/test_copilot_auth.py tests/test_proxy_copilot_auth_hooks.py -q
58 passed in 1.64s

Real Behavior Proof

Environment: macOS, GitHub Copilot CLI logged in, non-entitled seat (gpt-4o only), host api.githubcopilot.com.
Exact command / steps: .venv/bin/python tools/copilot-test/copilot_doctor.py then .venv/bin/python tools/copilot-test/enterprise_proxy_test.py (run from repo root).
Observed result: gpt-4o returns 200 through the proxy; gpt-5.5 and claude-sonnet-4.6 return 400 "not supported" on both /chat/completions and /responses; the same 400s reproduce hitting api.githubcopilot.com directly with no Headroom in the path → upstream entitlement, not a Headroom bug. Outbound capture confirmed host=api.githubcopilot.com integration-id=vscode-chat and the credential is reduced to Bearer / gho_*** (no token bytes).
Not tested: an actual entitled Copilot Business/Enterprise seat (this PR exists to recruit a tester for exactly that); Windows/Linux credential discovery in this run.

Review Readiness

I have performed a self-review
This PR is ready for human review

Checklist

My code follows the project's style guidelines
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation (the runbook)
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have updated the CHANGELOG.md if applicable

Screenshots (if applicable)

N/A — CLI/diagnostic output is shown above.

Additional Notes

CodeQL findings addressed: the diagnostic now emits only a credential's non-secret type prefix (gho_/tid_) + length via constant literals — never a token slice — so no clear-text token data reaches any file or log sink. The doctor's host-type check uses parsed urlparse(...).hostname + endswith(".ghe.com") instead of a substring match.
CHANGELOG not updated: dev tooling + an env-gated debug hook, no user-facing runtime behavior change. Flagging for maintainer preference.
📣 Calling for testers (Copilot Business/Enterprise): from a checkout of this branch, on the machine where Copilot is logged in:
```
GITHUB_COPILOT_API_URL=https://api.business.githubcopilot.com .venv/bin/python tools/copilot-test/copilot_doctor.py
GITHUB_COPILOT_API_URL=https://api.business.githubcopilot.com .venv/bin/python tools/copilot-test/enterprise_proxy_test.py
```
Paste the output back (no secrets). Decision matrix is in the runbook: premium ✅ native but ❌ through-proxy = Headroom bug; ❌ in both = org policy/entitlement; 403 = SSO authorization needed.
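
For testers who want to interpret their own output, the decision matrix can be sketched as a small function. This is an illustration only, not code from the test kit: `Verdict` and `classify` are hypothetical names, "success" is taken to mean HTTP 200, and the two cases the matrix does not mention (both succeed, or only the proxy succeeds) are assumptions labeled as such.

```python
from enum import Enum


class Verdict(Enum):
    WORKING = "premium model flows through Headroom"   # assumption: not in the matrix
    HEADROOM_BUG = "Headroom bug"
    ENTITLEMENT = "org policy/entitlement"
    SSO_NEEDED = "SSO authorization needed"
    INCONCLUSIVE = "not covered by the matrix"          # assumption


def classify(native_status: int, proxy_status: int) -> Verdict:
    """Map HTTP statuses (direct to GitHub vs. through Headroom) to a verdict."""
    # 403 takes priority: the matrix treats it as an SSO authorization problem.
    if 403 in (native_status, proxy_status):
        return Verdict.SSO_NEEDED
    native_ok = native_status == 200
    proxy_ok = proxy_status == 200
    if native_ok and proxy_ok:
        return Verdict.WORKING
    if native_ok and not proxy_ok:
        return Verdict.HEADROOM_BUG
    if not native_ok and not proxy_ok:
        return Verdict.ENTITLEMENT
    # Proxy succeeds while the direct call fails: the runbook matrix as
    # quoted here does not define this case.
    return Verdict.INCONCLUSIVE
```

For example, the non-entitled run described under Real Behavior Proof (400 direct, 400 through the proxy) would classify as `Verdict.ENTITLEMENT`.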

Lets a community member with a Copilot Business/Enterprise seat verify that Headroom routes premium models correctly, and cleanly tell a Headroom bug apart from a GitHub entitlement/policy limit (catalog shows models the seat may not be entitled to RUN; the 400 "model not supported" is upstream, not Headroom).

- tools/copilot-test/copilot_doctor.py: read-only diagnostic (where the credential lives, resolved host, the exact outbound request, /models catalog, inference on chat + /responses, and a pass-through contract check). Secret-free output (tokens shown as prefix/kind only).
- tools/copilot-test/enterprise_proxy_test.py: starts a real proxy at a given host and runs premium models THROUGH Headroom; prints a PASS/FAIL matrix.
- tools/copilot-test/ENTERPRISE-COPILOT-TEST.md: provisioning + run + decision matrix runbook (Copilot Business).
- headroom/copilot_auth.py: optional env-gated outbound capture (HEADROOM_COPILOT_DEBUG_OUTBOUND); no-op unless enabled. Logs host + identity headers + token KIND only, never the token.

github-actions · 2026-06-18T17:27:32Z

PR governance

This PR follows the template and is marked ready for human review.

- copilot_auth.py: capture hook now records only fixed credential labels (scheme + type prefix); no token bytes reach the file or log sink (CodeQL clear-text storage/logging of sensitive information).
- copilot_doctor.py: redact() emits only the non-secret token type prefix + length (never a token slice); host-type check parses the URL hostname and uses endswith(".ghe.com") instead of a substring match (CodeQL incomplete URL substring sanitization).
- tests: add capture-never-leaks-token and disabled-by-default coverage.
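
The two doctor changes listed above can be sketched as follows. This is a hedged illustration rather than the code in copilot_doctor.py: the output format of `redact`, the name `is_ghe_host`, and the prefix tuple are assumptions; only the gho_/tid_ prefixes, the prefix + length idea, and the urlparse(...).hostname + endswith(".ghe.com") check come from the PR text.

```python
from urllib.parse import urlparse

# Known non-secret token type prefixes, as named in the PR description.
_TOKEN_PREFIXES = ("gho_", "tid_")


def redact(token: str) -> str:
    """Describe a token by type prefix and length only.

    The prefix in the output is the constant from _TOKEN_PREFIXES, not a
    slice of the token, so no token-derived bytes are emitted.
    """
    for prefix in _TOKEN_PREFIXES:
        if token.startswith(prefix):
            return f"{prefix}*** (len={len(token)})"
    return f"<unknown> (len={len(token)})"


def is_ghe_host(url: str) -> bool:
    """True only when the parsed hostname ends with .ghe.com."""
    hostname = urlparse(url).hostname or ""
    return hostname.endswith(".ghe.com")
```

The hostname check matters because a substring test such as `".ghe.com" in url` would also accept `https://acme.ghe.com.evil.test/` or `https://evil.test/?next=.ghe.com`, whereas `is_ghe_host` rejects both and accepts `https://acme.ghe.com/api`.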

CodeQL taints the whole headers dict (it holds the auth token), so any header- or token-env-derived value reaching a write/log was flagged even though it was redacted. The capture now records only host + URL + constant credential labels; the doctor's env list shows presence only; integration-id / editor-version are surfaced by the doctor's reconstruction instead.

fix(copilot): check token env vars by key presence only (CodeQL clear-text logging)

+head("[1] Environment variables")
+for v in copilot_auth._API_TOKEN_ENV_VARS + copilot_auth._COPILOT_OAUTH_TOKEN_ENV_VARS:
+    # presence by KEY only — never read the value of a token env var
+    print(f"    {v:38s} {'SET' if v in os.environ else 'unset'}")


orty · 2026-06-19T09:46:59Z

Windows Validation Report — PR #1136

Status: ✅ COMPLETE & VERIFIED

Enterprise Copilot subscription feature is fully tested and production-ready on Windows.

Test Summary

All three validation flows completed successfully:

Enterprise API Connectivity ✅
- Successfully connects to api.business.githubcopilot.com
- Authentication working correctly via Windows Credential Manager
- All models tested (gpt-4o, gpt-5.5, claude-sonnet-4.6) respond with HTTP 200
Windows Credential Storage ✅
- Credential stored in Windows Credential Manager with correct schema
- Target format: copilot-cli/https://github.com/<org>:<username>
- Auto-discovery functional and ready for use
CLI Integration ✅
- headroom wrap copilot --subscription flag properly integrated
- Wire-API routing working correctly
- Enterprise endpoint targeting functional

Model Testing Results

Through-proxy inference:
  ✅ gpt-4o               200 via=chat
  ✅ gpt-5.5              200 via=responses
  ✅ claude-sonnet-4.6    200 via=chat

Key Findings

Wire-API auto-selection working: Reasoning models (gpt-5.5) correctly route to /responses endpoint; standard models use /chat
Token compression confirmed: End-to-end token forwarding and compression working across all tested model types
Enterprise tier validated: All models available on enterprise seat responded correctly
Cross-platform coverage: Windows (fully tested), macOS (previously validated), Linux (previously validated)
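
The wire-API split reported above can be pictured with a tiny routing function. How Headroom actually decides which models count as reasoning models is not shown in this thread, so the name-prefix rule below is purely an assumption chosen to reproduce the three observed results; `select_wire_api` is a hypothetical name.

```python
def select_wire_api(model: str) -> str:
    """Return the endpoint path to use for `model` (illustrative rule only)."""
    # ASSUMPTION: reasoning models are recognised by a "gpt-5" name prefix.
    # The real selection logic may use catalog metadata instead.
    if model.lower().startswith("gpt-5"):
        return "/responses"
    return "/chat/completions"
```

Under that assumption, gpt-5.5 maps to /responses while gpt-4o and claude-sonnet-4.6 map to /chat/completions, matching the via= column in the results above.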

Recommendation

Ready for merge and release. Enterprise users can now use headroom wrap copilot --subscription without manual configuration.

Test Environment:

OS: Windows 10.0.26200.8655
Copilot CLI: 1.0.63
Branch: feat/copilot-enterprise-test-kit

JerrettDavis

This PR is not merge-ready because CI is currently red: CodeQL=FAILURE. Please fix or rerun the failing checks after updating from current main.

chopratejas · 2026-06-19T16:55:10Z

Fantastic, @orty! So in essence, this works on your Windows machine with your Enterprise GitHub Copilot license?
Can we confirm that?
Also, do you see token savings?

github-actions Bot added the "status: needs author action" label (pull request body or readiness checklist still needs author updates) Jun 18, 2026

github-advanced-security AI found potential problems Jun 18, 2026


style(copilot): ruff format the new capture tests

5f9b8f4

github-actions Bot added "status: ready for review" (pull request body is complete and the author marked it ready for human review) and removed "status: ci failing" (required or reported CI checks are failing) labels Jun 18, 2026

github-advanced-security AI found potential problems Jun 18, 2026


Comment thread headroom/copilot_auth.py Fixed

github-advanced-security AI found potential problems Jun 18, 2026


Comment thread tools/copilot-test/copilot_doctor.py Fixed

github-actions Bot added "status: ci failing" (required or reported CI checks are failing) and removed "status: ready for review" (pull request body is complete and the author marked it ready for human review) labels Jun 18, 2026

fix(copilot): check token env vars by key presence only (CodeQL clear-text logging)

c12d47a

github-actions Bot added "status: ready for review" (pull request body is complete and the author marked it ready for human review) and removed "status: ci failing" (required or reported CI checks are failing) labels Jun 18, 2026

github-advanced-security AI found potential problems Jun 18, 2026


JerrettDavis requested changes Jun 19, 2026


github-actions Bot added "status: ci failing" (required or reported CI checks are failing) and removed "status: ready for review" (pull request body is complete and the author marked it ready for human review) labels Jun 19, 2026

chopratejas wants to merge 5 commits into main from feat/copilot-enterprise-test-kit


4 participants
