fix(ccr): protect headroom_retrieve outputs on the chat-completions path by maxsturmb · Pull Request #1176 · headroomlabs-ai/headroom

maxsturmb · 2026-06-19T22:19:44Z

Description

The Chat-Completions path does not protect headroom_retrieve tool outputs from
re-compression, so CCR retrieval is not actually reversible there.

The Responses path already guards this — it collects the call_ids of headroom_retrieve
function calls and skips compressing their outputs (proxy/handlers/openai.py, ~L766–837,
reason headroom_retrieve_output_protected). handle_openai_chat has no equivalent.

Effect: a model calls headroom_retrieve, the tool layer returns the expanded original as a
role:"tool" message, and on the next turn the proxy compresses that tool output again —
turning the retrieved original straight back into a <<ccr:...>> marker (or a lossy rewrite).
From the model's side, retrieval returns a marker instead of content: marker-in / marker-out.
compress_unit_with_router's existing _CCR_MARKER_RE handling does not catch it, because a
freshly-retrieved original contains no marker.

Closes #1077

Type of Change

Bug fix (non-breaking change that fixes an issue)
New feature (non-breaking change that adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation update
Performance improvement
Code refactoring (no functional changes)

Changes Made

Mirror the Responses-path guard on the chat path, keyed by tool_call_id (survives reordering):

_headroom_retrieve_tool_call_ids(messages) — find the tool_call_ids of assistant
tool_calls named headroom_retrieve (or *__headroom_retrieve, covering MCP namespacing).
capture_headroom_retrieve_outputs(messages) — before compression, snapshot the pristine
content of every role:"tool" message whose tool_call_id belongs to such a call.
restore_headroom_retrieve_outputs(messages, protected) — after compression, restore those
messages in place (matched by tool_call_id). Returns the count restored.
Two-line wiring in handle_openai_chat: capture immediately before the CompressionDecision
runs, restore right after the INPUT_COMPRESSED event.

Non-retrieve tool outputs stay fully compressible — only headroom_retrieve results are protected.

Files: headroom/proxy/handlers/openai.py (+84), tests/test_ccr_chat_retrieve_protection.py (+88).

Testing

Unit tests pass (pytest)
Linting passes (ruff check .) — ruff not available in the verification env (see Additional Notes)
Type checking passes (mypy headroom) — mypy not available in the verification env (see Additional Notes)
New tests added for new functionality
Manual testing performed

Test Output

$ pytest tests/test_ccr_chat_retrieve_protection.py -q
collected 7 items
tests/test_ccr_chat_retrieve_protection.py .......                       [100%]
========================= 7 passed, 1 warning in 0.13s =========================

Real Behavior Proof

Environment: Headroom v0.26.0, branch fix/ccr-chat-retrieve-reentrancy, Python 3.13; live A/B against the running proxy with a recording mock upstream (this deployment serves only /v1/chat/completions).
Exact command / steps: craft a request whose history holds a headroom_retrieve assistant tool call plus its ~59.8 KB role:"tool" output, and an identical-content non-retrieve tool output as a control; send it through the proxy and diff the body forwarded upstream byte-for-byte against the original. Unit side: pytest tests/test_ccr_chat_retrieve_protection.py (7 passed) plus an 11/11 router round-trip.
Observed result: unpatched, the retrieve output was forwarded as 59787 -> 819 bytes (mangled); patched, it is forwarded 59787 -> 59787 bytes byte-identical with timestamps intact, while the identical-content non-retrieve control still compresses to 819 bytes (protection is scoped, not blanket).
Not tested: streaming responses (the affected path operates on request-side message history, not the response stream); and a host agent's own context compression re-compressing the retrieved output in very long sessions (out of scope for this proxy fix).

Review Readiness

I have performed a self-review
This PR is ready for human review

Checklist

My code follows the project's style guidelines
I have performed a self-review of my code
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
My changes generate no new warnings
I have added tests that prove my fix is effective or that my feature works
New and existing unit tests pass locally with my changes
I have updated the CHANGELOG.md if applicable

Additional Notes

The fix deliberately mirrors the existing, accepted Responses-path protection rather than
introducing a new mechanism, to keep behaviour and reasoning consistent across both API formats.
ruff / mypy were not present in the verification environment, so those two boxes are left
unchecked rather than claimed; the change is small, type-annotated, and follows the surrounding
style of openai.py. Happy to paste linter/type-check output if a maintainer points me at the
expected toolchain.
No CHANGELOG.md / docs entry added — flagging as N/A; will add on request if the project expects one.

The Responses path tracks retrieve call_ids and skips compressing their outputs (proxy/handlers/openai.py ~L766-837). The Chat-Completions path had no equivalent, so a retrieved original — the expanded content the model explicitly asked for — got re-compressed back into a <<ccr:...>> marker on the next turn (marker-in / marker-out: retrieval never returned content). Capture pristine headroom_retrieve outputs (keyed by tool_call_id) before compression and restore them after, mirroring the Responses-path guard. Non-retrieve tool outputs stay compressible. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

github-actions · 2026-06-19T22:19:56Z

PR governance

This PR follows the template and is marked ready for human review.

JerrettDavis approved these changes Jun 22, 2026

View reviewed changes

style: format CCR chat retrieve protection test

5516e40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(ccr): protect headroom_retrieve outputs on the chat-completions path#1176

fix(ccr): protect headroom_retrieve outputs on the chat-completions path#1176
maxsturmb wants to merge 2 commits into
headroomlabs-ai:mainfrom
maxsturmb:fix/ccr-chat-retrieve-reentrancy

maxsturmb commented Jun 19, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 19, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

maxsturmb commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

Type of Change

Changes Made

Testing

Test Output

Real Behavior Proof

Review Readiness

Checklist

Additional Notes

Uh oh!

github-actions Bot commented Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR governance

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

maxsturmb commented Jun 19, 2026 •

edited

Loading

github-actions Bot commented Jun 19, 2026 •

edited

Loading