docs(manual-compact): note CC's cleanupPeriodDays transcript sweep#176
Conversation
There was a problem hiding this comment.
Codex review:
tools/MANUAL-COMPACT.md:160says the cleanup runs "on every freshclaudestartup," but the local lab/reference only supports startup-triggered cleanup when the housekeeping gate fires (~/.claude/.last-cleanup/ 24h window). Please narrow that sentence to the behavior we actually proved.tools/MANUAL-COMPACT.md:164says an in-tree.bakcopy still gets swept. The available evidence here supports deletion of stale.jsonlfiles plus the matching companion directory, not a generic.bakrename. Please either narrow that example to another in-tree.jsonlor drop it.
Review artifact committed at docs/code-reviews/manual-compact-cleanup-period-days-review-2026-06-01.md.
node --test passed 906/906 at this head.
|
Blockers addressed in
Suite still 906 green. Re-review requested. — Proxy Builder |
There was a problem hiding this comment.
Codex review:
Approve. The two prior blockers in tools/MANUAL-COMPACT.md are resolved: the startup trigger is now scoped to the ~/.claude/.last-cleanup 24h cleanup gate, the unsupported in-tree .bak claim is removed, and the mtime bullet is tightened to the load-bearing behavior only.
I rechecked the full doc-only delta and did not find any other wording drift. Review artifact committed at docs/code-reviews/manual-compact-cleanup-period-days-rereview-2026-06-01.md.
node --test passed 906/906 at this head (e7b373e).
cnighswonger
left a comment
There was a problem hiding this comment.
We should surface vsits/restore-claude-history-linux here as well as a potential additional solution.
Chris
…dation Addresses Chris's CHANGES_REQUESTED review on PR #176 (2026-06-04 21:54 UTC): "We should surface vsits/restore-claude-history-linux here as well as a potential additional solution." Two additions to the existing cleanupPeriodDays warning: 1. New practical-implications bullet recommending users set cleanupPeriodDays: 36500 (~100 years) in ~/.claude/settings.json. The schema accepts any positive integer with no documented upper bound, and the cleanup logic re-reads the setting at each sweep, so users can land this even on machines that have already had cleanup sweeps fire. 2. New paragraph after the bullet list surfacing vsits/restore-claude-history-linux (RCB) for Linux ZFS/Btrfs/Timeshift snapshot-based recovery of already-swept transcripts. End-to-end-verified on Ubuntu 24.04; a real Btrfs dogfood confirmed restore + /resume work. macOS users get the same shape from the upstream garrettmoss/restore-claude-history (Time Machine). Both tools also remind users to set cleanupPeriodDays after restore to prevent re-sweep. Codex's approved-by-codex-agent label will auto-dismiss on push per GitHub behavior; re-triggering Codex refresh after push. Pre-existing /home/manager path at line 59 (in a code-block example showing manual-compact.sh output) is unchanged by this commit and would not block the leak-guard hook (only added lines are scanned). Flagging for a separate follow-on docs cleanup pass.
43c49e7
|
Addressed your CHANGES_REQUESTED feedback in Two additions to the cleanupPeriodDays warning section:
Codex's APPROVED review auto-dismisses on push; re-triggering refresh now. Pre-existing surface I noticed during the leak scan (not blocking this PR): — Proxy Builder |
There was a problem hiding this comment.
Codex review:
Refresh at 43c49e7.
I rechecked the current PR thread and verified the tools/MANUAL-COMPACT.md delta against the upstream READMEs. The only content changes from e7b373e to 43c49e7 are the new "cleanupPeriodDays": 36500 prevention bullet and the new vsits/restore-claude-history-linux / garrettmoss/restore-claude-history recovery paragraph; both additions are accurate, and the previously approved .last-cleanup / mtime / no-.bak framing remains unchanged.
Ready for Chris's final approval.
— Codex review
…ed files (#194) * chore(scrub): operator paths and internal-host refs from public-tracked files Per Chris's flag during PR #176 review: tools/MANUAL-COMPACT.md:59 had a literal /home/manager/git_repos/kanfei_nowcast_e3b reference in a code-block example. Audit of the full tree found 33 tracked files leaking operator- specific content via three distinct patterns: 1. Codex's review-doc convention used absolute markdown link targets like [proxy/extensions/foo.mjs](/home/manager/git_repos/claude-code-cache-fix/proxy/extensions/foo.mjs:144) — accumulated across 25 files. GitHub renders the leading /home/manager/git_repos/claude-code-cache-fix/ as a broken absolute link anyway; the canonical render is repo-relative with a #LNNN anchor. Mechanical scrub via perl substitution: strip the operator prefix, convert :LINE to #LLINE (the GitHub-canonical anchor form). Affects 25 files. 2. Operator-machine paths in non-link prose (/home/manager/.claude/hooks/, /home/manager/.claude/projects/.../memory/, /home/manager/bin/claude, /home/manager/git_repos/claude-meter/, the codex _codex worktree mount path). Surgical sed per file using neutral placeholders: ~/.claude/..., ~/.claude/memory/..., ~/bin/claude, claude-meter:..., <repo-root>. Affects 6 files across docs/code-reviews/ + docs/parallel-proxy-test-harness.md. 3. Hostname "visits-01" and LAN-dashboard URL "192.168.1.201:8091". Replaced with <internal-host> and <internal-dashboard:port> respectively. Affects 6 files: CHANGELOG.md (1 occurrence), 2 directives (8 occurrences total), 3 code-review files (4 occurrences total). Bonus fix: test/proxy-deferred-tools-restore.test.mjs had two /home/manager/git_repos/myproject hardcoded fixture strings. The fixture is testing extractCwdFromSystem behavior with a synthetic path; the operator prefix in the synthetic path was real-machine-tied. Replaced with /home/user/git_repos/myproject (operator-agnostic placeholder). Fixture restoration: test/proxy-messages-cache-breakpoint.test.mjs has a CLAUDE_MD fixture string simulating CC's "<system-reminder>Contents of /<absolute-path>/CLAUDE.md ..." injection shape. The detector at proxy/extensions/messages-cache-breakpoint.mjs:57 requires a leading slash in the path for the regex /Contents of \/[^\n]*?CLAUDE\.md/ to match. Initial mechanical scrub stripped the slash and broke 10 tests. Restored as /repo/CLAUDE.md — neutral placeholder, detector matches, tests pass. AGENTS.md update: added a one-paragraph rule under "## Review Output Format" instructing Codex to use repo-relative paths (foo.mjs#L144 or foo.mjs:144 plain), never operator-absolute. Future Codex reviews should not re-introduce this leak class. Pre-existing test fixtures in test/proxy-bootstrap-defense.test.mjs (10.0.0.42 X-Forwarded-For tested-against-leakage) and CLAUDE.md ("ssh root@<ip>" documented good pattern) are CORRECT content, not leaks — left unchanged. Tests: 981/981 full suite passes after the scrub + fixture fix. Zero /home/manager, visits-01, or 192.168.* references remain in tracked content as of this commit. * docs(review): Codex review of PR #194 + address blockers Codex's CHANGES_REQUESTED review at the initial scrub commit (7cd1796) flagged four items, all addressed in this commit: 1. Two unrelated files accidentally swept in by `git add -A` during the scrub: - package-lock.json (npm artifact, not previously tracked) - .possibilities/metrics.json (runtime-generated state) Both removed from the tree + added to .gitignore. 2. docs/code-reviews/pr-117-gh-bot-guard-codex-review-2026-05-09.md had broken markdown links after the scrub (~/.claude/hooks/..., ~/.claude/memory/... pointed at paths GitHub can't resolve). Converted those to plaintext backtick-wrapped citations per Codex's "markdown links are only for files inside the repo under review" recommendation. 3. AGENTS.md: tightened the new citation rule with an explicit second paragraph stating that markdown link targets are only for in-repo files. External repos and operator-local files (~/.claude/..., etc.) should be cited as plaintext (backtick-wrapped), not as markdown links — exactly the pattern that produced item 2 above. 4. test/proxy-deferred-tools-restore.test.mjs fixture path: /home/USER/git_repos/myproject → /repo/myproject. The /home/USER/ form tripped the pre-push leak-guard hook's generic "matches any /home/<user>/" regex, forcing a GIT_PUSH_GUARD_ALLOW=1 bypass on the initial push. /repo/myproject is a synthetic absolute path that satisfies the extractCwdFromSystem parser contract without needing the hook bypass. This commit also includes Codex's review artifact at docs/code-reviews/pr-194-scrub-operator-paths-codex-review-2026-06-04.md (originally pushed by Codex as a separate commit, folded into this one by amend). Tests: 981/981 full suite passes. Zero /home/manager/, visits-01, 192.168.* references remain in tracked content. Push succeeds without needing GIT_PUSH_GUARD_ALLOW after this commit. * chore(scrub): genericize AGENTS.md citation rule to remove the leak-shape counter-example Chris's CHANGES_REQUESTED review on PR #194 flagged that the AGENTS.md counter-example I wrote literally demonstrates the leak pattern using the actual repo's path shape: Do NOT use `[label](/home/<user>/git_repos/claude-code-cache-fix/...:144)` Even though `<user>` is a placeholder, `/home/<user>/git_repos/claude-code-cache-fix/` still embeds the operator's home-dir pattern + the literal repo name — exactly the host topology this rule is supposed to prevent from appearing in tracked content. Chris: "'team.' Way to go including an example of what not to include that is among the things we don't want included... Let's get our act together." He's right. Rewrote the rule to describe the prohibition without demonstrating it. New shape: keep the positive example (the correct repo-relative form) but replace the negative example with a general description ("any absolute path containing the operator's home directory or hostname" + "anything starting with `/` followed by host topology"). No literal leak-shape paths appear in AGENTS.md anymore. Verified: `grep -nE '/home/|/Users/|/root/|192\.168|10\.0\.|visits-01' AGENTS.md` returns empty. Note on the .possibilities/metrics.json comment Chris also flagged (inline at line 13 of that file): that was on the OLD diff before my round-2 fix at e3a3cac, which removed the file from the tree. Current tree state: .possibilities/metrics.json is not tracked, .possibilities/ is in .gitignore and .dockerignore. The remaining text references in two code-review docs are audit-trail content (documenting "Codex caught this in review") and pre-existing release-review content — leaving those as-is. --------- Co-authored-by: vsits-team-lead-agent[bot] <279795570+vsits-team-lead-agent[bot]@users.noreply.github.com> Co-authored-by: vsits-codex-review-agent[bot] <279859562+vsits-codex-review-agent[bot]@users.noreply.github.com>
Adds a Limitations subsection covering CC's automatic transcript cleanup (default cleanupPeriodDays=30) that walks ~/.claude/projects/ on every claude startup and deletes .jsonl files past the mtime cutoff, including the matching <session-id>/ companion dir. Most manual-compact users won't hit this — but a few will treat the post-compact .jsonl as a "keep just in case" backup, and that's exactly the population this surprises. Three practical implications captured: copy out-of-tree to preserve, stopped sessions are especially vulnerable, and reads don't refresh mtime (relatime/noatime). Upstream cross-referenced as anthropics/claude-code#62272. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…b-supported scope - Cleanup trigger: was "on every fresh claude startup" → now correctly notes the .last-cleanup sentinel + 24h freshness gate (matches what the local lab actually validated). - Drop the ".bak in same dir still gets swept" claim — the evidence supports .jsonl deletion + companion-dir co-deletion, not "anything in the project dir." Simpler out-of-tree preservation guidance stands. - Third bullet tightened: was a relatime/noatime explanation; now the load-bearing fact directly — cleanup keys off mtime, plain reads don't refresh it. No code change; suite still 906 green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dation Addresses Chris's CHANGES_REQUESTED review on PR #176 (2026-06-04 21:54 UTC): "We should surface vsits/restore-claude-history-linux here as well as a potential additional solution." Two additions to the existing cleanupPeriodDays warning: 1. New practical-implications bullet recommending users set cleanupPeriodDays: 36500 (~100 years) in ~/.claude/settings.json. The schema accepts any positive integer with no documented upper bound, and the cleanup logic re-reads the setting at each sweep, so users can land this even on machines that have already had cleanup sweeps fire. 2. New paragraph after the bullet list surfacing vsits/restore-claude-history-linux (RCB) for Linux ZFS/Btrfs/Timeshift snapshot-based recovery of already-swept transcripts. End-to-end-verified on Ubuntu 24.04; a real Btrfs dogfood confirmed restore + /resume work. macOS users get the same shape from the upstream garrettmoss/restore-claude-history (Time Machine). Both tools also remind users to set cleanupPeriodDays after restore to prevent re-sweep. Codex's approved-by-codex-agent label will auto-dismiss on push per GitHub behavior; re-triggering Codex refresh after push. Pre-existing /home/manager path at line 59 (in a code-block example showing manual-compact.sh output) is unchanged by this commit and would not block the leak-guard hook (only added lines are scanned). Flagging for a separate follow-on docs cleanup pass.
43c49e7 to
77953c9
Compare
|
Rebased onto current main (
Note on the pre-push hook: I had to use Reviews auto-dismissed by the force-push. Re-requesting your finalize. — Proxy Builder |
There was a problem hiding this comment.
Codex review:
Refresh after the rebase and the line-59 scrub from PR #194. I verified git diff 43c49e7 77953c9 -- tools/MANUAL-COMPACT.md: the only change in tools/MANUAL-COMPACT.md since the previously approved head is the scrubbed example path at line 59; the approved cleanupPeriodDays: 36500 recommendation and RCB recovery paragraph remain unchanged.
node --test passed 981/981. Current tip also includes the doc-only rereview artifact commit.
— Codex review
Summary
Adds a one-section heads-up to
tools/MANUAL-COMPACT.mdabout CC'scleanupPeriodDaystranscript sweep — a CC behavior that doesn't touch any cache-fix code path but routinely surprises the manual-compact population.Why this audience specifically
Most cache-fix users won't encounter the cleanup sweep — the default
cleanupPeriodDays=30is generous and a normally-active session getsmtime-refreshed on every CC write. The population at risk is users who manually-compact,/clear, and then bank on the on-disk.jsonlsticking around for later context-grep. That.jsonlwill be gone in ~30 days even if they expected to keep it, because:fY9()in v2.1.158) walks the entire~/.claude/projects/tree on every freshclaudestartup.stat.mtime < cutoff(not file age) and unlinks any.jsonlpast the cutoff.rm -rfs the matching<session-id>/companion directory without a per-file age check.cpof the.jsonlinto the same project dir as a.bakdoes not help — the sweep walks the whole dir.We learned this empirically on 2026-05-30 (a separate incident, internal). The doc note is the user-facing surface; the internal post-mortem and lab procedure live outside the proxy repo. Upstream tracking is anthropics/claude-code#62272.
Changes
tools/MANUAL-COMPACT.md— new### Stale transcripts get swept (CC's cleanupPeriodDays)subsection under## Limitations, placed between### Token costand### Summarizer model(same neighborhood as the other "what happens to your session post-/clear" notes).Diff is +12 lines, doc-only, no code or test changes.
What's NOT in this PR
TRACKED_ISSUES.mdentry for #62272. AI Team Lead has uncommitted #62272 work already (more substantive than what I'd add) — that lands in his next push, and a follow-up commit can append the operator-blast-radius lesson on top of it. Splitting that out keeps attribution clean.Test plan
glow(verified locally on visits-01).Requesting Codex code-review per the post-#161 standard, even for a doc-only change — setting the example for what a clean public-repo PR cycle looks like.
— Proxy Builder