
Compaction lifecycle: harden two non-blocking edges from #63 (stuck active state; future double-rotation) #66

Description

@lsaether

Follow-up to #63 (merged in #65). Two known, non-blocking edges deferred at merge time, plus the broader durability direction. Neither is reachable as a correctness bug today; this issue tracks them so they don't live only in PR review.

1. started with no done leaves compaction.active = true indefinitely

amux/context_compaction_started flips compaction_state.active = true; only a matching amux/context_compaction_done clears it. If Hermes crashes or is killed mid-compaction (or the done stderr line is dropped under flood — see lossy-pump note in src/agent/process.rs), active stays true until the next done or an amux restart.

  • Impact: cosmetic — a stuck "compacting…" indicator in session/attach snapshots and /debug/sessions. No count or segment corruption.
  • Self-heals: next done, or restart (hydration rebuilds from the persisted log).
  • Possible fix: clear active (a) when a turn settles, (b) on agent death/AgentDied, or (c) after a bounded timeout. Option (b) is probably the cleanest signal — a compaction can't still be in flight once the child is gone.
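A minimal sketch of options (b) and (c), to make the shape of the fix concrete. Everything here is a hypothetical stand-in, not the real amux types: `CompactionState`, `LifecycleEvent`, `apply`, and `expire_if_stale` are illustrative names, and only `AgentDied` and the `active` flag come from the description above.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the per-session compaction state.
#[derive(Debug, Default)]
pub struct CompactionState {
    pub active: bool,
    pub count: u64,
    /// When the current compaction started; used only for the timeout path.
    started_at: Option<Instant>,
}

/// Hypothetical event set; only `AgentDied` is named in this issue.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LifecycleEvent {
    CompactionStarted,
    CompactionDone,
    AgentDied,
}

impl CompactionState {
    pub fn apply(&mut self, event: LifecycleEvent, now: Instant) {
        match event {
            LifecycleEvent::CompactionStarted => {
                self.active = true;
                self.started_at = Some(now);
            }
            LifecycleEvent::CompactionDone => {
                self.active = false;
                self.started_at = None;
                // The count moves only on `done`, as it does today.
                self.count += 1;
            }
            LifecycleEvent::AgentDied => {
                // Option (b): a compaction cannot still be in flight once
                // the child is gone, so clear the flag without counting.
                self.active = false;
                self.started_at = None;
            }
        }
    }

    /// Option (c): clear a compaction that has been active for at least
    /// `max_age`. Returns true if the flag was cleared by this call.
    pub fn expire_if_stale(&mut self, now: Instant, max_age: Duration) -> bool {
        match self.started_at {
            Some(t) if self.active && now.saturating_duration_since(t) >= max_age => {
                self.active = false;
                self.started_at = None;
                true
            }
            _ => false,
        }
    }
}
```

In this sketch neither clearing path touches `count`, so a crash mid-compaction stays a display-only event, matching the "no count or segment corruption" note above.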

2. Future double-rotation when Hermes ships _meta.hermes AND stderr is enabled

Today only the stderr scraper rotates segments for Hermes compaction (Hermes doesn't emit _meta.hermes yet, so detect_segment_signal_from_agent_notification is dormant). When Hermes starts shipping structured _meta.hermes.sessionProvenance, both detectors could fire for the same compaction.

  • No double-count: compaction_count only increments on the stderr done; the stdout _meta.hermes path rotates segments but never touches the count. Verified intentional (acceptance criterion of #63, "Capture Hermes stderr compaction signals and expose AMUX events").
  • Risk: a spurious extra segment (stderr rotates to seg-N+1 with the old hermes session id in provenance, then the stdout metadata observes a new hermes id and rotates again to seg-N+2).
  • Direction: when structured signals land, gate the stderr scraper off (or make it a labeled fallback). The source: "hermes_stderr" field on amux/context_compaction_* already exists to distinguish the two paths. The robust long-term fix is Hermes emitting a machine-readable compaction/head-change signal so we stop scraping logs entirely.
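One way the "gate the stderr scraper off" direction could look, as a sketch under stated assumptions. `SegmentRotator`, `SignalSource`, `structured_enabled`, and `on_compaction_signal` are hypothetical names for illustration; the only behavior taken from this issue is that the stderr done increments compaction_count and the structured path rotates without touching it.

```rust
/// Which detector observed the compaction (mirrors the idea behind the
/// existing `source: "hermes_stderr"` field; the enum itself is hypothetical).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SignalSource {
    HermesStderr,
    HermesMeta,
}

/// Hypothetical stand-in for whatever owns segment rotation.
#[derive(Debug)]
pub struct SegmentRotator {
    pub segment: u32,
    pub compaction_count: u32,
    /// Assumed to be known up front (config or capability negotiation).
    structured_enabled: bool,
}

impl SegmentRotator {
    pub fn new(structured_enabled: bool) -> Self {
        Self { segment: 0, compaction_count: 0, structured_enabled }
    }

    /// Handles one compaction signal. Returns true if a segment was rotated.
    pub fn on_compaction_signal(&mut self, source: SignalSource) -> bool {
        match source {
            SignalSource::HermesMeta => {
                // Structured path: rotates, never touches the count.
                self.segment += 1;
                true
            }
            SignalSource::HermesStderr => {
                // The stderr `done` remains the only place the count moves.
                self.compaction_count += 1;
                if self.structured_enabled {
                    // Gated off: rotation is left to the structured path.
                    false
                } else {
                    // Fallback: stderr is the only rotation signal available.
                    self.segment += 1;
                    true
                }
            }
        }
    }
}
```

The gate is a flag fixed at construction rather than "have we seen a structured signal yet", because stderr can arrive first and a seen-flag would still produce one spurious segment on the first compaction.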

3. (Context) stderr/stdout ordering is racy by construction

Not action-required, recorded for completeness. The done line (stderr) and post-compaction frames (stdout) feed one mpsc from two tasks, so a couple of frames adjacent to the boundary can land in the neighbor segment. Bounded, no frame loss/corruption, and already de-risked by the cross-segment turn-bookend carry logic. The structured-metadata path (item 2) rotates inline and exact, which is the real resolution.
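For readers unfamiliar with why this is racy by construction, here is a small standalone illustration. It uses std threads and `std::sync::mpsc` as an assumption for self-containment (the real pump in amux may well use async tasks and a different channel), and `Stream` and `merge_streams` are hypothetical names.

```rust
use std::sync::mpsc;
use std::thread;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Stream {
    Stdout,
    Stderr,
}

/// Two producers feed one channel. Returns items in the order the single
/// consumer would observe them.
pub fn merge_streams(stdout_frames: u32, stderr_lines: u32) -> Vec<(Stream, u32)> {
    let (tx_out, rx) = mpsc::channel();
    let tx_err = tx_out.clone();

    let out = thread::spawn(move || {
        for i in 0..stdout_frames {
            tx_out.send((Stream::Stdout, i)).expect("receiver is alive");
        }
    });
    let err = thread::spawn(move || {
        for i in 0..stderr_lines {
            tx_err.send((Stream::Stderr, i)).expect("receiver is alive");
        }
    });

    out.join().expect("stdout producer panicked");
    err.join().expect("stderr producer panicked");

    // Both senders were dropped when their threads finished, so this
    // iterator drains the buffered items and then ends.
    rx.iter().collect()
}
```

Each producer's own order is preserved and nothing is lost, but the relative position of a stderr line among the stdout frames depends on scheduling. That is the same bound described above: frames next to the boundary may land on either side of it.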

Labels: enhancement
