fix(rooms): clear stranded compaction active state at turn settlement #67
Merged
`compaction_state.active` is set by a Hermes `context compression started` stderr line and cleared by the matching `done`. If the `done` never arrives — the compaction failed, or its line was dropped by the lossy stderr pump — the flag stayed `true` forever, so /debug/sessions and session/attach snapshots reported a perpetual "compacting" state to late joiners (issue #66, item 1). Hermes compacts within a prompt turn, so by the time that turn's response settles any compaction it triggered has completed or been abandoned. Clear the stranded transient flag at turn settlement — the same signal a live client uses to clear its own "compacting" affordance (amux/turn_complete). Only the transient active/pending fields reset; compaction_count and the last_* history are durable and untouched. No frame is emitted (live clients already clear on amux/turn_complete; this only realigns snapshot state for future attachers). Tests: stranded `started` with no `done` clears on turn settlement without fabricating a count; a completed compaction is unaffected. Refs #66.
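The live-path clear described above can be sketched as follows. This is a minimal illustration, not the code from this branch: the struct layout, the method names, and the `last_completed_at_ms` field (a stand-in for the `last_*` history) are assumptions; only `active`, `pending_hermes_session_id`, and `compaction_count` are names taken from the PR.

```rust
/// Hypothetical shape of the compaction snapshot state.
/// Field types are assumptions made for this sketch.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct CompactionState {
    /// Transient: true between `started` and `done`.
    pub active: bool,
    /// Transient: session id announced by `started`.
    pub pending_hermes_session_id: Option<String>,
    /// Durable: number of completed compactions.
    pub compaction_count: u64,
    /// Durable: stand-in for the `last_*` history fields.
    pub last_completed_at_ms: Option<u64>,
}

impl CompactionState {
    /// A `context compression started` line was observed.
    pub fn on_started(&mut self, session_id: &str) {
        self.active = true;
        self.pending_hermes_session_id = Some(session_id.to_owned());
    }

    /// The matching `done` line was observed.
    pub fn on_done(&mut self, now_ms: u64) {
        self.active = false;
        self.pending_hermes_session_id = None;
        self.compaction_count += 1;
        self.last_completed_at_ms = Some(now_ms);
    }

    /// The prompt turn settled. If a compaction is still marked active,
    /// its `done` never arrived, so reset only the transient fields.
    /// Returns true when a stranded flag was actually cleared.
    pub fn clear_stranded_at_turn_settlement(&mut self) -> bool {
        if !self.active {
            return false;
        }
        self.active = false;
        self.pending_hermes_session_id = None;
        // compaction_count and last_* are deliberately left alone:
        // no completion was observed, so none is fabricated.
        true
    }
}
```

Calling `clear_stranded_at_turn_settlement` after a completed compaction is a no-op, which is the property the happy-path test guards.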
…e restart

The turn-settlement clear added in this branch was in-memory only and emitted no frame, so it didn't survive restart: hydration rebuilds compaction_state purely from persisted frames, and a persisted context_compaction_started with no matching _done resurrected active=true after a restart, defeating the snapshot fix for persistent rooms.

Teach hydration the same turn-settlement bound the live path uses: when rebuild_compaction_from_frame replays a persisted amux/turn_complete and compaction is still marked active, clear the transient active/pending fields (count + history untouched). amux/turn_complete is broadcast and therefore persisted on every prompt-turn settlement, so no new lifecycle marker frame is needed.

Test: started -> turn_complete -> restart asserts restored compaction.active == false (verified it fails without the hydration arm). Refs #66.
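A rough sketch of the hydration replay, under stated assumptions: the `Frame` struct, the `hydrate` helper, and the `context_compaction_done` method string are hypothetical, and the signature of `rebuild_compaction_from_frame` here is invented for illustration rather than copied from the crate.

```rust
/// Minimal stand-in for the persisted compaction state (assumed layout).
#[derive(Debug, Default, PartialEq)]
struct CompactionState {
    active: bool,
    pending_hermes_session_id: Option<String>,
    compaction_count: u64,
}

/// Hypothetical persisted frame: a method name plus an optional session id.
struct Frame {
    method: &'static str,
    hermes_session_id: Option<&'static str>,
}

/// Replays one persisted frame into the state being rebuilt.
/// The signature is an assumption made for this sketch.
fn rebuild_compaction_from_frame(state: &mut CompactionState, frame: &Frame) {
    match frame.method {
        "context_compaction_started" => {
            state.active = true;
            state.pending_hermes_session_id = frame.hermes_session_id.map(str::to_owned);
        }
        // Assumed name for the frame that matches `started`.
        "context_compaction_done" => {
            state.active = false;
            state.pending_hermes_session_id = None;
            state.compaction_count += 1;
        }
        // Same bound as the live path: a settled turn ends any compaction
        // it triggered, so a still-active flag is stranded.
        "amux/turn_complete" if state.active => {
            state.active = false;
            state.pending_hermes_session_id = None;
        }
        _ => {}
    }
}

/// Rebuilds state from persisted frames, as a restart would.
fn hydrate(frames: &[Frame]) -> CompactionState {
    let mut state = CompactionState::default();
    for frame in frames {
        rebuild_compaction_from_frame(&mut state, frame);
    }
    state
}
```

Replaying `started` followed by `amux/turn_complete` restores `active == false` with `compaction_count` still at zero; replaying `started` alone leaves `active == true`, which is the pre-fix behavior the test pins down.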
clippy::collapsible_match (newer on CI's toolchain than my local one) flagged the `if compaction_state.active` nested inside the `amux/turn_complete` match arm. Rewrite as a guarded arm `"amux/turn_complete" if compaction_state.active =>`. Behavior is identical — a turn_complete with no active compaction falls through to the no-op arm. Verified against rustc/clippy 1.96.0 locally.
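To make the equivalence concrete, here is a stripped-down comparison of the two shapes. Both functions are hypothetical reductions that track only a `bool`, not the real hydration code; each returns whether it cleared the flag so the two can be compared directly.

```rust
/// Shape flagged by the lint: an `if` nested inside the match arm.
fn replay_nested(method: &str, active: &mut bool) -> bool {
    match method {
        "amux/turn_complete" => {
            if *active {
                *active = false;
                true
            } else {
                false
            }
        }
        _ => false,
    }
}

/// Guarded-arm shape: the condition moves into a match guard, and a
/// `turn_complete` with no active compaction falls through to the
/// catch-all no-op arm.
fn replay_guarded(method: &str, active: &mut bool) -> bool {
    match method {
        "amux/turn_complete" if *active => {
            *active = false;
            true
        }
        _ => false,
    }
}
```

For every combination of method name and starting flag, both functions leave `active` in the same state and return the same value.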
Addresses item 1 of #66.
Problem
`compaction_state.active` is set by a Hermes `context compression started` stderr line and cleared by the matching `done`. If the `done` never arrives (the compaction failed, or its line was dropped by the lossy stderr pump, `STDERR_CAPACITY`), the flag stayed `true` indefinitely, so `/debug/sessions` and `session/attach` snapshots reported a perpetual "compacting" state to late joiners.
Fix
Hermes compacts within a prompt turn, so by the time that turn's response settles, any compaction it triggered has either completed (cleared by `done`) or been abandoned. Turn settlement is the bound: the same signal a live client uses to clear its own "compacting" affordance (`amux/turn_complete`).
- Live path: clear `active`/`pending_hermes_session_id` at turn settlement. `compaction_count` and `last_*` history are durable and untouched. No frame emitted; live clients already clear on `amux/turn_complete`.
- Hydration: the live clear is in-memory only, so on its own it did not survive restart. Hydration rebuilds `compaction_state` from persisted frames, and a persisted `context_compaction_started` with no `done` resurrected `active = true`. Fixed by applying the same bound during hydration: a persisted `amux/turn_complete` clears a still-active compaction. `amux/turn_complete` is broadcast (and persisted) on every prompt-turn settlement, so no new marker frame is needed.
Tests
- `stranded_compaction_active_clears_at_turn_settlement`: live clear on turn settlement, no fabricated count.
- `completed_compaction_is_unaffected_by_turn_settlement_clear`: happy-path guard.
- `hydration_clears_stranded_compaction_via_persisted_turn_complete`: `started` → `turn_complete` → restart asserts restored `active == false` (verified it fails without the hydration arm).

`cargo test`: 89 lib + 82 integration green; `cargo clippy -D warnings` and `cargo fmt --check` clean.
Scope note
Closes out the actionable part of #66. The other two items are accepted/deferred — rationale in this comment on #66:
_meta.hermes), contingent on a structured upstream signal not being pursued while we support today's Hermes.
#66 stays open until this merges; I'll close it then.
Refs #66.