ARIA: evidence-linking for working-memory tracking #354

Open
holoduke wants to merge 2 commits into main from aria/evidence-linked-tracking

@holoduke holoduke commented Jun 2, 2026

Summary

Adds optional evidence fields (evidenceObsId, evidenceSender, evidenceTs) to shortTermTracking items so the brain must point at WHICH message proved a status change when marking items resolved/afgerond.

Display format: X afgerond (evidence: msg <id> from <sender> at <ts>)

Why

In today's capability-demo failure (2026-06-02, 11:01–11:41), Ilse sent a one-word "Oke!" about camping-kleden, which the brain mis-attributed as acknowledgment of an agenda-request that ARIA never actually answered. Gillis corrected this at 11:44.

Root cause: tracking items had no link back to the observation that supposedly resolved them. The brain could mark agenda-request afgerond from any nearby positive-feeling string, regardless of whether that string was actually about the agenda.

Aligns with the existing feedback_working_memory_honesty rule — never mark tasks "afgerond" without direct evidence.

What changed

  • backend/memory/types.ts — new ShortTermTrackingItem union: string | { text, evidenceObsId?, evidenceSender?, evidenceTs? }. Plain strings remain valid for in-flight notes; structured form is required for resolved items.
  • backend/memory/working-memory.ts — trackingItemText() helper renders structured items with an inline evidence suffix for prompt/UI display.
  • backend/brain-prompt.ts — new EVIDENCE-LINKING guideline tells the model: if you cannot point at a specific message that completes the item, do NOT mark it resolved.
  • backend/system-prompt.ts, backend/memory/reconstruction.ts — use the helper to render tracking items.
  • frontend/app/types/aria.ts + pages/overview.vue + pages/memory.vue — UI handles both forms via the same helper.
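
For readers without the diff open, the union type and helper described above could look roughly like the sketch below. Only the field names and the display format come from this description. The field types (all strings), the handling of partially filled evidence, and the sample ids are assumptions made for illustration, not the merged code.

```typescript
// Hypothetical sketch; field names follow the PR description, field types are assumed.
type ShortTermTrackingItem =
  | string
  | {
      text: string;
      evidenceObsId?: string;
      evidenceSender?: string;
      evidenceTs?: string;
    };

// Renders an item for prompt/UI display. Plain strings pass through unchanged;
// structured items get an "(evidence: ...)" suffix when an observation id is present.
function trackingItemText(item: ShortTermTrackingItem): string {
  if (typeof item === "string") return item;
  const { text, evidenceObsId, evidenceSender, evidenceTs } = item;
  // Assumption: without an observation id there is nothing to cite, so no suffix.
  if (evidenceObsId === undefined) return text;
  let suffix = `msg ${evidenceObsId}`;
  if (evidenceSender !== undefined) suffix += ` from ${evidenceSender}`;
  if (evidenceTs !== undefined) suffix += ` at ${evidenceTs}`;
  return `${text} (evidence: ${suffix})`;
}

// Usage with made-up values:
console.log(trackingItemText("agenda-request open"));
console.log(
  trackingItemText({
    text: "agenda-request afgerond",
    evidenceObsId: "obs-123",
    evidenceSender: "Ilse",
    evidenceTs: "11:41",
  }),
);
// second call prints: agenda-request afgerond (evidence: msg obs-123 from Ilse at 11:41)
```

Keeping plain strings in the union means existing in-flight notes need no migration, while the suffix makes a missing evidence link visible wherever the item is rendered.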

Test plan

  • npx tsc --noEmit passes
  • vitest run on working-memory/cognitive-load/scene-predictor/digest-template tests — 58/58 pass
  • After deploy: verify next time the brain marks something afgerond, the structured form is emitted with a valid evidenceObsId
  • After deploy: verify a stray one-word "Oke!" no longer triggers a spurious afgerond on an unrelated tracking item
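
The two post-deploy checks could be partly automated with a small audit over the tracking list. The sketch below is a hypothetical helper, not part of this PR. It assumes that a resolved item can be recognised by the word "afgerond" or "resolved" in its text and that the ids of known observations are available as a set.

```typescript
// Hypothetical audit helper; the resolved-marker heuristic and the id set are assumptions.
type ShortTermTrackingItem =
  | string
  | { text: string; evidenceObsId?: string; evidenceSender?: string; evidenceTs?: string };

const RESOLVED_MARKERS = /\b(afgerond|resolved)\b/i;

// Returns the text of every item that claims resolution without citing a known observation.
function findUnsupportedResolutions(
  items: ShortTermTrackingItem[],
  knownObsIds: ReadonlySet<string>,
): string[] {
  const problems: string[] = [];
  for (const item of items) {
    const text = typeof item === "string" ? item : item.text;
    if (!RESOLVED_MARKERS.test(text)) continue; // in-flight note, nothing to verify
    const obsId = typeof item === "string" ? undefined : item.evidenceObsId;
    if (obsId === undefined || !knownObsIds.has(obsId)) problems.push(text);
  }
  return problems;
}
```

An empty result would mean every resolved item points at an observation that actually exists. It does not prove that the cited message is about the item, which still depends on the prompt guideline.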

🤖 Generated with Claude Code

ARIA and others added 2 commits June 1, 2026 10:06
Daily and weekly summaries in working-memory previously cut off at a hard
char count, often mid-word. Replace with smartTruncate(): prefer the last
sentence boundary (. ! ?) within the budget, fall back to last whitespace,
then hard cap. Boundary must lie within 60% of the budget to avoid losing
most of the content to a stray early period.

Same char budgets (200 daily / 300 weekly) — just smarter cutoff.
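
The fallback chain described in this commit message can be sketched as follows. This is an illustration, not the committed smartTruncate(). In particular, it reads "within 60% of the budget" as "the sentence boundary must sit at or beyond 60% of the budget", and it applies that threshold to sentence boundaries only. Both readings are assumptions.

```typescript
// Illustrative sketch; the 60% interpretation below is an assumption.
const MIN_BOUNDARY_RATIO = 0.6;

function smartTruncate(text: string, budget: number): string {
  if (text.length <= budget) return text;
  const head = text.slice(0, budget);
  const minCut = Math.floor(budget * MIN_BOUNDARY_RATIO);

  // 1. Prefer the last sentence terminator (. ! ?) inside the budget,
  //    unless it sits so early that most of the content would be lost.
  const sentenceEnd = Math.max(
    head.lastIndexOf("."),
    head.lastIndexOf("!"),
    head.lastIndexOf("?"),
  );
  if (sentenceEnd >= 0 && sentenceEnd + 1 >= minCut) {
    return head.slice(0, sentenceEnd + 1);
  }

  // 2. Fall back to the last whitespace so the cut never lands mid-word.
  let lastSpace = -1;
  for (let i = head.length - 1; i >= 0; i--) {
    if (/\s/.test(head[i])) {
      lastSpace = i;
      break;
    }
  }
  if (lastSpace > 0) return head.slice(0, lastSpace).trimEnd();

  // 3. No usable boundary at all: hard cap at the budget.
  return head;
}

console.log(smartTruncate("One two three four. Five six seven", 25));
// prints: One two three four.
console.log(smartTruncate("aaaa. bbbbbbbbbb cccccccccc", 20));
// prints: aaaa. bbbbbbbbbb
```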

Intent-summary: Recent-days/Recent-weeks summaries truncated mid-word, costing parse-effort on every reflect tick.
Intent-tokens: truncation, midword, summary, boundary, readability, temporal

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Adds optional evidence fields (obsId, sender, ts) to shortTermTracking
items so the brain must point at WHICH message proved a status change
when marking items resolved/afgerond. The display format is:
'X afgerond (evidence: msg <id> from <sender> at <ts>)'.

Today's capability-demo failure (11:01-11:41) had Ilse send a one-word
"Oke!" about camping-kleden which the brain mis-attributed as
acknowledgment of an agenda-request that ARIA never actually answered.
Without a structural evidence link, the brain has no constraint forcing
it to verify the "Oke!" actually addressed the open item — it only
needed the strings to feel adjacent.

Implementation:
- New ShortTermTrackingItem union (string | { text, evidenceObsId?, evidenceSender?, evidenceTs? })
- trackingItemText() helper renders structured items with the evidence
  suffix for prompt/UI display
- Brain prompt teaches the model the structured form and explicitly
  states: if you cannot point at a specific message that completes the
  item, do NOT mark it resolved
- Frontend types/views handle both forms via the same helper

Plain-string entries remain valid (back-compat) for in-flight notes.

Intent-summary: working memory hallucinated task completion from an unrelated short acknowledgment because tracking items had no link back to the message that supposedly resolved them
Intent-tokens: evidence, hallucination, completion, working-memory, ack, attribution, audit

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>