Feature Specification: cc-top v1 — Claude Code Monitor

Created: 2026-02-15
Status: Draft
Input: cc-top-v1-spec.md (a TUI dashboard acting as a lightweight OTLP collector for monitoring Claude Code instances on macOS)

User Stories & Acceptance Criteria

User Story 1 — OTLP Receiver Accepts Telemetry (Priority: P0)

A developer running cc-top needs the application to listen for OpenTelemetry data on localhost so that Claude Code instances configured with OTLP export automatically push metrics and events to cc-top. Without a working receiver, no data enters the system — this is the foundation every other feature depends on. The receiver must support both gRPC (:4317) and HTTP (:4318) protocols since Claude Code uses gRPC by default but HTTP is a valid OTLP transport.

Why this priority: P0 because the entire application is non-functional without it. Every panel, alert, and statistic depends on OTLP data flowing in.

Independent Test: Start cc-top, send a synthetic OTLP metrics payload via grpcurl or curl to the receiver, and verify the data appears in the internal state store.

Acceptance Scenarios:

Given cc-top is started with default config, When a Claude Code instance sends OTLP metrics via gRPC to localhost:4317, Then cc-top receives the metrics and indexes them by session.id.
Given cc-top is started with default config, When a Claude Code instance sends OTLP logs/events via HTTP to localhost:4318, Then cc-top receives the events and indexes them by session.id.
Given cc-top is started with custom ports configured in config.toml, When a Claude Code instance sends OTLP data to the custom ports, Then cc-top receives the data correctly.
Given cc-top is already running on port 4317, When a second cc-top instance attempts to start, Then the second instance displays a clear error ("port 4317 already in use") and exits with a non-zero code.
Given cc-top is running, When an OTLP payload arrives with malformed protobuf, Then cc-top logs the error, returns an OTLP error response, and continues operating without crashing.
Given cc-top is running, When no Claude Code instances are sending data, Then the receiver remains listening and the TUI shows "No data received yet."
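The port-conflict scenario above could be handled roughly as follows. This is a minimal sketch, not the cc-top implementation: the names bindLocal and boundPort are hypothetical, and it covers only a plain TCP listener, not the gRPC or HTTP servers that would sit on top of it.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"syscall"
)

// bindLocal opens a TCP listener on 127.0.0.1 at the given port (0 asks the
// OS for any free port). An address-in-use failure is rewritten into the
// short message quoted in the acceptance scenario.
func bindLocal(port int) (net.Listener, error) {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if errors.Is(err, syscall.EADDRINUSE) {
		return nil, fmt.Errorf("port %d already in use", port)
	}
	if err != nil {
		return nil, err
	}
	return ln, nil
}

// boundPort reports the port a listener actually holds, which is useful
// after binding to port 0.
func boundPort(ln net.Listener) int {
	return ln.Addr().(*net.TCPAddr).Port
}
```

A caller would print the returned error and exit with a non-zero code when bindLocal(4317) fails.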

User Story 2 — Process Discovery Finds Claude Code Instances (Priority: P0)

A developer wants cc-top to automatically detect all running Claude Code instances on their Mac, showing each one's PID, terminal type, working directory, and telemetry configuration status. This eliminates the need to manually check which sessions are running and whether they're configured to send telemetry. The process scanner uses macOS libproc APIs (no root required) and runs on startup and every 5 seconds.

Why this priority: P0 because process discovery enables the startup screen, telemetry status display, PID-session correlation, and the kill switch. Without it, cc-top is blind to what's running on the machine.

Independent Test: Start cc-top with one or more Claude Code processes running, and verify each appears in the session list with correct PID, terminal, CWD, and telemetry status.

Acceptance Scenarios:

Given a Claude Code process is running as claude binary, When cc-top performs a scan, Then the process appears in the session list with correct PID, terminal type, and CWD.
Given a Claude Code process is running as a node process with @anthropic-ai/claude-code in argv, When cc-top scans, Then it is detected as a Claude Code instance.
Given a Claude Code process has CLAUDE_CODE_ENABLE_TELEMETRY=1 and OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 in its environment, When cc-top reads its env via KERN_PROCARGS2, Then telemetry status shows "Connected" or "Waiting..." with a ✅ icon.
Given a Claude Code process has telemetry enabled but endpoint pointing to :9090, When cc-top scans, Then telemetry status shows "Wrong port" with a ⚠️ icon.
Given a Claude Code process has no telemetry env vars, When cc-top scans, Then status shows "No telemetry" with a ❌ icon.
Given cc-top is running and a new Claude Code process starts, When the next 5-second scan cycle runs, Then the new process appears with a "New" badge.
Given a Claude Code process exits, When the next scan detects it's gone, Then the process remains in the list marked "Exited" with final aggregate stats preserved.
Given a process's env vars are unreadable (zombie, permission issue), When cc-top scans, Then status shows "Unknown" with a ❓ icon.
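The status rules in these scenarios reduce to a small pure function. The sketch below is illustrative only: classifyTelemetry and TelemetryStatus are hypothetical names, the "Console only" label is taken from the BDD examples table later in this document, and comparing only the port (not the host) is an assumption.

```go
package main

import "net/url"

// TelemetryStatus is the icon and label shown for one process.
type TelemetryStatus struct {
	Icon  string
	Label string
}

// classifyTelemetry maps a process environment to a status. readable is
// false when the environment could not be read (zombie, permission issue).
// wantPort is the port cc-top listens on, for example "4317".
func classifyTelemetry(env map[string]string, readable bool, wantPort string) TelemetryStatus {
	if !readable {
		return TelemetryStatus{"❓", "Unknown"}
	}
	if env["CLAUDE_CODE_ENABLE_TELEMETRY"] != "1" {
		return TelemetryStatus{"❌", "No telemetry"}
	}
	endpoint := env["OTEL_EXPORTER_OTLP_ENDPOINT"]
	if endpoint == "" {
		return TelemetryStatus{"⚠️", "Console only"}
	}
	u, err := url.Parse(endpoint)
	if err != nil || u.Port() != wantPort {
		return TelemetryStatus{"⚠️", "Wrong port"}
	}
	return TelemetryStatus{"✅", "Connected"}
}
```

Distinguishing "Connected" from "Waiting..." would additionally need to know whether OTLP data has arrived, which this sketch leaves out.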

User Story 3 — PID-to-Session Correlation (Priority: P0)

A developer needs cc-top to link each Claude Code PID (from the process scanner) to its corresponding OTLP session.id so that per-session metrics and events are displayed alongside the correct process information. The primary mechanism is port fingerprinting: tracking the ephemeral source port on inbound OTLP connections and mapping it to PIDs via proc_pidfdinfo(). A timing heuristic (new PID + new session.id within 10 seconds) serves as fallback.

Why this priority: P0 because without correlation, the session list cannot show unified data — process info and OTLP data would be disconnected, making the tool useless for per-session monitoring.

Independent Test: Start two Claude Code sessions, each sending OTLP data. Verify that each session's metrics appear under the correct PID in the session list.

Acceptance Scenarios:

Given a Claude Code process (PID X) connects to cc-top's gRPC port from ephemeral source port Y, When an OTLP request arrives from source port Y carrying session.id Z, Then cc-top correlates PID X to session Z.
Given two Claude Code processes with different PIDs each sending OTLP data, When both are correlated, Then each PID shows only its own session's metrics and events.
Given a new PID appears in the process scanner and a new session.id starts sending OTLP data within 10 seconds, When port fingerprinting fails (e.g., connection already closed), Then the timing heuristic matches PID to session.
Given a correlated session, When the process exits and Claude Code is restarted under a new PID, Then the old PID is marked "Exited" and the new PID is correlated to the new session.
Given an OTLP session.id arrives but no matching PID is found, When displayed in the session list, Then it appears as "PID: —" with data still visible.
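The timing-heuristic fallback can be sketched as a nearest-match search within the 10-second window. The names seenPID and matchByTiming are hypothetical, and picking the closest candidate when several PIDs fall inside the window is an assumption; the spec does not say how ties are resolved.

```go
package main

import "time"

// seenPID records when the process scanner first observed a PID that has
// not yet been correlated to any session.
type seenPID struct {
	PID       int
	FirstSeen time.Time
}

// matchByTiming returns the uncorrelated PID whose first-seen time is
// closest to the moment a new session.id started sending data, provided the
// gap is no larger than window. The second result is false when no PID
// qualifies.
func matchByTiming(unmatched []seenPID, sessionStart time.Time, window time.Duration) (int, bool) {
	bestPID, bestGap, found := 0, window, false
	for _, p := range unmatched {
		gap := sessionStart.Sub(p.FirstSeen)
		if gap < 0 {
			gap = -gap
		}
		if gap <= bestGap {
			bestPID, bestGap, found = p.PID, gap, true
		}
	}
	return bestPID, found
}
```

A session for which this returns false would be shown as "PID: —", as the last scenario describes.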

User Story 4 — Settings Merge and Auto-Setup (Priority: P1)

A developer wants cc-top to configure Claude Code's telemetry settings automatically, either via cc-top --setup (non-interactive) or the [E]/[F] TUI keys (interactive). The tool merges OTel environment variables into ~/.claude/settings.json while preserving all unrelated settings. This eliminates the error-prone manual JSON editing process.

Why this priority: P1 because many developers will have Claude Code running without telemetry. This feature turns a multi-step manual process into a single keypress or command, directly increasing adoption.

Independent Test: Run cc-top --setup against a known settings.json, verify the OTel keys are added and all other keys remain untouched. Repeat with missing file, malformed JSON, and read-only file.

Acceptance Scenarios:

Given ~/.claude/settings.json exists with other settings but no OTel env vars, When the user runs cc-top --setup, Then the OTel keys are added to the "env" block and all other keys are preserved.
Given ~/.claude/settings.json does not exist, When the user runs cc-top --setup, Then the file is created with the required OTel env vars in the "env" block.
Given ~/.claude/settings.json has an OTel key with a different value (e.g., endpoint pointing to :9090), When the user presses [E] in the TUI, Then cc-top prompts for confirmation before overwriting that key.
Given ~/.claude/settings.json already has all correct OTel values, When the user runs cc-top --setup, Then no changes are made and an "Already configured" message is shown.
Given ~/.claude/settings.json contains malformed JSON, When the user runs cc-top --setup, Then cc-top creates a backup of the malformed file, displays an error message indicating the JSON is invalid, and does not modify the original file.
Given the user does not have write permission to ~/.claude/settings.json, When they attempt setup, Then a clear error message explains the permission problem and no crash occurs.
Given ~/.claude/settings.json uses 4-space indentation, When cc-top writes back, Then the file uses 4-space indentation (preserves original formatting).
Given a Claude Code session has "Wrong port" status, When the user presses [F] in the TUI, Then only OTEL_EXPORTER_OTLP_ENDPOINT is updated in settings.json.
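The core of the merge can be sketched with encoding/json. This is a simplified illustration, not the final design: mergeOTelEnv is a hypothetical name, it never prompts before overwriting a differing value, and because it round-trips through a map it always re-serializes with 2-space indentation and sorted keys, so it does not meet the indentation-preservation scenario above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// mergeOTelEnv merges the wanted variables into the "env" object of a
// settings.json document and leaves every other key alone. It returns the
// new document and whether anything changed; changed == false corresponds
// to the "Already configured" case. Empty input stands for a missing file.
func mergeOTelEnv(raw []byte, wanted map[string]string) (out []byte, changed bool, err error) {
	settings := map[string]any{}
	if len(bytes.TrimSpace(raw)) > 0 {
		if err := json.Unmarshal(raw, &settings); err != nil {
			return nil, false, fmt.Errorf("settings.json is not valid JSON: %w", err)
		}
		if settings == nil { // the document was the literal null
			settings = map[string]any{}
		}
	}
	env, ok := settings["env"].(map[string]any)
	if !ok {
		env = map[string]any{}
	}
	for key, value := range wanted {
		if current, ok := env[key].(string); !ok || current != value {
			env[key] = value
			changed = true
		}
	}
	if !changed {
		return raw, false, nil
	}
	settings["env"] = env
	out, err = json.MarshalIndent(settings, "", "  ")
	if err != nil {
		return nil, false, err
	}
	return out, true, nil
}
```

The wanted map would carry the variables named in this spec, such as CLAUDE_CODE_ENABLE_TELEMETRY and OTEL_EXPORTER_OTLP_ENDPOINT; for the [F] action it would contain only the endpoint.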

User Story 5 — Session List Panel (Priority: P0)

A developer viewing cc-top's main dashboard needs a session list showing all discovered Claude Code instances with their PID, session ID, terminal, CWD, telemetry status, model, activity status, cost, tokens, and active time. Selecting a session focuses all other panels on it; a "Global" view aggregates all connected sessions. Sessions without telemetry appear greyed out at the bottom.

Why this priority: P0 because the session list is the primary navigation element. All other panels depend on session selection for focused views.

Independent Test: Start cc-top with 3 Claude Code sessions (2 with telemetry, 1 without). Verify all 3 appear with correct data, the non-telemetry session is greyed out at the bottom, and selecting a session updates other panels.

Acceptance Scenarios:

Given cc-top is running with 3 connected sessions, When the session list renders, Then each session shows PID, truncated session ID, terminal, CWD, telemetry icon, model, status, cost, tokens, and active time.
Given a session has had events within the last 30 seconds, When the session list renders, Then its status shows "active".
Given a session has had no events for 30 seconds to 5 minutes, When the session list renders, Then its status shows "idle".
Given a session has had no events for more than 5 minutes, When the session list renders, Then its status shows "done".
Given a process has exited, When the session list renders, Then it shows "exited" with final aggregate stats preserved.
Given sessions with and without telemetry, When the session list renders, Then non-telemetry sessions appear greyed out at the bottom.
Given no session is selected, When the user views the dashboard, Then panels display aggregated "Global" view data from all connected sessions.
Given the user presses ↑/↓ to navigate and Enter to select a session, When a session is selected, Then all other panels update to show only that session's data.
Given a session is selected, When the user presses Esc, Then the view returns to "Global" aggregate.
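The activity thresholds in these scenarios can be expressed as one function. In this sketch activityStatus is a hypothetical name, and treating exactly 30 seconds of silence as "idle" is an assumption, since the scenarios leave that boundary open.

```go
package main

import "time"

// activityStatus derives the session list status from the time elapsed
// since the session's most recent event.
func activityStatus(sinceLastEvent time.Duration, exited bool) string {
	switch {
	case exited:
		return "exited"
	case sinceLastEvent < 30*time.Second:
		return "active"
	case sinceLastEvent <= 5*time.Minute:
		return "idle"
	default:
		return "done"
	}
}
```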

User Story 6 — Burn Rate Odometer (Priority: P1)

A developer needs to see their Claude Code spending rate at a glance (total session cost, $/hour rate, trend direction, and token velocity), displayed as a large retro-styled digital counter. The colour changes from green (below $0.50/hr) to yellow ($0.50/hr up to, but not including, $2/hr) to red ($2/hr or above) based on configurable thresholds. This provides an immediate financial feedback loop during development.

Why this priority: P1 because cost visibility is a primary motivation for using cc-top. Developers need instant awareness when spending accelerates.

Independent Test: Send synthetic cost.usage metrics at a known rate, verify the odometer displays the correct $/hour, total, and colour matches the threshold.

Acceptance Scenarios:

Given cc-top is receiving cost.usage metrics, When the burn rate panel renders, Then it shows Total Session Cost as the sum of cost.usage across visible sessions.
Given cost data has been arriving for at least 5 minutes, When the burn rate panel renders, Then $/hour is calculated as the rolling 5-minute average extrapolated to hourly.
Given the current 5-minute cost rate is higher than the previous 5-minute window, When the panel renders, Then an up-arrow trend indicator appears.
Given the $/hour rate is below $0.50 (default threshold), When the panel renders, Then the counter colour is green.
Given the $/hour rate is between $0.50 and $2.00, When the panel renders, Then the counter colour is yellow.
Given the $/hour rate is $2.00 or above, When the panel renders, Then the counter colour is red.
Given the user has custom thresholds in config.toml, When the panel renders, Then the colour thresholds respect the custom values.
Given token.usage counter deltas are available, When the panel renders, Then token velocity (tokens/minute) is displayed.
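The rate and colour rules above might be computed as follows. This is a sketch under stated assumptions: costSample, hourlyRate, and rateColour are hypothetical names, samples are assumed to be cumulative totals in chronological order, and the rate divides by the full window length rather than by the span between the first and last sample.

```go
package main

import "time"

// costSample is one reading of a cumulative cost counter.
type costSample struct {
	At       time.Time
	TotalUSD float64
}

// hourlyRate extrapolates the cost accrued inside the window ending at now
// to a $/hour figure. It returns 0 when fewer than two samples fall inside
// the window.
func hourlyRate(samples []costSample, now time.Time, window time.Duration) float64 {
	cutoff := now.Add(-window)
	first, last := -1, -1
	for i, s := range samples {
		if s.At.Before(cutoff) || s.At.After(now) {
			continue
		}
		if first < 0 {
			first = i
		}
		last = i
	}
	if first < 0 || first == last {
		return 0
	}
	spent := samples[last].TotalUSD - samples[first].TotalUSD
	return spent * float64(time.Hour) / float64(window)
}

// rateColour picks the counter colour from the two thresholds; the spec's
// defaults are 0.50 and 2.00.
func rateColour(usdPerHour, yellowAt, redAt float64) string {
	switch {
	case usdPerHour >= redAt:
		return "red"
	case usdPerHour >= yellowAt:
		return "yellow"
	default:
		return "green"
	}
}
```

The trend arrow would compare hourlyRate for the current window against the same call with now shifted back by one window.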

User Story 7 — Event Stream Panel (Priority: P1)

A developer needs a real-time scrolling feed of Claude Code events (prompts, tool results, API requests, errors, tool decisions) with session attribution, filterable by session, event type, and success/failure. The event stream provides operational awareness of what each Claude Code instance is doing right now.

Why this priority: P1 because the event stream is the primary diagnostic tool. When something goes wrong, this is where the developer looks first.

Independent Test: Send synthetic OTLP log events of each type, verify they appear in the stream with correct formatting. Apply a filter and verify only matching events remain.

Acceptance Scenarios:

Given a user_prompt event arrives, When the event stream renders, Then it shows [session] Prompt (N chars), followed by the prompt content only if OTEL_LOG_USER_PROMPTS=1.
Given a tool_result event arrives with success=true, When rendered, Then it shows [session] ToolName ✓ (duration).
Given a tool_result event arrives with success=false and decision=reject, When rendered, Then it shows [session] ToolName ✗ rejected by user.
Given an api_request event arrives, When rendered, Then it shows [session] model → input_tokens in / output_tokens out ($cost) duration.
Given an api_error event arrives, When rendered, Then it shows [session] status_code error_message (attempt N).
Given a tool_decision event arrives, When rendered, Then it shows [session] ToolName accepted/rejected (source).
Given the user presses f, When the filter menu opens, Then the user can filter by session, event type, and success/failure.
Given the buffer already holds 1000 events (the default buffer size), When a new event arrives, Then the oldest event is evicted from the buffer.
Given a session is selected in the session list, When the event stream renders, Then only events for that session are shown.
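The eviction rule for the event buffer is the behaviour of a fixed-capacity ring buffer. The sketch below stores plain strings for brevity; eventBuffer and its methods are hypothetical names, and a real buffer would hold structured events and need locking if written from receiver goroutines.

```go
package main

// eventBuffer keeps the most recent events up to a fixed capacity.
type eventBuffer struct {
	items []string
	start int // index of the oldest event
	size  int // number of events currently stored
}

func newEventBuffer(capacity int) *eventBuffer {
	return &eventBuffer{items: make([]string, capacity)}
}

// Push appends an event, overwriting the oldest one when the buffer is full.
func (b *eventBuffer) Push(e string) {
	if len(b.items) == 0 {
		return
	}
	if b.size < len(b.items) {
		b.items[(b.start+b.size)%len(b.items)] = e
		b.size++
		return
	}
	b.items[b.start] = e
	b.start = (b.start + 1) % len(b.items)
}

// Snapshot returns the stored events from oldest to newest.
func (b *eventBuffer) Snapshot() []string {
	out := make([]string, 0, b.size)
	for i := 0; i < b.size; i++ {
		out = append(out, b.items[(b.start+i)%len(b.items)])
	}
	return out
}
```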

User Story 8 — Alert Engine with Built-in Rules (Priority: P1)

A developer needs cc-top to automatically detect anomalous patterns — cost surges, runaway tokens, command loops, error storms, stale sessions, context pressure, and high tool rejection rates — and display alerts in the bottom bar with optional macOS system notifications. This provides early warning before problems escalate.

Why this priority: P1 because proactive alerting is a key differentiator over passive monitoring. Catching a runaway loop or cost surge early saves real money and time.

Independent Test: For each alert rule, send synthetic OTLP data that triggers the rule's threshold. Verify the alert appears in the panel and (if enabled) fires an osascript notification.

Acceptance Scenarios:

Given $/hour exceeds the configured threshold (default $2/hr), When the alert engine evaluates, Then a "Cost Surge" alert appears in the alerts panel.
Given token velocity exceeds the threshold for more than N minutes, When evaluated, Then a "Runaway Tokens" alert fires.
Given the same bash command (by hash) fails 3+ times within 5 minutes in a session, When evaluated, Then a "Loop Detector" alert fires for that session.
Given semantically similar commands (e.g., npm test, npm run test, npx jest) all fail, When the loop detector normalizes via prefix matching before hashing, Then they are treated as the same command for threshold counting.
Given more than 10 api_error events occur in 1 minute, When evaluated, Then an "Error Storm" alert fires.
Given a session has been active for more than 2 hours (default) with no user_prompt events, When evaluated, Then a "Stale Session" alert fires.
Given an api_request event has input_tokens > 80% of the model's known context limit, When evaluated, Then a "Context Pressure" alert fires.
Given more than 50% of tool_decision events are reject in a 5-minute window, When evaluated, Then a "High Rejection Rate" alert fires.
Given system_notify = true in config.toml, When any alert fires, Then an osascript display notification is triggered.
Given system_notify = false in config.toml, When an alert fires, Then no system notification is sent, but the alert still appears in the TUI panel.
Given all alert thresholds are configurable in config.toml, When the user changes a threshold and restarts cc-top, Then the alert engine uses the new value.
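Several of these rules are sliding-window counts. As one example, the "Error Storm" rule could look like the sketch below. The names event, errorStorm, and Observe are hypothetical, and the detector assumes events are observed in timestamp order.

```go
package main

import "time"

// event is a minimal stand-in for an OTLP log event.
type event struct {
	Name string
	At   time.Time
}

// errorStorm counts api_error events inside a sliding window.
type errorStorm struct {
	Window time.Duration // 1 minute by default in the spec
	Limit  int           // 10 by default in the spec
	recent []time.Time
}

// Observe records the event and reports whether the number of api_error
// events inside the window ending at e.At exceeds the limit.
func (d *errorStorm) Observe(e event) bool {
	if e.Name == "api_error" {
		d.recent = append(d.recent, e.At)
	}
	cutoff := e.At.Add(-d.Window)
	drop := 0
	for drop < len(d.recent) && !d.recent[drop].After(cutoff) {
		drop++
	}
	d.recent = d.recent[drop:]
	return len(d.recent) > d.Limit
}
```

The same shape, keyed by session and command hash, would serve the "Loop Detector" rule.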

User Story 9 — Stats Dashboard (Priority: P2)

A developer wants a full-screen statistics view (toggled via Tab) showing aggregate metrics: lines of code, commits, PRs, tool acceptance rate, cache efficiency, average API latency, model breakdown, top tools, and error rate. This provides a summary view for reviewing productivity and cost efficiency after a working session.

Why this priority: P2 because the stats dashboard is a convenience/review feature. The core monitoring capability (session list, events, alerts, burn rate) is more urgent.

Independent Test: Send synthetic metrics covering all stat categories, press Tab to view the stats dashboard, and verify each stat is calculated and displayed correctly.

Acceptance Scenarios:

Given lines_of_code.count metrics have been received, When the stats dashboard renders, Then it shows lines added and removed broken down by type.
Given commit.count and pull_request.count metrics have been received, When rendered, Then commits and PRs counts are displayed.
Given code_edit_tool.decision metrics, When rendered, Then tool acceptance rate is shown as accept / total grouped by tool and language.
Given token.usage metrics with cacheRead and input types, When rendered, Then cache efficiency is cacheRead / (input + cacheRead) as a percentage.
Given api_request events with duration_ms, When rendered, Then average API latency is the mean of all duration_ms values.
Given cost and token data with model attribute, When rendered, Then a model breakdown shows cost and tokens grouped by model.
Given tool_result events, When rendered, Then top tools are ranked by frequency.
Given api_error and api_request event counts, When rendered, Then error rate is api_error count / api_request count as a percentage.
Given the user presses Tab on the main dashboard, When the stats dashboard appears, Then it fills the full screen. Pressing Tab again returns to the main dashboard.
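The ratio-based statistics above are simple enough to state directly. The function names below are hypothetical, and returning 0 when the denominator is zero is an assumption about how an empty dashboard should render.

```go
package main

// cacheEfficiency is cacheRead / (input + cacheRead) as a percentage.
func cacheEfficiency(cacheRead, input float64) float64 {
	total := input + cacheRead
	if total == 0 {
		return 0
	}
	return 100 * cacheRead / total
}

// errorRate is the api_error count over the api_request count as a percentage.
func errorRate(apiErrors, apiRequests int) float64 {
	if apiRequests == 0 {
		return 0
	}
	return 100 * float64(apiErrors) / float64(apiRequests)
}

// meanLatencyMs is the arithmetic mean of all duration_ms values.
func meanLatencyMs(durationsMs []float64) float64 {
	if len(durationsMs) == 0 {
		return 0
	}
	sum := 0.0
	for _, d := range durationsMs {
		sum += d
	}
	return sum / float64(len(durationsMs))
}
```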

User Story 10 — Kill Switch (Priority: P2)

A developer notices a runaway or problematic Claude Code session and wants to terminate it directly from cc-top without switching terminals. Pressing Ctrl+K freezes the selected session (SIGSTOP), shows a confirmation dialog with session details, and either kills (SIGKILL) or resumes (SIGCONT) based on user choice. This provides an emergency stop without leaving the monitoring context.

Why this priority: P2 because it's an important safety feature but is used infrequently. The developer can always switch terminals and kill manually as a workaround.

Independent Test: Start a Claude Code process, press Ctrl+K in cc-top, verify the process is stopped (SIGSTOP), confirm kill, verify the process is terminated (SIGKILL). Repeat but cancel, and verify the process resumes (SIGCONT).

Acceptance Scenarios:

Given a session is selected in the session list, When the user presses Ctrl+K, Then SIGSTOP is sent to the process group, freezing the Claude Code instance.
Given the process is frozen, When the confirmation dialog appears, Then it shows session ID, PID, and CWD with "Kill session? [Y/n]".
Given the confirmation dialog is showing, When the user presses Y, Then SIGKILL is sent to the process group and the session is marked "Exited".
Given the confirmation dialog is showing, When the user presses n or Esc, Then SIGCONT is sent to resume the process and the dialog closes.
Given no session is selected (global view), When the user presses Ctrl+K, Then a session picker appears listing all active sessions.
Given the target process has already exited between SIGSTOP and user confirmation, When the user confirms kill, Then cc-top handles the "no such process" error gracefully and marks the session "Exited".
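The freeze, confirm, then kill-or-resume flow can be sketched with syscall.Kill on a negative PID, which targets a process group on Unix systems. The names signalGroup and killSwitch are hypothetical, "Resumed" is an illustrative label rather than a status defined in this spec, and the confirmation dialog is reduced to a callback.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// signalGroup sends sig to every process in group pgid. It reports
// gone == true when the group no longer exists (ESRCH) instead of
// returning an error.
func signalGroup(pgid int, sig syscall.Signal) (gone bool, err error) {
	if pgid <= 1 {
		// Negating 0 or 1 would address our own group or every process.
		return false, fmt.Errorf("refusing to signal process group %d", pgid)
	}
	err = syscall.Kill(-pgid, sig)
	if errors.Is(err, syscall.ESRCH) {
		return true, nil
	}
	return false, err
}

// killSwitch freezes the group, asks confirm, then kills or resumes it and
// returns the state to display.
func killSwitch(pgid int, confirm func() bool) (string, error) {
	if gone, err := signalGroup(pgid, syscall.SIGSTOP); err != nil {
		return "", err
	} else if gone {
		return "Exited", nil
	}
	if !confirm() {
		gone, err := signalGroup(pgid, syscall.SIGCONT)
		if err != nil {
			return "", err
		}
		if gone {
			return "Exited", nil
		}
		return "Resumed", nil
	}
	// A group that vanished before SIGKILL also counts as exited.
	if _, err := signalGroup(pgid, syscall.SIGKILL); err != nil {
		return "", err
	}
	return "Exited", nil
}
```

Treating ESRCH as "already exited" is what lets the last scenario complete without surfacing an error to the user.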

User Story 11 — Configuration File (Priority: P2)

A developer wants to customize cc-top's behaviour — ports, scan intervals, alert thresholds, display settings, and model context limits — via a TOML config file at ~/.config/cc-top/config.toml. All settings have sensible defaults, making the config file entirely optional.

Why this priority: P2 because cc-top must work out of the box with zero config. Customization is a nice-to-have for power users.

Independent Test: Start cc-top with no config file and verify defaults work. Create a config file with custom values and verify they take effect.

Acceptance Scenarios:

Given no config file exists, When cc-top starts, Then all settings use default values and the application runs normally.
Given a config file exists with custom grpc_port = 5317, When cc-top starts, Then the gRPC receiver binds to port 5317.
Given a config file with a partial set of keys, When cc-top starts, Then specified keys override defaults and unspecified keys use defaults.
Given a config file with an invalid value (e.g., grpc_port = -1), When cc-top starts, Then it displays a clear validation error and exits.
Given a config file with an unknown key, When cc-top starts, Then the unknown key is ignored and a warning is logged.
Given the config file specifies model context limits and pricing, When the context pressure alert evaluates, Then it uses the configured limits.
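The default, override, validate, and warn behaviour can be sketched independently of TOML parsing. In the sketch below the overrides map stands in for already-parsed config values, receiverConfig and applyOverrides are hypothetical names, and only the two port keys named in this spec (grpc_port and http_port) are modelled.

```go
package main

import (
	"fmt"
	"sort"
)

// receiverConfig holds the two listener ports.
type receiverConfig struct {
	GRPCPort int
	HTTPPort int
}

func defaultReceiverConfig() receiverConfig {
	return receiverConfig{GRPCPort: 4317, HTTPPort: 4318}
}

// applyOverrides layers parsed config values over cfg. Unknown keys produce
// warnings, and out-of-range ports produce a validation error.
func applyOverrides(cfg receiverConfig, overrides map[string]int) (receiverConfig, []string, error) {
	var warnings []string
	for key, value := range overrides {
		var target *int
		switch key {
		case "grpc_port":
			target = &cfg.GRPCPort
		case "http_port":
			target = &cfg.HTTPPort
		default:
			warnings = append(warnings, "unknown config key: "+key)
			continue
		}
		if value < 1 || value > 65535 {
			return receiverConfig{}, nil, fmt.Errorf("%s = %d: port must be between 1 and 65535", key, value)
		}
		*target = value
	}
	sort.Strings(warnings) // map iteration order is random
	return cfg, warnings, nil
}
```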

User Story 12 — Startup Screen (Priority: P1)

A developer launching cc-top sees an initial screen showing all discovered Claude Code processes with their telemetry status before entering the main dashboard. This screen provides actionable buttons: [E] to enable telemetry for all, [F] to fix misconfigured sessions, and [Enter] to continue to the dashboard. This ensures the developer is aware of and can fix configuration issues before monitoring begins.

Why this priority: P1 because first-run experience determines whether the user continues with the tool. If all sessions show "No telemetry" and there's no obvious fix, the user will abandon cc-top.

Independent Test: Start cc-top with a mix of configured and unconfigured Claude Code sessions. Verify the startup screen shows the correct status for each. Press [E] and verify settings.json is updated.

Acceptance Scenarios:

Given cc-top starts, When the startup screen renders, Then it shows a table of all discovered Claude Code processes with PID, Terminal, CWD, Telemetry status, OTLP Dest, and Status columns.
Given the startup screen is showing, When the user presses [E], Then OTel env vars are merged into ~/.claude/settings.json and a confirmation message appears.
Given the startup screen is showing with a "Wrong port" session, When the user presses [F], Then only the endpoint is fixed in settings.json.
Given the startup screen is showing, When the user presses Enter, Then cc-top transitions to the main dashboard.
Given all sessions are already correctly configured, When the startup screen renders, Then [E] and [F] are greyed out or hidden.
Given no Claude Code processes are found, When the startup screen renders, Then it shows "No Claude Code instances found" with a hint to start one.

User Story 13 — Graceful Shutdown (Priority: P2)

A developer pressing q to quit cc-top expects a clean exit: in-flight OTLP data is drained, listeners are closed, and the terminal is restored to its original state. No data corruption, no dangling port bindings, no broken terminal.

Why this priority: P2 because ungraceful shutdown causes annoying side effects (stuck ports, broken terminal) but is not a core feature.

Independent Test: Start cc-top, send OTLP data, press q, verify the process exits within 5 seconds, the ports are released, and the terminal is restored.

Acceptance Scenarios:

Given cc-top is running and receiving OTLP data, When the user presses q, Then cc-top stops accepting new connections.
Given in-flight OTLP requests are being processed, When shutdown begins, Then cc-top waits up to 5 seconds for them to complete.
Given shutdown has started, When the 5-second drain period expires, Then remaining connections are forcibly closed and cc-top exits.
Given cc-top was using ports 4317 and 4318, When it exits, Then those ports are immediately available for reuse.
Given the Bubble Tea TUI was running, When cc-top exits, Then the terminal is fully restored (cursor visible, input echoing, alternate screen cleared).
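For the HTTP receiver, the drain-then-force-close sequence maps onto net/http's Shutdown and Close. The sketch below is an illustration only: httpReceiver, startHTTPReceiver, and stop are hypothetical names, the server has no OTLP handlers, and the gRPC listener and terminal restoration are out of scope.

```go
package main

import (
	"context"
	"net"
	"net/http"
	"time"
)

// httpReceiver is a running HTTP listener plus a channel that closes once
// its serve loop has returned.
type httpReceiver struct {
	srv  *http.Server
	addr string
	done chan struct{}
}

func startHTTPReceiver(addr string) (*httpReceiver, error) {
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return nil, err
	}
	r := &httpReceiver{srv: &http.Server{}, addr: ln.Addr().String(), done: make(chan struct{})}
	go func() {
		defer close(r.done)
		_ = r.srv.Serve(ln) // returns http.ErrServerClosed after Shutdown or Close
	}()
	return r, nil
}

// stop refuses new connections, waits up to grace for in-flight requests,
// then force-closes whatever remains and waits for the serve loop to end.
func (r *httpReceiver) stop(grace time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), grace)
	defer cancel()
	err := r.srv.Shutdown(ctx)
	if err != nil { // grace period expired with requests still running
		err = r.srv.Close()
	}
	<-r.done
	return err
}
```

Calling stop(5 * time.Second) matches the drain period in the scenarios; once it returns, the listener is closed and the port can be bound again.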

Edge Cases

Port conflict on startup: If 4317 or 4318 is already in use by another process (not cc-top), display a clear error naming the conflicting port and the PID of the process using it (via lsof).
Claude Code restarts rapidly: A session exits and a new process starts within seconds. The old PID should be marked "Exited" and the new PID detected in the next scan cycle. OTLP data from the new session should not be attributed to the old PID.
OTLP data without session.id: If an OTLP payload is missing session.id, cc-top should log a warning, display the data under an "Unknown Session" bucket, and continue operating.
Very long CWD paths: CWDs exceeding the column width should be truncated with ~ home-dir substitution and ellipsis (e.g., ~/projects/very-long.../sub).
Zombie processes: Processes in zombie state may be detected by the scanner but have no readable env vars. Show with ❓ status; do not crash or spin.
High-frequency events: If 20 sessions each produce 10 events/second (200 events/sec total), the event stream must not freeze the TUI. Events should be buffered and rendered at the configured refresh rate (default 500ms).
Config file changes while running: cc-top does not hot-reload config. Changes require restart. This should be documented but not enforced.
Model not in context limit map: If an api_request references a model not in [models] config, the context pressure alert cannot fire for it. Log a one-time warning.
Concurrent settings.json writes: If another tool writes to settings.json while cc-top is writing (race condition), cc-top should use file locking or atomic write (write to temp, rename).
Empty event buffer: On first startup with no events yet, the event stream panel should show a placeholder message, not an empty blank area.
Session with zero cost: Sessions that have been active but produced $0.00 cost (e.g., all cache hits) should display $0.00, not be hidden.
Negative cost deltas: If cumulative counters reset (Claude Code restart), the delta calculation could produce negative values. Treat negative deltas as counter resets: set previous value to 0 and calculate rate from there.
Kill switch on exited process: If the user selects a session that has already exited and presses Ctrl+K, display "Session already exited" and do not send signals.
Terminal resize: When the user resizes their terminal window, all TUI panels must re-layout correctly without data loss or crash.
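Among the edge cases above, the counter-reset rule is the one most easily stated in code. In this sketch counterDelta is a hypothetical name; it applies the rule exactly as written, treating a drop in the cumulative value as a restart from zero.

```go
package main

// counterDelta returns the increase of a cumulative counter between two
// readings. When the current reading is below the previous one, the counter
// is assumed to have reset, so the previous value is treated as 0.
func counterDelta(previous, current float64) float64 {
	if current < previous {
		return current
	}
	return current - previous
}
```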

BDD Scenarios

Feature: OTLP Receiver

Background

Given cc-top is started with default configuration
And the gRPC receiver is listening on localhost:4317
And the HTTP receiver is listening on localhost:4318

Scenario: Receive metrics via gRPC

Traces to: User Story 1, Acceptance Scenario 1
Category: Happy Path

Given a Claude Code instance is configured to export OTLP via gRPC to localhost:4317
When the instance sends a claude_code.cost.usage metric with session.id = "sess-001"
Then cc-top's state store contains the cost metric indexed under session.id = "sess-001"
And the session appears in the session list

Scenario: Receive events via HTTP

Traces to: User Story 1, Acceptance Scenario 2
Category: Happy Path

Given a Claude Code instance is configured to export OTLP via HTTP to localhost:4318
When the instance sends a claude_code.api_request log event with session.id = "sess-002"
Then cc-top's state store contains the event indexed under session.id = "sess-002"

Scenario: Receive data on custom ports

Traces to: User Story 1, Acceptance Scenario 3
Category: Alternate Path

Given config.toml specifies grpc_port = 5317 and http_port = 5318
And cc-top is started with this config
When an OTLP metrics payload is sent to localhost:5317
Then cc-top receives and processes the metrics

Scenario: Port already in use on startup

Traces to: User Story 1, Acceptance Scenario 4
Category: Error Path

Given another process is listening on port 4317
When cc-top attempts to start
Then cc-top displays "Error: port 4317 already in use"
And cc-top exits with a non-zero exit code

Scenario: Malformed OTLP payload

Traces to: User Story 1, Acceptance Scenario 5
Category: Error Path

Given cc-top is running and receiving data
When a client sends a payload with invalid protobuf encoding
Then cc-top returns an OTLP error response to the client
And cc-top logs the parse error
And cc-top continues to accept subsequent valid requests

Scenario: No data received yet

Traces to: User Story 1, Acceptance Scenario 6
Category: Edge Case

Given cc-top has been running for 30 seconds
And no Claude Code instances have sent any OTLP data
When the TUI renders
Then the event stream shows "No data received yet"
And the burn rate odometer shows $0.00

Feature: Process Discovery

Scenario: Detect Claude binary process

Traces to: User Story 2, Acceptance Scenario 1
Category: Happy Path

Given a process named claude is running with PID 4821
And the process is owned by the current user
When cc-top performs a process scan
Then PID 4821 appears in the session list
And the terminal type is detected (e.g., "iTerm2")
And the CWD is detected (e.g., "~/myapp")

Scenario: Detect Node-based Claude Code process

Traces to: User Story 2, Acceptance Scenario 2
Category: Alternate Path

Given a node process is running with @anthropic-ai/claude-code in its command-line arguments
When cc-top performs a process scan
Then the process is identified as a Claude Code instance

Scenario Outline: Telemetry status classification

Traces to: User Story 2, Acceptance Scenarios 3-5
Category: Happy Path

Given a Claude Code process with CLAUDE_CODE_ENABLE_TELEMETRY = <telemetry> and OTEL_EXPORTER_OTLP_ENDPOINT = <endpoint>
When cc-top classifies telemetry status
Then the status icon is <icon> and label is <label>

Examples:

telemetry	endpoint	icon	label
`1`	`http://localhost:4317`	✅	Connected
`1`	`http://localhost:9090`	⚠️	Wrong port
`1`	(not set)	⚠️	Console only
`0`	(any)	❌	No telemetry
(not set)	(any)	❌	No telemetry

Scenario: New process appears between scans

Traces to: User Story 2, Acceptance Scenario 6
Category: Happy Path

Given cc-top has completed an initial scan showing 2 processes
And a new Claude Code process starts with PID 7001
When the next 5-second scan cycle completes
Then PID 7001 appears in the session list with a "New" badge

Scenario: Process exits and remains in list

Traces to: User Story 2, Acceptance Scenario 7
Category: Alternate Path

Given a Claude Code process PID 4821 is shown in the session list with $1.50 total cost
When the process exits
And the next scan cycle completes
Then PID 4821 remains in the list marked "Exited"
And the final cost of $1.50 is preserved

Scenario: Unreadable process environment

Traces to: User Story 2, Acceptance Scenario 8
Category: Error Path

Given a Claude Code process whose environment variables cannot be read (zombie or permission denied)
When cc-top performs a scan
Then the process appears with status "Unknown" and a ❓ icon
And cc-top does not crash or hang

Feature: PID-to-Session Correlation

Scenario: Port fingerprinting correlates PID to session

Traces to: User Story 3, Acceptance Scenario 1
Category: Happy Path

Given a Claude Code process PID 4821 has an open socket to cc-top from ephemeral port 52345
When an OTLP request arrives from source port 52345 carrying session.id = "sess-abc"
Then cc-top records the mapping PID 4821 ↔ session "sess-abc"
And subsequent metrics for "sess-abc" are attributed to PID 4821

Scenario: Two sessions are independently correlated

Traces to: User Story 3, Acceptance Scenario 2 Category: Happy Path

Given PID 4821 is correlated to session "sess-abc"
And PID 5102 is correlated to session "sess-def"
When the session list renders
Then PID 4821's row shows only "sess-abc" metrics
And PID 5102's row shows only "sess-def" metrics

Scenario: Timing heuristic fallback

Traces to: User Story 3, Acceptance Scenario 3 Category: Alternate Path

Given a new PID 6200 appears in the process scanner
And port fingerprinting cannot determine the source port
When a new session.id = "sess-xyz" starts sending data within 10 seconds of PID 6200's appearance
Then cc-top uses the timing heuristic to match PID 6200 to session "sess-xyz"

Scenario: Process restart creates new correlation

Traces to: User Story 3, Acceptance Scenario 4 Category: Alternate Path

Given PID 4821 was correlated to session "sess-abc" and is now marked "Exited"
When a new Claude Code process PID 7500 starts and sends data as session "sess-new"
Then PID 7500 is correlated to "sess-new"
And PID 4821 retains its "Exited" status with "sess-abc" data preserved

Scenario: Uncorrelated OTLP session

Traces to: User Story 3, Acceptance Scenario 5 Category: Edge Case

Given an OTLP session "sess-orphan" is sending data
And no PID match is found via port fingerprinting or timing heuristic
When the session list renders
Then "sess-orphan" appears with "PID: —" and its data is still visible
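
The correlation scenarios above describe a two-step lookup: port fingerprinting first, then the timing heuristic, and otherwise no PID. The following is a minimal in-memory sketch of that order. All type and method names are hypothetical, how the source-port-to-PID table is populated (for example, by a socket scan) is outside the sketch, and picking the most recently seen unclaimed PID when several qualify is an assumption the spec does not state.

```go
package main

import (
	"fmt"
	"time"
)

// correlator is a hypothetical matcher between PIDs and OTLP sessions.
type correlator struct {
	window       time.Duration     // timing-heuristic window (10s in the scenario)
	now          func() time.Time  // injectable clock
	portToPID    map[int]int       // ephemeral source port -> owning PID
	firstSeen    map[int]time.Time // PID -> when the process scanner first saw it
	sessionToPID map[string]int
	pidToSession map[int]string
}

func newCorrelator() *correlator {
	return &correlator{
		window:       10 * time.Second,
		now:          time.Now,
		portToPID:    map[int]int{},
		firstSeen:    map[int]time.Time{},
		sessionToPID: map[string]int{},
		pidToSession: map[int]string{},
	}
}

func (c *correlator) socketSeen(port, pid int) { c.portToPID[port] = pid }

func (c *correlator) processSeen(pid int) {
	if _, ok := c.firstSeen[pid]; !ok {
		c.firstSeen[pid] = c.now()
	}
}

func (c *correlator) bind(sessionID string, pid int) {
	c.sessionToPID[sessionID] = pid
	c.pidToSession[pid] = sessionID
}

// observe handles one OTLP request and returns the correlated PID, if any.
// sourcePort is 0 when the source port could not be determined.
func (c *correlator) observe(sessionID string, sourcePort int) (int, bool) {
	if pid, ok := c.sessionToPID[sessionID]; ok {
		return pid, true
	}
	// Step 1: port fingerprinting.
	if pid, ok := c.portToPID[sourcePort]; ok {
		c.bind(sessionID, pid)
		return pid, true
	}
	// Step 2: timing heuristic over PIDs that have no session yet.
	best, bestAge, found := 0, c.window, false
	for pid, seen := range c.firstSeen {
		if _, taken := c.pidToSession[pid]; taken {
			continue
		}
		if age := c.now().Sub(seen); age >= 0 && age <= bestAge {
			best, bestAge, found = pid, age, true
		}
	}
	if found {
		c.bind(sessionID, best)
	}
	return best, found
}

func main() {
	c := newCorrelator()
	c.socketSeen(52345, 4821)
	fmt.Println(c.observe("sess-abc", 52345))
	fmt.Println(c.observe("sess-orphan", 0))
}
```

A session for which `observe` returns `false` corresponds to the uncorrelated row: its data is still stored under the session ID, only the PID column stays empty.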

Feature: Settings Merge

Scenario: Add OTel keys to existing settings file

Traces to: User Story 4, Acceptance Scenario 1 Category: Happy Path

Given ~/.claude/settings.json contains {"env": {"MY_VAR": "keep"}, "other_key": true}
When the user runs cc-top --setup
Then the file now contains OTel keys in the "env" block
And MY_VAR is still "keep"
And other_key is still true

Scenario: Create settings file when absent

Traces to: User Story 4, Acceptance Scenario 2 Category: Alternate Path

Given ~/.claude/settings.json does not exist
And ~/.claude/ directory exists
When the user runs cc-top --setup
Then ~/.claude/settings.json is created with the required OTel env vars

Scenario: Prompt before overwriting different value (interactive)

Traces to: User Story 4, Acceptance Scenario 3 Category: Alternate Path

Given ~/.claude/settings.json has "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:9090"
When the user presses [E] in the TUI
Then cc-top shows "OTEL_EXPORTER_OTLP_ENDPOINT is set to http://localhost:9090, overwrite to http://localhost:4317? [y/N]"

Scenario: Skip overwrite in non-interactive mode

Traces to: User Story 4, Acceptance Scenario 3 Category: Alternate Path

Given ~/.claude/settings.json has "OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:9090"
When the user runs cc-top --setup (non-interactive)
Then cc-top prints a warning about the differing value
And the value is NOT overwritten

Scenario: Already configured — no changes

Traces to: User Story 4, Acceptance Scenario 4 Category: Happy Path

Given ~/.claude/settings.json already contains all correct OTel env vars
When the user runs cc-top --setup
Then no file write occurs
And the message "Already configured" is displayed

Scenario: Malformed JSON in settings file

Traces to: User Story 4, Acceptance Scenario 5 Category: Error Path

Given ~/.claude/settings.json contains {invalid json
When the user runs cc-top --setup
Then cc-top creates a backup at ~/.claude/settings.json.bak
And displays "Error: settings.json contains invalid JSON. Backup saved."
And does not write to the original file

Scenario: Permission denied writing settings

Traces to: User Story 4, Acceptance Scenario 6 Category: Error Path

Given ~/.claude/settings.json exists but is read-only
When the user runs cc-top --setup
Then cc-top displays "Error: permission denied writing to ~/.claude/settings.json"
And does not crash

Scenario: Preserve original indentation

Traces to: User Story 4, Acceptance Scenario 7 Category: Edge Case

Given ~/.claude/settings.json uses 4-space indentation
When cc-top writes back after adding OTel keys
Then the output file uses 4-space indentation

Scenario: Fix wrong port only

Traces to: User Story 4, Acceptance Scenario 8 Category: Alternate Path

Given a session has "Wrong port" status pointing to :9090
When the user presses [F] on the startup screen
Then only OTEL_EXPORTER_OTLP_ENDPOINT is updated in settings.json
And all other keys remain unchanged
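
The merge rules in the scenarios above (add missing keys, keep unrelated keys, never silently overwrite a differing value, keep the indent width, reject malformed JSON) can be sketched with the standard library alone. This is an illustration under stated assumptions: `mergeEnv` and `detectIndent` are hypothetical names, only the two environment variables named in this spec are used, and `encoding/json` re-serialises the document, so key order and other formatting details beyond the indent width are not preserved.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// detectIndent returns the leading whitespace of the first indented line,
// falling back to two spaces when the file has none.
func detectIndent(src []byte) string {
	for _, line := range strings.Split(string(src), "\n") {
		rest := strings.TrimLeft(line, " \t")
		if rest != "" && len(rest) < len(line) {
			return line[:len(line)-len(rest)]
		}
	}
	return "  "
}

// mergeEnv adds missing keys from want to the "env" object of src.
// Keys that exist with a different value are reported in conflicts and left
// untouched. When nothing changes, src is returned as-is so no write is needed.
func mergeEnv(src []byte, want map[string]string) (out []byte, conflicts []string, changed bool, err error) {
	doc := map[string]any{}
	if len(bytes.TrimSpace(src)) > 0 {
		if err := json.Unmarshal(src, &doc); err != nil {
			return nil, nil, false, fmt.Errorf("settings.json contains invalid JSON: %w", err)
		}
	}
	if doc == nil { // the file held a bare JSON null
		doc = map[string]any{}
	}
	env, _ := doc["env"].(map[string]any)
	if env == nil {
		env = map[string]any{}
	}
	for key, value := range want {
		current, exists := env[key]
		switch {
		case !exists:
			env[key] = value
			changed = true
		case current != value:
			conflicts = append(conflicts, key)
		}
	}
	sort.Strings(conflicts)
	if !changed {
		return src, conflicts, false, nil
	}
	doc["env"] = env
	out, err = json.MarshalIndent(doc, "", detectIndent(src))
	return out, conflicts, true, err
}

func main() {
	src := []byte("{\n    \"env\": {\"MY_VAR\": \"keep\"},\n    \"other_key\": true\n}")
	want := map[string]string{
		"CLAUDE_CODE_ENABLE_TELEMETRY": "1",
		"OTEL_EXPORTER_OTLP_ENDPOINT":  "http://localhost:4317",
	}
	out, conflicts, changed, err := mergeEnv(src, want)
	fmt.Println(string(out), conflicts, changed, err)
}
```

The caller decides what to do with `conflicts`: prompt in interactive mode, warn and skip in non-interactive mode. Backup creation and the actual file write are deliberately left out of the sketch.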

Feature: Session List Panel

Scenario: Full session row rendering

Traces to: User Story 5, Acceptance Scenario 1 Category: Happy Path

Given 3 sessions are connected with OTLP data flowing
When the session list panel renders
Then each row shows PID, truncated session ID, terminal type, CWD, telemetry icon, model name, status, running cost, token count, and active time

Scenario Outline: Session activity status

Traces to: User Story 5, Acceptance Scenarios 2-4 Category: Happy Path

Given a session's last event was <time_ago> ago
When the session list renders
Then the session status shows <status>

Examples:

time_ago	status
10 seconds	active
2 minutes	idle
10 minutes	done
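
A minimal sketch of the status rule, using the 30-second and 5-minute boundaries listed in the test plan. `sessionStatus` is a hypothetical name, and treating the boundaries as inclusive is an assumption, since the spec does not say which side an event at exactly 30 seconds or 5 minutes falls on.

```go
package main

import (
	"fmt"
	"time"
)

const (
	activeWithin = 30 * time.Second // last event at most this old: active
	idleWithin   = 5 * time.Minute  // last event at most this old: idle
)

// sessionStatus classifies a session by the time since its last event.
func sessionStatus(sinceLastEvent time.Duration) string {
	switch {
	case sinceLastEvent <= activeWithin:
		return "active"
	case sinceLastEvent <= idleWithin:
		return "idle"
	default:
		return "done"
	}
}

func main() {
	for _, d := range []time.Duration{10 * time.Second, 2 * time.Minute, 10 * time.Minute} {
		fmt.Println(d, sessionStatus(d))
	}
}
```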

Scenario: Exited process retains stats

Traces to: User Story 5, Acceptance Scenario 5 Category: Alternate Path

Given a session has accumulated $2.50 cost and 50k tokens
When the process exits
Then the session row shows "exited" status, $2.50 cost, and 50k tokens

Scenario: Non-telemetry sessions greyed at bottom

Traces to: User Story 5, Acceptance Scenario 6 Category: Alternate Path

Given 2 sessions have telemetry and 1 does not
When the session list renders
Then the non-telemetry session appears greyed out below the telemetry sessions

Scenario: Global aggregate view

Traces to: User Story 5, Acceptance Scenario 7 Category: Happy Path

Given no session is selected
When the dashboard renders
Then the burn rate shows aggregate cost from all sessions
And the event stream shows events from all sessions

Scenario: Select session to focus panels

Traces to: User Story 5, Acceptance Scenario 8 Category: Happy Path

Given 3 sessions are listed
When the user navigates with ↑/↓ and presses Enter on session "sess-abc"
Then the event stream filters to "sess-abc" events only
And the burn rate shows "sess-abc" cost only

Scenario: Esc returns to global view

Traces to: User Story 5, Acceptance Scenario 9 Category: Happy Path

Given session "sess-abc" is selected
When the user presses Esc
Then all panels return to aggregate view showing all sessions

Feature: Burn Rate Odometer

Scenario: Total session cost display

Traces to: User Story 6, Acceptance Scenario 1 Category: Happy Path

Given two connected sessions with cost.usage of $1.00 and $0.50
When the burn rate panel renders in global view
Then Total Session Cost shows $1.50

Scenario: Rolling hourly rate calculation

Traces to: User Story 6, Acceptance Scenario 2 Category: Happy Path

Given $0.25 of cost has been incurred in the last 5 minutes
When the burn rate panel calculates $/hour
Then $/hour shows $3.00 ($0.25 × 12, since one hour contains twelve 5-minute windows)

Scenario: Trend indicator direction

Traces to: User Story 6, Acceptance Scenario 3 Category: Happy Path

Given the current 5-minute cost window is $0.30
And the previous 5-minute window was $0.20
When the panel renders
Then an up-arrow trend indicator is displayed
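
The rolling rate and trend scenarios reduce to two small calculations, sketched below. `hourlyRate`, `trend`, and `rateWindow` are hypothetical names, and the down and flat glyphs are assumptions: the spec only names the up-arrow.

```go
package main

import (
	"fmt"
	"time"
)

// rateWindow is the rolling window named in the scenario.
const rateWindow = 5 * time.Minute

// hourlyRate extrapolates the cost seen in one window to a full hour.
func hourlyRate(costInWindow float64, window time.Duration) float64 {
	if window <= 0 {
		return 0
	}
	return costInWindow * float64(time.Hour) / float64(window)
}

// trend compares the current window with the previous one.
func trend(current, previous float64) string {
	switch {
	case current > previous:
		return "↑"
	case current < previous:
		return "↓"
	default:
		return "→"
	}
}

func main() {
	// Prints "$3.00/hour ↑" for the values used in the scenarios.
	fmt.Printf("$%.2f/hour %s\n", hourlyRate(0.25, rateWindow), trend(0.30, 0.20))
}
```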

Scenario Outline: Burn rate colour thresholds

Traces to: User Story 6, Acceptance Scenarios 4-6 Category: Happy Path

Given the $/hour rate is <rate>
And default colour thresholds are configured
When the burn rate odometer renders
Then the counter colour is <colour>

Examples:

rate	colour
$0.25	green
$1.00	yellow
$2.00	red
$5.00	red

Scenario: Custom colour thresholds

Traces to: User Story 6, Acceptance Scenario 7 Category: Alternate Path

Given config.toml sets cost_color_green_below = 1.00 and cost_color_yellow_below = 5.00
And the $/hour rate is $3.00
When the panel renders
Then the counter colour is yellow (between custom green and yellow thresholds)
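
The colour rule can be sketched as two exclusive upper bounds. The default values below are an assumption chosen only because they reproduce the examples table ($0.25 green, $1.00 yellow, $2.00 and $5.00 red); this section does not state the real defaults. `colourFor` and `colourThresholds` are hypothetical names.

```go
package main

import "fmt"

// colourThresholds mirrors the two config keys used in the scenario.
type colourThresholds struct {
	GreenBelow  float64 // cost_color_green_below
	YellowBelow float64 // cost_color_yellow_below
}

// assumedDefaults is NOT taken from the spec; it is one pair of values
// that is consistent with the examples table.
var assumedDefaults = colourThresholds{GreenBelow: 0.50, YellowBelow: 2.00}

// colourFor returns the counter colour for a $/hour rate.
func colourFor(ratePerHour float64, t colourThresholds) string {
	switch {
	case ratePerHour < t.GreenBelow:
		return "green"
	case ratePerHour < t.YellowBelow:
		return "yellow"
	default:
		return "red"
	}
}

func main() {
	for _, r := range []float64{0.25, 1.00, 2.00, 5.00} {
		fmt.Printf("$%.2f -> %s\n", r, colourFor(r, assumedDefaults))
	}
	custom := colourThresholds{GreenBelow: 1.00, YellowBelow: 5.00}
	fmt.Println(colourFor(3.00, custom))
}
```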

Scenario: Token velocity display

Traces to: User Story 6, Acceptance Scenario 8 Category: Happy Path

Given token.usage counter increased by 5000 tokens in the last minute
When the burn rate panel renders
Then token velocity shows "5,000 tokens/min"

Feature: Event Stream

Scenario: User prompt event rendering

Traces to: User Story 7, Acceptance Scenario 1 Category: Happy Path

Given a user_prompt event arrives for session "sess-abc" with prompt_length = 342
When the event stream renders
Then it shows "[sess-abc] Prompt (342 chars)"

Scenario: Successful tool result rendering

Traces to: User Story 7, Acceptance Scenario 2 Category: Happy Path

Given a tool_result event arrives with tool_name = "Bash", success = true, duration_ms = 1200
When the event stream renders
Then it shows "[session] Bash ✓ (1.2s)"

Scenario: Rejected tool result rendering

Traces to: User Story 7, Acceptance Scenario 3 Category: Alternate Path

Given a tool_result event arrives with tool_name = "Edit", success = false, decision = "reject"
When the event stream renders
Then it shows "[session] Edit ✗ rejected by user"

Scenario: API request event rendering

Traces to: User Story 7, Acceptance Scenario 4 Category: Happy Path

Given an api_request event with model = "sonnet-4.5", input_tokens = 2100, output_tokens = 890, cost_usd = 0.03, duration_ms = 4200
When the event stream renders
Then it shows "[session] sonnet-4.5 → 2.1k in / 890 out ($0.03) 4.2s"

Scenario: API error event rendering

Traces to: User Story 7, Acceptance Scenario 5 Category: Error Path

Given an api_error event with status_code = 529, error = "overloaded", attempt = 2
When the event stream renders
Then it shows "[session] 529 overloaded (attempt 2)"

Scenario: Tool decision event rendering

Traces to: User Story 7, Acceptance Scenario 6 Category: Happy Path

Given a tool_decision event with tool_name = "Write", decision = "accept", source = "config"
When the event stream renders
Then it shows "[session] Write accepted (config)"
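
The rendering scenarios above fix the exact output strings, so the formatters are easy to sketch. The function names and parameter lists are hypothetical, and the fallback line for a failed tool result that was not rejected is an assumption, because no scenario covers it.

```go
package main

import (
	"fmt"
	"strconv"
)

// compactCount renders 2100 as "2.1k" and leaves values below 1000 as-is.
func compactCount(n int) string {
	if n >= 1000 {
		return fmt.Sprintf("%.1fk", float64(n)/1000)
	}
	return strconv.Itoa(n)
}

// seconds renders a millisecond duration with one decimal, e.g. "4.2s".
func seconds(ms int) string {
	return fmt.Sprintf("%.1fs", float64(ms)/1000)
}

func formatAPIRequest(session, model string, in, out int, costUSD float64, durationMs int) string {
	return fmt.Sprintf("[%s] %s → %s in / %s out ($%.2f) %s",
		session, model, compactCount(in), compactCount(out), costUSD, seconds(durationMs))
}

func formatToolResult(session, tool string, success bool, decision string, durationMs int) string {
	switch {
	case success:
		return fmt.Sprintf("[%s] %s ✓ (%s)", session, tool, seconds(durationMs))
	case decision == "reject":
		return fmt.Sprintf("[%s] %s ✗ rejected by user", session, tool)
	default:
		// Assumed fallback: the spec does not define this case.
		return fmt.Sprintf("[%s] %s ✗", session, tool)
	}
}

func formatAPIError(session string, statusCode int, errMsg string, attempt int) string {
	return fmt.Sprintf("[%s] %d %s (attempt %d)", session, statusCode, errMsg, attempt)
}

func main() {
	fmt.Println(formatAPIRequest("session", "sonnet-4.5", 2100, 890, 0.03, 4200))
	fmt.Println(formatToolResult("session", "Bash", true, "", 1200))
	fmt.Println(formatAPIError("session", 529, "overloaded", 2))
}
```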

Scenario: Filter events by type

Traces to: User Story 7, Acceptance Scenario 7 Category: Happy Path

Given the event stream contains mixed event types
When the user presses f and selects "api_error" filter
Then only api_error events are displayed

Scenario: Event buffer eviction

Traces to: User Story 7, Acceptance Scenario 8 Category: Edge Case

Given the event buffer is full at 1000 events
When event 1001 arrives
Then the oldest event is removed
And event 1001 is added to the buffer
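
Eviction of the oldest event at capacity is the behaviour of a fixed-size ring buffer. A minimal sketch follows; `eventBuffer` and the `event` struct with a single `Seq` field are hypothetical and stand in for whatever event type the application uses.

```go
package main

import "fmt"

// event is a placeholder for the real event type.
type event struct {
	Seq int
}

// eventBuffer is a fixed-capacity ring buffer that overwrites the oldest entry.
type eventBuffer struct {
	items []event
	start int // index of the oldest element
	size  int // number of stored elements
}

func newEventBuffer(capacity int) *eventBuffer {
	return &eventBuffer{items: make([]event, capacity)}
}

// push appends ev, evicting the oldest event when the buffer is full.
func (b *eventBuffer) push(ev event) {
	if len(b.items) == 0 {
		return
	}
	if b.size < len(b.items) {
		b.items[(b.start+b.size)%len(b.items)] = ev
		b.size++
		return
	}
	b.items[b.start] = ev
	b.start = (b.start + 1) % len(b.items)
}

// snapshot returns the stored events, oldest first.
func (b *eventBuffer) snapshot() []event {
	out := make([]event, 0, b.size)
	for i := 0; i < b.size; i++ {
		out = append(out, b.items[(b.start+i)%len(b.items)])
	}
	return out
}

func main() {
	b := newEventBuffer(1000)
	for i := 1; i <= 1001; i++ {
		b.push(event{Seq: i})
	}
	s := b.snapshot()
	fmt.Println(len(s), s[0].Seq, s[len(s)-1].Seq) // prints "1000 2 1001"
}
```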

Scenario: Session-filtered event stream

Traces to: User Story 7, Acceptance Scenario 9 Category: Alternate Path

Given sessions "sess-abc" and "sess-def" are producing events
When the user selects "sess-abc" in the session list
Then only events with session.id = "sess-abc" appear in the stream

Feature: Alert Engine

Scenario: Cost surge alert fires

Traces to: User Story 8, Acceptance Scenario 1 Category: Happy Path

Given the cost surge threshold is $2/hr (default)
And the current $/hour rate exceeds $2.00
When the alert engine evaluates
Then a "Cost Surge" alert appears in the alerts panel
And the alert includes the current rate value

Scenario: Runaway tokens alert fires

Traces to: User Story 8, Acceptance Scenario 2 Category: Happy Path

Given the runaway token threshold is 50,000 tokens/min
And token velocity has exceeded 50,000 tokens/min for 3 consecutive minutes
When the alert engine evaluates
Then a "Runaway Tokens" alert fires

Scenario: Loop detector fires on repeated bash failures

Traces to: User Story 8, Acceptance Scenario 3 Category: Happy Path

Given session "sess-abc" has produced 3 tool_result events in the last 5 minutes
And all have tool_name = "Bash", success = false, and the same bash_command hash
When the alert engine evaluates
Then a "Loop Detector" alert fires for session "sess-abc"
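
A minimal sketch of the loop detector's counting rule: failed Bash results are grouped per session and command hash, and the alert fires once three fall inside the 5-minute window. All names are hypothetical. The hash here only collapses whitespace before hashing; the prefix-matching normalisation described in the next scenario is not specified in enough detail to reproduce and is left out.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
	"time"
)

// commandHash hashes a command after collapsing runs of whitespace.
func commandHash(cmd string) string {
	sum := sha256.Sum256([]byte(strings.Join(strings.Fields(cmd), " ")))
	return hex.EncodeToString(sum[:])
}

type loopDetector struct {
	window    time.Duration
	threshold int
	now       func() time.Time       // injectable clock
	failures  map[string][]time.Time // key: session + NUL + command hash
}

func newLoopDetector() *loopDetector {
	return &loopDetector{
		window:    5 * time.Minute,
		threshold: 3,
		now:       time.Now,
		failures:  map[string][]time.Time{},
	}
}

// record registers one tool_result event and reports whether the
// "Loop Detector" alert should fire for that session.
func (d *loopDetector) record(session, tool string, success bool, command string) bool {
	if tool != "Bash" || success {
		return false
	}
	key := session + "\x00" + commandHash(command)
	now := d.now()
	cutoff := now.Add(-d.window)
	var recent []time.Time
	for _, t := range d.failures[key] {
		if t.After(cutoff) {
			recent = append(recent, t)
		}
	}
	recent = append(recent, now)
	d.failures[key] = recent
	return len(recent) >= d.threshold
}

func main() {
	d := newLoopDetector()
	for i := 0; i < 3; i++ {
		fmt.Println(d.record("sess-abc", "Bash", false, "npm test"))
	}
}
```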

Scenario: Loop detector normalizes similar commands

Traces to: User Story 8, Acceptance Scenario 4 Category: Alternate Path

Given session "sess-abc" has 3 failed Bash events with commands npm test, npm run test, and npx jest
When the loop detector normalizes commands via prefix matching before hashing
Then these are treated as the same command
And the loop detector fires

Scenario: Error storm alert fires

Traces to: User Story 8, Acceptance Scenario 5 Category: Happy Path

Given the error storm threshold is 10 errors per minute
And 11 api_error events have occurred in the last 60 seconds
When the alert engine evaluates
Then an "Error Storm" alert fires

Scenario: Stale session alert fires

Traces to: User Story 8, Acceptance Scenario 6 Category: Happy Path

Given the stale session threshold is 2 hours
And session "sess-abc" has been active for 2.5 hours with zero user_prompt events
When the alert engine evaluates
Then a "Stale Session" alert fires for "sess-abc"

Scenario: Context pressure alert fires

Traces to: User Story 8, Acceptance Scenario 7 Category: Happy Path

Given the context pressure threshold is 80%
And model "claude-sonnet-4-5-20250929" has a context limit of 200,000 tokens
And an api_request event has input_tokens = 165000 (82.5%)
When the alert engine evaluates
Then a "Context Pressure" alert fires
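
The context pressure check is a ratio against a per-model limit. The sketch below also covers the unknown-model edge case specified later in this feature (no alert, one warning per model). `contextChecker` and its methods are hypothetical names, and returning the warning text to the caller instead of logging it directly is a choice made to keep the example self-contained.

```go
package main

import "fmt"

type contextChecker struct {
	limits    map[string]int // model -> context limit in tokens
	threshold float64        // fraction of the limit, 0.80 by default
	warned    map[string]bool
}

func newContextChecker(limits map[string]int) *contextChecker {
	return &contextChecker{limits: limits, threshold: 0.80, warned: map[string]bool{}}
}

// check reports whether a "Context Pressure" alert should fire for one
// api_request. warning is non-empty only the first time an unknown model is seen.
func (c *contextChecker) check(model string, inputTokens int) (fires bool, warning string) {
	limit, ok := c.limits[model]
	if !ok || limit <= 0 {
		if !c.warned[model] {
			c.warned[model] = true
			warning = "Unknown model context limit: " + model
		}
		return false, warning
	}
	return float64(inputTokens)/float64(limit) > c.threshold, ""
}

func main() {
	c := newContextChecker(map[string]int{"claude-sonnet-4-5-20250929": 200000})
	fmt.Println(c.check("claude-sonnet-4-5-20250929", 165000)) // 82.5% of the limit
	fmt.Println(c.check("claude-experimental-v2", 100000))
}
```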

Scenario: High rejection rate alert fires

Traces to: User Story 8, Acceptance Scenario 8 Category: Happy Path

Given 6 of the 10 tool_decision events in the last 5 minutes have decision = "reject"
When the alert engine evaluates
Then a "High Rejection Rate" alert fires (60% > 50% threshold)

Scenario: System notification sent when enabled

Traces to: User Story 8, Acceptance Scenario 9 Category: Happy Path

Given system_notify = true in config.toml
When a "Cost Surge" alert fires
Then an osascript display notification is executed with the alert text

Scenario: System notification suppressed when disabled

Traces to: User Story 8, Acceptance Scenario 10 Category: Alternate Path

Given system_notify = false in config.toml
When a "Cost Surge" alert fires
Then no osascript is executed
And the alert still appears in the TUI panel

Scenario: Alert thresholds respect custom config

Traces to: User Story 8, Acceptance Scenario 11 Category: Alternate Path

Given config.toml sets cost_surge_threshold_per_hour = 5.00
And the current $/hour rate is $3.00
When the alert engine evaluates
Then no "Cost Surge" alert fires (below custom threshold)

Scenario: Model not in context limit map

Traces to: User Story 8, Edge Case (unknown model) Category: Edge Case

Given an api_request event references model "claude-experimental-v2"
And that model is not in the [models] config section
When the alert engine evaluates for context pressure
Then no context pressure alert fires for that request
And a one-time warning is logged: "Unknown model context limit: claude-experimental-v2"

Feature: Stats Dashboard

Scenario: Lines of code display

Traces to: User Story 9, Acceptance Scenario 1 Category: Happy Path

Given lines_of_code.count metrics with type=added totalling 150 and type=removed totalling 30
When the stats dashboard renders
Then it shows "Lines added: 150" and "Lines removed: 30"

Scenario: Commits and PRs display

Traces to: User Story 9, Acceptance Scenario 2 Category: Happy Path

Given commit.count = 5 and pull_request.count = 2
When the stats dashboard renders
Then it shows "Commits: 5" and "PRs: 2"

Scenario: Tool acceptance rate display

Traces to: User Story 9, Acceptance Scenario 3 Category: Happy Path

Given code_edit_tool.decision metrics: Edit accept=8, reject=2; Write accept=5, reject=0
When the stats dashboard renders
Then it shows "Edit: 80% accepted" and "Write: 100% accepted"

Scenario: Cache efficiency calculation

Traces to: User Story 9, Acceptance Scenario 4 Category: Happy Path

Given token.usage with type=cacheRead = 80,000 and type=input = 20,000
When the stats dashboard renders
Then cache efficiency shows "80%" (80000 / (20000 + 80000))
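
The cache efficiency formula from the scenario, written out as a sketch. `cacheEfficiency` is a hypothetical name, and returning 0 when no tokens have been recorded is an assumption that avoids a division by zero.

```go
package main

import "fmt"

// cacheEfficiency returns cacheRead / (input + cacheRead) as a percentage.
func cacheEfficiency(cacheRead, input int64) float64 {
	total := cacheRead + input
	if total == 0 {
		return 0
	}
	return 100 * float64(cacheRead) / float64(total)
}

func main() {
	fmt.Printf("%.0f%%\n", cacheEfficiency(80_000, 20_000)) // prints "80%"
}
```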

Scenario: Average API latency

Traces to: User Story 9, Acceptance Scenario 5 Category: Happy Path

Given 10 api_request events with duration_ms values averaging 3500
When the stats dashboard renders
Then average API latency shows "3.5s"

Scenario: Model breakdown

Traces to: User Story 9, Acceptance Scenario 6 Category: Happy Path

Given cost data for "claude-sonnet-4-5" ($1.00) and "claude-haiku-4-5" ($0.20)
When the stats dashboard renders
Then model breakdown shows each model with its cost and token totals

Scenario: Top tools ranking

Traces to: User Story 9, Acceptance Scenario 7 Category: Happy Path

Given tool_result events: Bash (50), Edit (30), Read (20), Write (10)
When the stats dashboard renders
Then tools are listed in descending order: Bash, Edit, Read, Write

Scenario: Error rate display

Traces to: User Story 9, Acceptance Scenario 8 Category: Happy Path

Given 100 api_request events and 5 api_error events
When the stats dashboard renders
Then error rate shows "5.0%"

Scenario: Tab toggles between dashboard and stats

Traces to: User Story 9, Acceptance Scenario 9 Category: Happy Path

Given the user is viewing the main dashboard
When the user presses Tab
Then the full-screen stats dashboard appears
And pressing Tab again returns to the main dashboard

Feature: Kill Switch

Scenario: Freeze and kill a session

Traces to: User Story 10, Acceptance Scenarios 1-3 Category: Happy Path

Given session "sess-abc" (PID 4821, CWD ~/myapp) is selected
When the user presses Ctrl+K
Then SIGSTOP is sent to PID 4821's process group
And a dialog shows "Kill session sess-abc (PID 4821, ~/myapp)? [Y/n]"
When the user presses Y
Then SIGKILL is sent to PID 4821's process group
And the session is marked "Exited"

Scenario: Cancel kill resumes process

Traces to: User Story 10, Acceptance Scenario 4 Category: Alternate Path

Given PID 4821 has been sent SIGSTOP and the confirmation dialog is showing
When the user presses n
Then SIGCONT is sent to PID 4821's process group
And the dialog closes
And the session resumes normal operation

Scenario: Kill from global view shows picker

Traces to: User Story 10, Acceptance Scenario 5 Category: Alternate Path

Given no session is selected (global view)
And 3 active sessions exist
When the user presses Ctrl+K
Then a session picker appears listing the 3 active sessions

Scenario: Kill switch on already-exited process

Traces to: User Story 10, Acceptance Scenario 6 Category: Error Path

Given PID 4821 exited between the SIGSTOP send and the user confirming
When the user presses Y to kill
Then cc-top receives a "no such process" (ESRCH) error
And handles it gracefully by marking the session "Exited"
But does not display a crash or error dialog
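
The freeze, confirm, and cancel flow above can be sketched with an injected signal sender, which keeps the logic testable without touching real processes. The names are hypothetical. A real sender on macOS or Linux could be `syscall.Kill(-pgid, sig)`, where the negative argument addresses the whole process group. The sketch treats ESRCH as "already exited" and not as a failure.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// sendFunc delivers a signal to a process group.
type sendFunc func(pgid int, sig syscall.Signal) error

// errNoSuchProcess is what kill(2) reports when the target has already exited.
var errNoSuchProcess error = syscall.ESRCH

// freeze pauses the process group while the confirmation dialog is open.
func freeze(send sendFunc, pgid int) error {
	return send(pgid, syscall.SIGSTOP)
}

// confirm applies the user's answer: SIGKILL on yes, SIGCONT on no.
// exited reports whether the session should now be marked "Exited".
func confirm(send sendFunc, pgid int, kill bool) (exited bool, err error) {
	sig := syscall.SIGCONT
	if kill {
		sig = syscall.SIGKILL
	}
	err = send(pgid, sig)
	switch {
	case errors.Is(err, syscall.ESRCH):
		return true, nil // process already gone: mark exited, show no error
	case err != nil:
		return false, err
	}
	return kill, nil
}

func signalName(sig syscall.Signal) string {
	switch sig {
	case syscall.SIGSTOP:
		return "SIGSTOP"
	case syscall.SIGCONT:
		return "SIGCONT"
	case syscall.SIGKILL:
		return "SIGKILL"
	}
	return sig.String()
}

// recorder is a fake sender that records signal names instead of sending them.
type recorder struct {
	sent []string
	fail error
}

func (r *recorder) send(pgid int, sig syscall.Signal) error {
	r.sent = append(r.sent, signalName(sig))
	return r.fail
}

func main() {
	r := &recorder{}
	_ = freeze(r.send, 4821)
	exited, err := confirm(r.send, 4821, true)
	fmt.Println(r.sent, exited, err)
}
```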

Scenario: Kill switch on already-exited session (pre-SIGSTOP)

Traces to: User Story 10, Edge Case (kill exited session) Category: Edge Case

Given session "sess-abc" is marked "Exited" in the session list
When the user selects it and presses Ctrl+K
Then cc-top displays "Session already exited"
And no signals are sent

Feature: Configuration

Scenario: Zero-config startup

Traces to: User Story 11, Acceptance Scenario 1 Category: Happy Path

Given no config file exists at ~/.config/cc-top/config.toml
When cc-top starts
Then the gRPC receiver listens on 4317 and the HTTP receiver on 4318, the scan interval is 5s, and all alert thresholds use their defaults
And the application runs normally

Scenario: Custom port configuration

Traces to: User Story 11, Acceptance Scenario 2 Category: Alternate Path

Given config.toml contains grpc_port = 5317
When cc-top starts
Then the gRPC receiver binds to port 5317

Scenario: Partial config with defaults

Traces to: User Story 11, Acceptance Scenario 3 Category: Happy Path

Given config.toml only sets [alerts] cost_surge_threshold_per_hour = 5.00
When cc-top starts
Then the cost surge threshold is $5.00
And all other settings use defaults (gRPC on 4317, scan interval 5s, etc.)

Scenario: Invalid config value

Traces to: User Story 11, Acceptance Scenario 4 Category: Error Path

Given config.toml contains grpc_port = -1
When cc-top starts
Then cc-top displays "Error: grpc_port must be between 1 and 65535"
And exits with a non-zero code
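
Defaults and port validation can be sketched independently of the TOML parser. The struct and function names below are hypothetical, and `http_port` is an assumed key name: this section only names `grpc_port`. The error text matches the scenario above.

```go
package main

import "fmt"

type config struct {
	GRPCPort     int
	HTTPPort     int
	ScanInterval int // seconds
}

// defaults returns the zero-config values named in this feature.
func defaults() config {
	return config{GRPCPort: 4317, HTTPPort: 4318, ScanInterval: 5}
}

// validate rejects port numbers outside the valid TCP range.
func (c config) validate() error {
	ports := []struct {
		key   string
		value int
	}{
		{"grpc_port", c.GRPCPort},
		{"http_port", c.HTTPPort}, // key name assumed
	}
	for _, p := range ports {
		if p.value < 1 || p.value > 65535 {
			return fmt.Errorf("%s must be between 1 and 65535", p.key)
		}
	}
	return nil
}

func main() {
	cfg := defaults()
	cfg.GRPCPort = -1 // as if config.toml contained grpc_port = -1
	if err := cfg.validate(); err != nil {
		fmt.Println("Error:", err)
	}
}
```

Starting from `defaults()` and overwriting only the keys present in the file gives the partial-config behaviour for free.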

Scenario: Unknown config key ignored

Traces to: User Story 11, Acceptance Scenario 5 Category: Edge Case

Given config.toml contains [receiver] unknown_key = true
When cc-top starts
Then the unknown key is ignored
And a warning is logged: "Unknown config key: receiver.unknown_key"

Scenario: Model context limits from config

Traces to: User Story 11, Acceptance Scenario 6 Category: Alternate Path

Given config.toml sets "claude-new-model" = 300000 under [models]
When an api_request for "claude-new-model" with input_tokens = 250000 arrives
Then the context pressure alert evaluates against the 300,000 limit and fires (83.3% exceeds the 80% threshold)

Feature: Startup Screen

Scenario: Startup screen displays process table

Traces to: User Story 12, Acceptance Scenario 1 Category: Happy Path

Given 3 Claude Code processes are running with mixed telemetry status
When cc-top starts and the startup screen renders
Then a table shows all 3 with PID, Terminal, CWD, Telemetry, OTLP Dest, and Status columns
And a summary line shows "N connected · N misconfigured · N have no telemetry"

Scenario: Enable telemetry for all

Traces to: User Story 12, Acceptance Scenario 2 Category: Happy Path

Given the startup screen shows 2 sessions with "No telemetry"
When the user presses [E]
Then OTel env vars are merged into ~/.claude/settings.json
And a message displays "Settings written. New Claude Code sessions will auto-connect. Existing sessions need restart."

Scenario: Fix misconfigured sessions

Traces to: User Story 12, Acceptance Scenario 3 Category: Alternate Path

Given the startup screen shows a session with "Wrong port"
When the user presses [F]
Then only the OTLP endpoint is updated in settings.json

Scenario: Continue to dashboard

Traces to: User Story 12, Acceptance Scenario 4 Category: Happy Path

Given the startup screen is showing
When the user presses Enter
Then cc-top transitions to the main dashboard view

Scenario: No Claude Code processes found

Traces to: User Story 12, Acceptance Scenario 6 Category: Edge Case

Given no Claude Code processes are running
When the startup screen renders
Then it shows "No Claude Code instances found"
And a hint: "Start a Claude Code session, then press [R] to rescan"

Feature: Graceful Shutdown

Scenario: Clean shutdown stops accepting connections

Traces to: User Story 13, Acceptance Scenario 1 Category: Happy Path

Given cc-top is running and receiving OTLP data
When the user presses q
Then cc-top stops accepting new OTLP connections

Scenario: In-flight requests drain within timeout

Traces to: User Story 13, Acceptance Scenario 2 Category: Happy Path

Given 2 OTLP requests are being processed when shutdown begins
When the 5-second drain period starts
Then both requests complete normally before cc-top exits

Scenario: Forced close after drain timeout

Traces to: User Story 13, Acceptance Scenario 3 Category: Edge Case

Given an OTLP request is hung (not completing)
When the 5-second drain period expires
Then the remaining connection is forcibly closed
And cc-top exits
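
For the HTTP receiver, the drain-then-force sequence maps onto `http.Server.Shutdown` with a deadline, followed by `Close` if the deadline passes. The sketch below covers only the HTTP side; a gRPC server would need the analogous graceful-stop-then-stop sequence from its own library. `shutdownWithDrain` and `demo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"net/http"
	"time"
)

// shutdownWithDrain stops accepting connections, waits up to drain for
// in-flight requests, and force-closes whatever is still open afterwards.
func shutdownWithDrain(srv *http.Server, drain time.Duration) (forced bool, err error) {
	ctx, cancel := context.WithTimeout(context.Background(), drain)
	defer cancel()
	err = srv.Shutdown(ctx)
	if errors.Is(err, context.DeadlineExceeded) {
		return true, srv.Close()
	}
	return false, err
}

// demo starts a throwaway server on an OS-chosen loopback port, shuts it
// down, and checks that the same address can be bound again.
func demo() error {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return err
	}
	addr := ln.Addr().String()
	srv := &http.Server{Handler: http.NotFoundHandler()}
	served := make(chan error, 1)
	go func() { served <- srv.Serve(ln) }()

	if _, err := shutdownWithDrain(srv, 5*time.Second); err != nil {
		return err
	}
	if err := <-served; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	again, err := net.Listen("tcp", addr)
	if err != nil {
		return fmt.Errorf("port not released: %w", err)
	}
	return again.Close()
}

func main() {
	fmt.Println("demo error:", demo())
}
```

Waiting for `Serve` to return before rebinding matters: the listener is only guaranteed to be closed once `Serve` has returned `http.ErrServerClosed`.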

Scenario: Ports released on exit

Traces to: User Story 13, Acceptance Scenario 4 Category: Happy Path

Given cc-top was using ports 4317 and 4318
When cc-top exits
Then a subsequent process can bind to those ports immediately

Scenario: Terminal restored on exit

Traces to: User Story 13, Acceptance Scenario 5 Category: Happy Path

Given the Bubble Tea TUI was running in the alternate screen
When cc-top exits
Then the terminal cursor is visible
And input echoing is enabled
And the alternate screen is cleared

Feature: Edge Case Handling

Scenario: OTLP data without session.id

Traces to: Edge Cases (missing session.id) Category: Edge Case

Given cc-top receives an OTLP payload with no session.id attribute
When the event processor handles it
Then the data is grouped under an "Unknown Session" bucket
And a warning is logged
And cc-top continues operating

Scenario: Counter reset produces negative delta

Traces to: Edge Cases (negative cost deltas) Category: Edge Case

Given a session's cumulative cost.usage was $5.00 on the last reading
When the next reading is $0.50 (counter reset due to Claude Code restart)
Then cc-top treats the previous value as 0
And calculates the rate from the new value ($0.50)
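
The counter-reset rule is a one-line guard on the delta calculation. A minimal sketch, with `costDelta` as a hypothetical name:

```go
package main

import "fmt"

// costDelta returns the cost incurred between two cumulative readings.
// A reading lower than the previous one is treated as a counter reset,
// so the previous value counts as 0.
func costDelta(previous, current float64) float64 {
	if current < previous {
		return current
	}
	return current - previous
}

func main() {
	fmt.Println(costDelta(5.00, 0.50)) // prints 0.5
}
```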

Scenario: Terminal resize triggers panel re-layout

Traces to: Edge Cases (terminal resize) Category: Edge Case

Given cc-top is rendering the main dashboard at 120x40 terminal size
When the user resizes the terminal to 80x24
Then all panels are laid out again to fit the new dimensions
And no data is lost or corrupted

Scenario: High-frequency events don't freeze TUI

Traces to: Edge Cases (high-frequency events) Category: Edge Case

Given 20 sessions are each producing 10 events per second (200 events/sec total)
When the TUI renders at 500ms intervals
Then events are buffered between renders
And the TUI remains responsive (render completes in < 100ms)

Test-Driven Development Plan

Test Hierarchy

Level	Scope	Purpose
Unit	State store, alert rules, command normalizer, rate calculator, settings merge logic, config parser, telemetry classifier, correlation logic	Validates core logic in isolation
Integration	OTLP receiver + state store, process scanner + correlator, settings merge + filesystem, alert engine + state store, TUI model + state	Validates components work together
E2E	Full startup → data flow → TUI render → shutdown	Validates complete workflows from user perspective

Test Implementation Order

Order	Test Name	Level	Traces to BDD Scenario	Description
1	TestStateStore_IndexMetricBySessionID	Unit	Receive metrics via gRPC	State store correctly indexes a metric by session.id
2	TestStateStore_IndexEventBySessionID	Unit	Receive events via HTTP	State store correctly indexes an event by session.id
3	TestStateStore_MissingSessID	Unit	OTLP data without session.id	Data with no session.id goes to "Unknown Session" bucket
4	TestTelemetryClassifier_Connected	Unit	Telemetry status classification	Classifies telemetry=1 + correct endpoint as "Connected"
5	TestTelemetryClassifier_WrongPort	Unit	Telemetry status classification	Classifies telemetry=1 + wrong endpoint as "Wrong port"
6	TestTelemetryClassifier_ConsoleOnly	Unit	Telemetry status classification	Classifies telemetry=1 + no endpoint as "Console only"
7	TestTelemetryClassifier_NoTelemetry	Unit	Telemetry status classification	Classifies telemetry=0 or absent as "No telemetry"
8	TestTelemetryClassifier_Unknown	Unit	Unreadable process environment	Classifies unreadable env as "Unknown"
9	TestCorrelator_PortFingerprint	Unit	Port fingerprinting correlates PID to session	Maps source port → PID → session.id
10	TestCorrelator_TimingHeuristic	Unit	Timing heuristic fallback	Matches PID to session within 10-second window
11	TestCorrelator_NoMatch	Unit	Uncorrelated OTLP session	Returns "PID: —" for unmatched sessions
12	TestCorrelator_TwoSessions	Unit	Two sessions are independently correlated	Two PIDs map to distinct sessions
13	TestSettingsMerge_AddKeys	Unit	Add OTel keys to existing settings file	Adds OTel keys, preserves existing keys
14	TestSettingsMerge_CreateFile	Unit	Create settings file when absent	Creates file with correct structure
15	TestSettingsMerge_PreserveIndent	Unit	Preserve original indentation	Detects and preserves 4-space indent
16	TestSettingsMerge_AlreadyConfigured	Unit	Already configured — no changes	No write when all keys correct
17	TestSettingsMerge_MalformedJSON	Unit	Malformed JSON in settings file	Creates backup, returns error
18	TestSettingsMerge_PermissionDenied	Unit	Permission denied writing settings	Returns clear permission error
19	TestSettingsMerge_DifferentValue_NonInteractive	Unit	Skip overwrite in non-interactive mode	Warns but does not overwrite
20	TestSettingsMerge_FixWrongPort	Unit	Fix wrong port only	Updates only endpoint key
21	TestConfigParser_Defaults	Unit	Zero-config startup	All defaults populated when no file
22	TestConfigParser_CustomPorts	Unit	Custom port configuration	grpc_port override works
23	TestConfigParser_PartialConfig	Unit	Partial config with defaults	Specified overrides, rest defaults
24	TestConfigParser_InvalidValue	Unit	Invalid config value	Validation error for grpc_port = -1
25	TestConfigParser_UnknownKey	Unit	Unknown config key ignored	Warns about unknown key
26	TestConfigParser_ModelContextLimits	Unit	Model context limits from config	Custom model limits loaded
27	TestBurnRate_TotalCost	Unit	Total session cost display	Sums cost across sessions
28	TestBurnRate_RollingHourly	Unit	Rolling hourly rate calculation	5-min average extrapolated to hourly
29	TestBurnRate_TrendDirection	Unit	Trend indicator direction	Compares current vs previous 5-min window
30	TestBurnRate_ColourThresholds	Unit	Burn rate colour thresholds	Correct colour for each range
31	TestBurnRate_CustomThresholds	Unit	Custom colour thresholds	Custom config thresholds applied
32	TestBurnRate_TokenVelocity	Unit	Token velocity display	Tokens/min from counter deltas
33	TestBurnRate_CounterReset	Unit	Counter reset produces negative delta	Handles counter reset gracefully
34	TestSessionStatus_Active	Unit	Session activity status (active)	Event within 30s = active
35	TestSessionStatus_Idle	Unit	Session activity status (idle)	30s-5min since last event = idle
36	TestSessionStatus_Done	Unit	Session activity status (done)	>5min since last event = done
37	TestEventFormat_UserPrompt	Unit	User prompt event rendering	Formats user_prompt correctly
38	TestEventFormat_ToolResultSuccess	Unit	Successful tool result rendering	Formats success tool result correctly
39	TestEventFormat_ToolResultReject	Unit	Rejected tool result rendering	Formats rejected tool result correctly
40	TestEventFormat_APIRequest	Unit	API request event rendering	Formats api_request correctly
41	TestEventFormat_APIError	Unit	API error event rendering	Formats api_error correctly
42	TestEventFormat_ToolDecision	Unit	Tool decision event rendering	Formats tool_decision correctly
43	TestEventBuffer_Eviction	Unit	Event buffer eviction	Oldest event evicted at capacity
44	TestAlertCostSurge_Fires	Unit	Cost surge alert fires	Alert fires when rate > threshold
45	TestAlertCostSurge_BelowThreshold	Unit	Alert thresholds respect custom config	No alert below custom threshold
46	TestAlertRunawayTokens_Fires	Unit	Runaway tokens alert fires	Alert fires at sustained high velocity
47	TestAlertLoopDetector_Fires	Unit	Loop detector fires on repeated bash failures	3 identical failed commands in 5min
48	TestAlertLoopDetector_Normalization	Unit	Loop detector normalizes similar commands	npm test, npm run test, and npx jest normalize to the same command
49	TestAlertErrorStorm_Fires	Unit	Error storm alert fires	>10 errors in 1 minute
50	TestAlertStaleSession_Fires	Unit	Stale session alert fires	Active >2h with no prompts
51	TestAlertContextPressure_Fires	Unit	Context pressure alert fires	input_tokens > 80% of limit
52	TestAlertContextPressure_UnknownModel	Unit	Model not in context limit map	No alert, log warning
53	TestAlertHighRejection_Fires	Unit	High rejection rate alert fires	>50% reject in 5min
54	TestStatsCalc_LinesOfCode	Unit	Lines of code display	Aggregates added/removed correctly
55	TestStatsCalc_CacheEfficiency	Unit	Cache efficiency calculation	cacheRead/(input+cacheRead) percentage
56	TestStatsCalc_ErrorRate	Unit	Error rate display	api_error/api_request percentage
57	TestStatsCalc_ToolAcceptRate	Unit	Tool acceptance rate display	accept/total by tool and language
58	TestStatsCalc_AvgLatency	Unit	Average API latency	Mean of duration_ms values
59	TestCommandNormalizer_PrefixMatch	Unit	Loop detector normalizes similar commands	Prefix match groups semantically similar commands
60	TestOTLPReceiver_GRPCMetrics	Integration	Receive metrics via gRPC	Send OTLP metrics via gRPC, verify state store
61	TestOTLPReceiver_HTTPEvents	Integration	Receive events via HTTP	Send OTLP events via HTTP, verify state store
62	TestOTLPReceiver_MalformedPayload	Integration	Malformed OTLP payload	Invalid proto returns error, receiver continues
63	TestOTLPReceiver_PortConflict	Integration	Port already in use on startup	Detects port conflict, errors
64	TestProcessScanner_DetectClaude	Integration	Detect Claude binary process	Finds claude process on macOS
65	TestProcessScanner_DetectNodeClaude	Integration	Detect Node-based Claude Code	Finds node+@anthropic-ai process
66	TestProcessScanner_NewProcess	Integration	New process appears between scans	New PID detected in next scan
67	TestProcessScanner_ExitedProcess	Integration	Process exits and remains in list	Exited PID preserved with stats
68	TestCorrelator_PortFingerprintInteg	Integration	Port fingerprinting correlates PID to session	End-to-end port→PID→session
69	TestSettingsMerge_FileSystem	Integration	Add OTel keys to existing settings	Real filesystem read/write/backup
70	TestAlertEngine_WithStateStore	Integration	Cost surge alert fires	Alert engine reads from state store
71	TestAlertNotification_OSAScript	Integration	System notification sent when enabled	osascript called with correct args
72	TestTUIModel_SessionSelection	Integration	Select session to focus panels	Model state updates on selection
73	TestTUIModel_TabToggle	Integration	Tab toggles between dashboard and stats	View state switches on Tab
74	TestKillSwitch_SIGSTOPAndKill	Integration	Freeze and kill a session	SIGSTOP then SIGKILL on real process
75	TestKillSwitch_Cancel_SIGCONT	Integration	Cancel kill resumes process	SIGCONT restores process
76	TestKillSwitch_ExitedProcess	Integration	Kill switch on already-exited process	Handles ESRCH gracefully
77	TestE2E_StartupToDataFlow	E2E	Full startup → receive data → render	Start cc-top, send data, verify TUI output
78	TestE2E_StartupScreen	E2E	Startup screen displays process table	Startup screen with mixed sessions
79	TestE2E_GracefulShutdown	E2E	Clean shutdown stops accepting connections	Press q, verify ports released and terminal restored
80	TestE2E_SessionLifecycle	E2E	Process exits and remains in list	Session from new→active→idle→exited
81	TestE2E_AlertTriggered	E2E	Cost surge alert fires	Send high-rate cost data, verify alert appears
82	TestE2E_KillSwitchFlow	E2E	Freeze and kill a session	Full Ctrl+K → confirm → kill flow

Test Datasets

Dataset: OTLP Payload Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	Valid gRPC ExportMetricsServiceRequest with session.id	Happy path	Metrics stored under session.id	BDD: Receive metrics via gRPC	Standard flow
2	Valid HTTP ExportLogsServiceRequest with session.id	Happy path	Events stored under session.id	BDD: Receive events via HTTP	Standard flow
3	Payload with empty session.id attribute	Edge case	Stored under "Unknown Session"	BDD: OTLP data without session.id	Missing identifier
4	Payload with no attributes at all	Edge case	Stored under "Unknown Session", warning logged	BDD: OTLP data without session.id	Completely bare
5	Invalid protobuf bytes (random garbage)	Error	OTLP error response, logged, no crash	BDD: Malformed OTLP payload	Corruption
6	Empty request body (0 bytes)	Boundary (empty)	OTLP error response	BDD: Malformed OTLP payload	Zero-length
7	Very large payload (10MB, 1000 metrics)	Boundary (max)	Accepted and processed	BDD: High-frequency events	Load test
8	Payload with unknown metric names	Edge case	Stored but not displayed in known panels	BDD: Receive metrics via gRPC	Future-proofing
9	Payload with session.id containing special chars `"sess/abc\n"`	Edge case	Handled, displayed with escaping	BDD: Receive metrics via gRPC	Unusual ID
10	Two rapid payloads with same session.id, different metrics	Concurrency	Both metrics stored, no overwrite	BDD: Receive metrics via gRPC	Rapid fire
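
Rows 3, 4, and 9 above can be illustrated with a minimal Go sketch. It assumes the payload's attributes have already been flattened into a plain map; `sessionKey`, `displayID`, and the map shape are hypothetical names for illustration, not the receiver library's API.

```go
package main

import "strconv"

// unknownSession is the bucket named in rows 3 and 4 of the dataset.
const unknownSession = "Unknown Session"

// sessionKey returns the state-store key for a payload.
// attrs is a hypothetical flattened view of the OTLP attributes;
// a nil map models a payload with no attributes at all (row 4).
func sessionKey(attrs map[string]string) string {
	id, ok := attrs["session.id"] // reading a nil map is safe in Go
	if !ok || id == "" {
		return unknownSession
	}
	return id
}

// displayID escapes control characters so that an ID such as
// "sess/abc\n" cannot break a TUI row (row 9). Using strconv.Quote
// is an assumption; the spec only says "displayed with escaping".
func displayID(id string) string {
	return strconv.Quote(id)
}
```

The ID is stored unmodified and escaped only at render time, so two payloads carrying the same unusual ID still land in the same session.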

Dataset: Process Scanner Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	Process named `claude`, telemetry ON, endpoint :4317	Happy path	✅ Connected	BDD: Telemetry status classification	Standard
2	Node process with `@anthropic-ai/claude-code` in argv	Alternate	Detected as Claude Code	BDD: Detect Node-based Claude Code	Node variant
3	Process named `claude`, telemetry OFF	Happy path	❌ No telemetry	BDD: Telemetry status classification	Unconfigured
4	Process named `claude`, endpoint :9090	Error	⚠️ Wrong port	BDD: Telemetry status classification	Misconfigured
5	Process named `claude`, telemetry ON, no endpoint	Error	⚠️ Console only	BDD: Telemetry status classification	Missing export
6	Zombie process (env unreadable)	Edge case	❓ Unknown	BDD: Unreadable process environment	Zombie state
7	20 simultaneous Claude Code processes	Boundary (max)	All 20 detected and listed	BDD: Detect Claude binary process	Capacity
8	0 Claude Code processes	Boundary (empty)	"No Claude Code instances found"	BDD: No Claude Code processes found	Empty scan
9	Process named `claude-helper` (not Claude Code)	Edge case	Not detected (no false positive)	BDD: Detect Claude binary process	Name similarity
10	Process with very long CWD (300+ chars)	Boundary (max)	Truncated with ellipsis and ~	BDD: Full session row rendering	Long path
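
The detection and classification rows above can be sketched as two pure functions. This is an illustration under stated assumptions: `procEnv` is a hypothetical summary of what the scanner read from a process, endpoints are full URLs, and the "Waiting" status from FR-005 is left out because this dataset does not exercise it.

```go
package main

import (
	"net/url"
	"strings"
)

// procEnv is a hypothetical summary of a scanned process environment.
type procEnv struct {
	Readable    bool   // false for zombies or otherwise unreadable processes
	TelemetryOn bool   // telemetry enabled in the process environment
	Endpoint    string // OTLP endpoint URL, empty when unset
}

// isClaudeCode matches the exact binary name or a Node process whose
// argv mentions the Claude Code package (rows 1, 2, and 9).
func isClaudeCode(name string, argv []string) bool {
	if name == "claude" {
		return true
	}
	if name != "node" {
		return false
	}
	for _, arg := range argv {
		if strings.Contains(arg, "@anthropic-ai/claude-code") {
			return true
		}
	}
	return false
}

// classify maps an environment to a telemetry status, where wantPort is
// the port cc-top listens on (rows 1 and 3 to 6).
func classify(env procEnv, wantPort string) string {
	switch {
	case !env.Readable:
		return "Unknown"
	case !env.TelemetryOn:
		return "No telemetry"
	case env.Endpoint == "":
		return "Console only"
	}
	u, err := url.Parse(env.Endpoint)
	if err != nil || u.Port() != wantPort {
		return "Wrong port"
	}
	return "Connected"
}
```

Matching the process name exactly, rather than by prefix, is what keeps `claude-helper` from being reported as a false positive.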

Dataset: Settings Merge Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	`{"env": {"MY_VAR": "x"}}` (existing, no OTel keys)	Happy path	OTel keys added, MY_VAR preserved	BDD: Add OTel keys to existing settings	Standard merge
2	File does not exist	Boundary (empty)	File created with OTel keys	BDD: Create settings file when absent	First run
3	`{"env": {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:4317"}}`	Happy path	No changes, "Already configured"	BDD: Already configured — no changes	Idempotent
4	`{"env": {"OTEL_EXPORTER_OTLP_ENDPOINT": "http://localhost:9090"}}`	Alternate	Prompt/warn about overwrite	BDD: Prompt before overwriting	Different value
5	`{invalid json`	Error	Backup created, error shown	BDD: Malformed JSON in settings	Parse failure
6	Read-only file (chmod 444)	Error	"Permission denied" message	BDD: Permission denied writing	Filesystem error
7	`{}` (empty JSON object)	Boundary (empty)	`"env"` block created with OTel keys	BDD: Add OTel keys to existing settings	No env block
8	4-space indented JSON	Edge case	Output uses 4-space indentation	BDD: Preserve original indentation	Formatting
9	2-space indented JSON	Edge case	Output uses 2-space indentation	BDD: Preserve original indentation	Default format
10	Tab-indented JSON	Edge case	Output uses tab indentation	BDD: Preserve original indentation	Tab format
11	`{"env": {}, "permissions": ["allow"]}`	Happy path	OTel keys added, permissions preserved	BDD: Add OTel keys to existing settings	Sibling keys
12	Very large settings.json (100KB, many keys)	Boundary (max)	OTel keys added, no truncation	BDD: Add OTel keys to existing settings	Large file

Dataset: Config File Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	No config file	Boundary (empty)	All defaults	BDD: Zero-config startup	First run
2	`grpc_port = 5317`	Happy path	gRPC on 5317	BDD: Custom port configuration	Port override
3	`grpc_port = 0`	Boundary (min)	Validation error	BDD: Invalid config value	Below min
4	`grpc_port = -1`	Boundary (min-1)	Validation error	BDD: Invalid config value	Negative
5	`grpc_port = 65535`	Boundary (max)	Accepted	BDD: Custom port configuration	Max port
6	`grpc_port = 65536`	Boundary (max+1)	Validation error	BDD: Invalid config value	Over max
7	`cost_surge_threshold_per_hour = 0.0`	Boundary (zero)	Accepted (always alerts)	BDD: Alert thresholds respect custom config	Zero threshold
8	`event_buffer_size = 1`	Boundary (min)	Accepted, buffer of 1	BDD: Event buffer eviction	Minimum buffer
9	`event_buffer_size = 0`	Boundary (zero)	Validation error	BDD: Invalid config value	No buffer
10	`unknown_key = "value"`	Edge case	Warning logged, key ignored	BDD: Unknown config key ignored	Unknown key
11	Malformed TOML syntax `[receiver\n grpc_port = "abc"`	Error	Parse error, clear message	BDD: Invalid config value	Syntax error
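
The numeric boundaries in rows 2 to 9 can be expressed as a small validation step that runs after parsing. The sketch below assumes a hypothetical `Config` struct holding only the fields this dataset exercises; TOML parsing itself is omitted because the spec does not name a parser library.

```go
package main

import "fmt"

// Config is a hypothetical subset of the parsed config.toml.
type Config struct {
	GRPCPort                  int
	EventBufferSize           int
	CostSurgeThresholdPerHour float64
}

// defaultConfig returns the defaults stated in the spec (port 4317,
// buffer of 1000 events, $2/hr cost surge threshold).
func defaultConfig() Config {
	return Config{GRPCPort: 4317, EventBufferSize: 1000, CostSurgeThresholdPerHour: 2.0}
}

// validate rejects out-of-range values with a message naming the key.
func validate(c Config) error {
	if c.GRPCPort < 1 || c.GRPCPort > 65535 {
		return fmt.Errorf("grpc_port must be between 1 and 65535, got %d", c.GRPCPort)
	}
	if c.EventBufferSize < 1 {
		return fmt.Errorf("event_buffer_size must be at least 1, got %d", c.EventBufferSize)
	}
	// A zero cost threshold is accepted (row 7) and simply always alerts.
	return nil
}
```

Starting from `defaultConfig()` and overlaying only the keys present in the file is one way to satisfy the "all values optional" requirement.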

Dataset: Burn Rate Calculation Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	$0.25 in last 5 minutes	Happy path	$3.00/hr	BDD: Rolling hourly rate calculation	Standard rate
2	$0.00 in last 5 minutes	Boundary (zero)	$0.00/hr, green	BDD: Burn rate colour thresholds	No cost
3	$0.041 in last 5 minutes	Boundary (near green)	$0.49/hr, green	BDD: Burn rate colour thresholds	Just under green
4	$0.042 in last 5 minutes	Boundary (at yellow)	$0.50/hr, yellow	BDD: Burn rate colour thresholds	At threshold
5	$0.167 in last 5 minutes	Boundary (at red)	$2.00/hr, red	BDD: Burn rate colour thresholds	At red
6	$10.00 in last 5 minutes	Boundary (extreme)	$120.00/hr, red	BDD: Burn rate colour thresholds	Very high
7	Previous window $0.20, current $0.30	Happy path	Up arrow	BDD: Trend indicator direction	Increasing
8	Previous window $0.30, current $0.20	Happy path	Down arrow	BDD: Trend indicator direction	Decreasing
9	Previous window $0.20, current $0.20	Edge case	No arrow (flat)	BDD: Trend indicator direction	No change
10	Less than 5 minutes of data	Edge case	Rate shown with caveat or estimated	BDD: Rolling hourly rate calculation	Insufficient data
11	Counter reset: prev $5.00, curr $0.50	Edge case	Treats as reset, rate from $0.50	BDD: Counter reset produces negative delta	Counter reset

Dataset: Alert Rule Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	$/hr = $1.99 (default threshold $2)	Boundary (max-1)	No alert	BDD: Alert thresholds respect custom config	Just below
2	$/hr = $2.00	Boundary (max)	Cost Surge alert fires	BDD: Cost surge alert fires	At threshold
3	$/hr = $2.01	Boundary (max+1)	Cost Surge alert fires	BDD: Cost surge alert fires	Just above
4	2 identical failed commands in 5 min	Boundary (max-1)	No loop alert	BDD: Loop detector fires	Below threshold
5	3 identical failed commands in 5 min	Boundary (max)	Loop Detector alert fires	BDD: Loop detector fires	At threshold
6	3 identical failed commands in 5 min 1 sec	Boundary (time)	No loop alert (outside window)	BDD: Loop detector fires	Window expired
7	10 api_errors in 1 minute	Boundary (max)	No Error Storm (at threshold, need >10)	BDD: Error storm alert fires	At boundary
8	11 api_errors in 1 minute	Boundary (max+1)	Error Storm fires	BDD: Error storm alert fires	Above threshold
9	Session active 1h59m, no prompts	Boundary (max-1)	No Stale Session alert	BDD: Stale session alert fires	Just under
10	Session active 2h0m, no prompts	Boundary (max)	Stale Session fires	BDD: Stale session alert fires	At threshold
11	input_tokens = 159,999 / 200,000 limit	Boundary (79.9%)	No Context Pressure	BDD: Context pressure alert fires	Just under 80%
12	input_tokens = 160,000 / 200,000 limit	Boundary (80%)	Context Pressure fires	BDD: Context pressure alert fires	At threshold
13	5 of 10 tool_decisions = reject	Boundary (50%)	High Rejection Rate fires	BDD: High rejection rate alert fires	At threshold
14	4 of 10 tool_decisions = reject	Boundary (max-1)	No alert (40% < 50%)	BDD: High rejection rate alert fires	Below threshold
15	0 tool_decision events in window	Boundary (zero)	No alert (no data)	BDD: High rejection rate alert fires	Empty window
16	Token velocity 49,999/min sustained	Boundary (max-1)	No Runaway Tokens	BDD: Runaway tokens alert fires	Just under
17	Token velocity 50,000/min sustained	Boundary (max)	Runaway Tokens fires	BDD: Runaway tokens alert fires	At threshold

Dataset: Command Normalization Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	`npm test`	Happy path	Normalized to test-runner group	BDD: Loop detector normalizes	npm variant
2	`npm run test`	Happy path	Same group as `npm test`	BDD: Loop detector normalizes	npm run variant
3	`npx jest`	Happy path	Same group as `npm test`	BDD: Loop detector normalizes	npx variant
4	`python -m pytest`	Happy path	Normalized to pytest group	BDD: Loop detector normalizes	Python test
5	`go test ./...`	Happy path	Normalized to go-test group	BDD: Loop detector normalizes	Go test
6	`ls -la`	Happy path	Stands alone (no normalization)	BDD: Loop detector fires	Not a test command
7	`""` (empty command)	Boundary (empty)	Ignored, no hash computed	BDD: Loop detector fires	Empty
8	Very long command (10KB)	Boundary (max)	Hashed normally	BDD: Loop detector fires	Large command

Dataset: Kill Switch Inputs

#	Input	Boundary Type	Expected Output	Traces to	Notes
1	Active session, user confirms Y	Happy path	SIGSTOP → SIGKILL, marked Exited	BDD: Freeze and kill a session	Standard kill
2	Active session, user presses n	Alternate	SIGSTOP → SIGCONT, resumes	BDD: Cancel kill resumes process	Cancel
3	Active session, user presses Esc	Alternate	SIGSTOP → SIGCONT, resumes	BDD: Cancel kill resumes process	Esc cancel
4	Exited session selected	Edge case	"Session already exited" message	BDD: Kill switch on already-exited (pre-SIGSTOP)	Already gone
5	Process exits between SIGSTOP and confirm	Edge case	ESRCH handled, marked Exited	BDD: Kill switch on already-exited process	Race condition
6	No session selected (global view)	Alternate	Session picker shown	BDD: Kill from global view shows picker	No selection

Regression Test Requirements

No regression impact — new capability. cc-top is a greenfield project with no existing codebase to protect.

Integration seams to protect from the start:

OTLP receiver → state store interface (data ingestion contract)

Process scanner → state store interface (process data contract)

State store → TUI model interface (read contract)

State store → alert engine interface (evaluation contract)

Config parser → all consumers (config contract)

Seam tests are included in the Integration test section above. These protect boundaries between components and should be run as regression tests whenever any component changes.

Functional Requirements

FR-001: System MUST accept OTLP metrics via gRPC on a configurable port (default 4317).
FR-002: System MUST accept OTLP log events via HTTP on a configurable port (default 4318).
FR-003: System MUST index all received OTLP data by session.id attribute.
FR-004: System MUST detect running Claude Code processes using macOS libproc APIs without requiring root.
FR-005: System MUST classify each Claude Code process's telemetry status as Connected, Waiting, Wrong port, Console only, No telemetry, or Unknown.
FR-006: System MUST correlate PIDs to OTLP session.ids using port fingerprinting as the primary method.
FR-007: System SHOULD fall back to a timing heuristic (10-second window) when port fingerprinting fails.
FR-008: System MUST merge OTel environment variables into ~/.claude/settings.json via --setup flag or TUI keys, preserving all unrelated settings.
FR-009: System MUST handle missing settings.json (create), malformed JSON (backup + error), and permission denied (clear error message).
FR-010: System MUST detect and preserve the original indentation style when writing back settings.json.
FR-011: System MUST display a session list showing PID, session ID, terminal, CWD, telemetry status, model, activity status, cost, tokens, and active time.
FR-012: System MUST allow session selection via keyboard (↑/↓/Enter) that focuses all other panels on the selected session.
FR-013: System MUST provide a "Global" aggregate view when no session is selected (Esc to return).
FR-014: System MUST display a burn rate odometer showing total cost, $/hour (rolling 5-minute average), trend arrow, and token velocity.
FR-015: System MUST colour the burn rate counter green/yellow/red based on configurable $/hour thresholds.
FR-016: System MUST display a real-time event stream with formatting specific to each of the 5 event types (user_prompt, tool_result, api_request, api_error, tool_decision).
FR-017: System MUST support filtering the event stream by session, event type, and success/failure.
FR-018: System MUST maintain a configurable event buffer (default 1000) with oldest-first eviction.
FR-019: System MUST evaluate all 7 alert rules (Cost Surge, Runaway Tokens, Loop Detector, Error Storm, Stale Session, Context Pressure, High Rejection Rate) against incoming data.
FR-020: System MUST display triggered alerts in a bottom panel.
FR-021: System SHOULD send macOS system notifications via osascript when alerts fire, if enabled in config.
FR-022: System MUST normalize semantically similar commands (e.g., npm test, npm run test, npx jest) via prefix matching before hashing in the loop detector.
FR-023: System MUST provide a stats dashboard (Tab toggle) showing lines of code, commits, PRs, tool acceptance rate, cache efficiency, average API latency, model breakdown, top tools, and error rate.
FR-024: System MUST provide a kill switch (Ctrl+K) that sends SIGSTOP, shows confirmation, then SIGKILL on confirm or SIGCONT on cancel.
FR-025: System MUST handle the case where the target process exits between SIGSTOP and user confirmation (ESRCH).
FR-026: System MUST load configuration from ~/.config/cc-top/config.toml with all values optional and sensible defaults.
FR-027: System MUST validate config values and display clear errors for invalid values.
FR-028: System MUST display a startup screen showing discovered processes with telemetry status and offering [E], [F], and [Enter] actions.
FR-029: System MUST perform graceful shutdown: stop accepting connections, drain in-flight requests (5-second timeout), release ports, and restore terminal state.
FR-030: System MUST handle OTLP counter resets (negative deltas) by treating the previous value as zero.
FR-031: System MUST re-layout all TUI panels correctly on terminal resize.
FR-032: System SHOULD remain responsive (render in < 100ms) with up to 200 events/second from 20 concurrent sessions.
FR-033: System MUST preserve exited sessions in the list with final aggregate statistics until cc-top exits.
FR-034: System MUST mark newly discovered processes with a "New" badge for one scan cycle.
FR-035: System MUST use platform-specific build tags (//go:build darwin) for macOS libproc code to allow future Linux implementations.
FR-036: System MAY log a one-time warning when an api_request references a model not present in the context limit configuration.
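
As an illustration of FR-018, oldest-first eviction can be implemented with a fixed-size ring buffer. The sketch below is one possible shape, not a prescribed design: `Event` and `EventBuffer` are hypothetical types, and locking is omitted, so a real implementation shared between the receiver and the TUI would need synchronization.

```go
package main

// Event is a hypothetical, trimmed-down event record.
type Event struct {
	SessionID string
	Type      string
}

// EventBuffer keeps the most recent events up to a fixed capacity.
type EventBuffer struct {
	items []Event
	start int // index of the oldest event
	size  int // number of events currently held
}

// NewEventBuffer allocates a buffer; capacity is clamped to at least 1,
// mirroring the config rule that a zero-sized buffer is invalid.
func NewEventBuffer(capacity int) *EventBuffer {
	if capacity < 1 {
		capacity = 1
	}
	return &EventBuffer{items: make([]Event, capacity)}
}

// Push appends an event, overwriting the oldest one when full.
func (b *EventBuffer) Push(e Event) {
	if b.size < len(b.items) {
		b.items[(b.start+b.size)%len(b.items)] = e
		b.size++
		return
	}
	b.items[b.start] = e
	b.start = (b.start + 1) % len(b.items)
}

// Snapshot returns the held events ordered oldest to newest.
func (b *EventBuffer) Snapshot() []Event {
	out := make([]Event, 0, b.size)
	for i := 0; i < b.size; i++ {
		out = append(out, b.items[(b.start+i)%len(b.items)])
	}
	return out
}
```

Pushing into a full buffer costs constant time and never reallocates, which matters at the 200 events/second rate mentioned in FR-032.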

Success Criteria

SC-001: cc-top starts successfully with zero configuration and binds to default ports within 2 seconds.
SC-002: All 8 OTel metrics listed in the spec are correctly received, parsed, and indexed by session.id when sent via gRPC.
SC-003: All 5 OTel events listed in the spec are correctly received, parsed, and indexed by session.id when sent via HTTP.
SC-004: Process scanner detects 100% of running Claude Code processes (both claude binary and Node-based) owned by the current user on each scan cycle.
SC-005: PID-to-session correlation correctly links at least 95% of sessions via port fingerprinting within 10 seconds of first OTLP data arrival.
SC-006: cc-top --setup produces valid JSON in ~/.claude/settings.json with all required OTel keys, and all pre-existing keys are preserved (100% preservation rate).
SC-007: Each of the 7 alert rules fires within one evaluation cycle (< 1 second) when its threshold is met, and does not fire when below threshold.
SC-008: The TUI renders at the configured refresh rate (default 500ms) with up to 20 sessions and 200 events/second without dropping frames or exceeding 100ms per render.
SC-009: Graceful shutdown completes within 6 seconds (5-second drain + 1-second cleanup) and releases all ports.
SC-010: The kill switch successfully terminates a target process with 100% reliability when the user confirms (SIGSTOP → SIGKILL), and resumes with 100% reliability when cancelled (SIGCONT).
SC-011: All burn rate calculations ($/hour, trend, token velocity) are numerically accurate to within $0.01 and 1 token/minute.
SC-012: The event stream correctly formats and displays all 5 event types with the formatting specified in the spec.
SC-013: The stats dashboard correctly calculates all 9 statistics (lines, commits, PRs, acceptance rate, cache efficiency, latency, model breakdown, top tools, error rate).

Traceability Matrix

Requirement	User Story	BDD Scenario(s)	Test Name(s)
FR-001	US-1	Receive metrics via gRPC, Receive data on custom ports	TestStateStore_IndexMetricBySessionID, TestOTLPReceiver_GRPCMetrics
FR-002	US-1	Receive events via HTTP	TestStateStore_IndexEventBySessionID, TestOTLPReceiver_HTTPEvents
FR-003	US-1	Receive metrics via gRPC, Receive events via HTTP, OTLP data without session.id	TestStateStore_IndexMetricBySessionID, TestStateStore_MissingSessID
FR-004	US-2	Detect Claude binary process, Detect Node-based Claude Code	TestProcessScanner_DetectClaude, TestProcessScanner_DetectNodeClaude
FR-005	US-2	Telemetry status classification, Unreadable process environment	TestTelemetryClassifier_Connected, _WrongPort, _ConsoleOnly, _NoTelemetry, _Unknown
FR-006	US-3	Port fingerprinting correlates PID to session, Two sessions independently correlated	TestCorrelator_PortFingerprint, TestCorrelator_TwoSessions
FR-007	US-3	Timing heuristic fallback	TestCorrelator_TimingHeuristic
FR-008	US-4	Add OTel keys to existing settings, Create settings file when absent, Fix wrong port only	TestSettingsMerge_AddKeys, _CreateFile, _FixWrongPort
FR-009	US-4	Malformed JSON in settings, Permission denied writing, Create settings file when absent	TestSettingsMerge_MalformedJSON, _PermissionDenied, _CreateFile
FR-010	US-4	Preserve original indentation	TestSettingsMerge_PreserveIndent
FR-011	US-5	Full session row rendering	TestE2E_StartupToDataFlow
FR-012	US-5	Select session to focus panels	TestTUIModel_SessionSelection
FR-013	US-5	Global aggregate view, Esc returns to global view	TestTUIModel_SessionSelection
FR-014	US-6	Total session cost display, Rolling hourly rate calculation, Trend indicator direction, Token velocity display	TestBurnRate_TotalCost, _RollingHourly, _TrendDirection, _TokenVelocity
FR-015	US-6	Burn rate colour thresholds, Custom colour thresholds	TestBurnRate_ColourThresholds, _CustomThresholds
FR-016	US-7	User prompt event rendering, Successful tool result rendering, Rejected tool result rendering, API request event rendering, API error event rendering, Tool decision event rendering	TestEventFormat_UserPrompt, _ToolResultSuccess, _ToolResultReject, _APIRequest, _APIError, _ToolDecision
FR-017	US-7	Filter events by type, Session-filtered event stream	TestTUIModel_SessionSelection
FR-018	US-7	Event buffer eviction	TestEventBuffer_Eviction
FR-019	US-8	Cost surge alert fires, Runaway tokens alert fires, Loop detector fires, Error storm alert fires, Stale session alert fires, Context pressure alert fires, High rejection rate alert fires	TestAlertCostSurge_Fires, TestAlertRunawayTokens_Fires, TestAlertLoopDetector_Fires, TestAlertErrorStorm_Fires, TestAlertStaleSession_Fires, TestAlertContextPressure_Fires, TestAlertHighRejection_Fires
FR-020	US-8	Cost surge alert fires, System notification suppressed when disabled	TestAlertEngine_WithStateStore
FR-021	US-8	System notification sent when enabled, System notification suppressed when disabled	TestAlertNotification_OSAScript
FR-022	US-8	Loop detector normalizes similar commands	TestAlertLoopDetector_Normalization, TestCommandNormalizer_PrefixMatch
FR-023	US-9	Lines of code display, Commits and PRs display, Tool acceptance rate display, Cache efficiency calculation, Average API latency, Model breakdown, Top tools ranking, Error rate display	TestStatsCalc_LinesOfCode, _CacheEfficiency, _ErrorRate, _ToolAcceptRate, _AvgLatency
FR-024	US-10	Freeze and kill a session, Cancel kill resumes process	TestKillSwitch_SIGSTOPAndKill, _Cancel_SIGCONT
FR-025	US-10	Kill switch on already-exited process	TestKillSwitch_ExitedProcess
FR-026	US-11	Zero-config startup, Custom port configuration, Partial config with defaults	TestConfigParser_Defaults, _CustomPorts, _PartialConfig
FR-027	US-11	Invalid config value	TestConfigParser_InvalidValue
FR-028	US-12	Startup screen displays process table, Enable telemetry for all, Fix misconfigured sessions, Continue to dashboard, No Claude Code processes found	TestE2E_StartupScreen
FR-029	US-13	Clean shutdown stops accepting connections, In-flight requests drain, Forced close after drain timeout, Ports released on exit, Terminal restored on exit	TestE2E_GracefulShutdown
FR-030	US-6	Counter reset produces negative delta	TestBurnRate_CounterReset
FR-031	US-5	Terminal resize re-layouts panels	TestE2E_StartupToDataFlow
FR-032	US-7	High-frequency events don't freeze TUI	TestE2E_StartupToDataFlow
FR-033	US-5	Exited process retains stats, Process exits and remains in list	TestProcessScanner_ExitedProcess, TestE2E_SessionLifecycle
FR-034	US-2	New process appears between scans	TestProcessScanner_NewProcess
FR-035	US-2	Detect Claude binary process	TestProcessScanner_DetectClaude
FR-036	US-8	Model not in context limit map	TestAlertContextPressure_UnknownModel

Assumptions

The developer runs macOS (Darwin) on arm64 or amd64 architecture.
Claude Code is installed and accessible as either a claude binary or a Node.js module (@anthropic-ai/claude-code).
The developer has sufficient permissions to read process info for their own user's processes (libproc does not require root/sudo for this).
~/.claude/ directory exists or can be created by the user.
The OTel Collector receiver library (go.opentelemetry.io/collector/receiver/otlpreceiver) is stable and supports both gRPC and HTTP OTLP transports.
Bubble Tea / Lipgloss / Bubbles are the TUI framework and provide terminal resize handling, alternate screen buffer, and cursor management.
Claude Code's OTLP payloads conform to the metric/event schema documented in the spec (as verified against official docs, February 2026).
osascript is available on macOS for system notifications.
The proc_listallpids, proc_pidinfo, proc_pidfdinfo, and sysctl(KERN_PROCARGS2) APIs are available on macOS 12+ without deprecation.
Ports 4317 and 4318 are the well-known OTLP ports and are available on the developer's machine by default.
TOML is the configuration format (not YAML or JSON) as specified.
cc-top does not persist data across runs — all state is in-memory.
No authentication or encryption is needed for the OTLP receiver (localhost-only).

Clarifications

2026-02-15

Q: Who are the primary actors? -> A: Solo developer monitoring their own Claude Code sessions on their Mac.
Q: Is everything in the spec v1? -> A: Yes, all features are in scope for v1.
Q: Performance constraints? -> A: No specific hard targets. Should handle ~20 concurrent sessions comfortably.
Q: macOS only? -> A: macOS first, but architecture should allow Linux later via build tags.
Q: Settings edge cases? -> A: Handle all three: missing file (create), malformed JSON (backup + error), permission denied (clear message).
Q: Priority/urgency? -> A: Product with soft deadline — quality matters.
Q: OTLP receiver approach? -> A: Use the official OTel Collector receiver library.
Q: Kill switch cancel behaviour? -> A: SIGSTOP → confirm → SIGKILL. Cancel sends SIGCONT to resume.
Q: Config file required? -> A: Zero-config with sensible defaults; config.toml is optional.
Q: System notifications? -> A: osascript display notification, on/off in config.toml. Keep it simple.
Q: Graceful shutdown? -> A: Yes, drain in-flight data, close listeners, restore terminal, exit cleanly.
