Skip to content

Capture more diagnostic context in support bundle and prompt to create one on Reload Window #955

@EhabY

Description

@EhabY

Summary

Three gaps in the new VS Code-side support bundle (#916) and one UX gap around when users take the bundle. Rolling them together since they're all "make the bundle answer the question without a follow-up round trip."

1. Capture user settings

Add vscode-logs/settings.json with .inspect()-style breakdown so reviewers can tell defaults from user/workspace overrides. This is the most common follow-up question on SshResolverError.Create() triage — useLocalServer, useExecServer, connectTimeout, reconnectionGraceTime, serverShutdownTimeout (Cursor), logLevel all directly change the failure path.

{
  "remote.SSH.useLocalServer":     { "effective": true,  "globalValue": true, "defaultValue": true, ... },
  "remote.SSH.connectTimeout":     { "effective": 1800,  "globalValue": 1800, "defaultValue": 60,   ... },
  "remote.SSH.reconnectionGraceTime": { "effective": 86400, ... },
  "coder.proxyLogDirectory":       { "effective": "", ... },
  ...
}

Allowlist: all remote.SSH.* and remote.* keys (enumerable via inspect); all coder.* keys enumerable from this extension's contributes.configuration. Redact coder.headerCommand and coder.binarySource values to "<set>" / "<empty>" — may contain auth-related shell or private URLs. Paths verbatim.

2. Capture the actual Remote-SSH extension log

vscode-logs/remote-ssh/ today contains the active coder ssh --stdio proxy log (because sshMonitor.getLogFilePath() returns the proxy log path, despite the field name remoteSshLogPath). The actual Remote-SSH extension log — output_logging_*/1-Remote - SSH.log, which carries resolveAuthority, [reconnect] connecting to Managed(N), Unknown reconnection token (never seen), Time limit reached, permanent error … give up now — is not in the bundle. That's the file we need most for Q14-style diagnosis.

Resolve it via the same path-finding the monitor already does (findRemoteSshLogPath at sshProcess.ts:492) and include the file at vscode-logs/remote-ssh/1-Remote - SSH.log. Keep the current proxy log as a separate entry (don't lose it — just stop overloading the folder name). Suggested rename: current → vscode-logs/proxy/active.log (symlink-style duplicate of the file already in proxy/), new → vscode-logs/remote-ssh/1-Remote - SSH.log.

3. Capture cross-session extension logs

getCodeLogDir() returns the current session's per-extension log directory only. Rotated siblings sit at <logsRoot>/<sessionTimestamp>/window<N>/exthost/coder.coder-remote/Coder.log across every prior VS Code session. Cross-session debugging — exactly what Q14 needs to correlate "session at 12:11 worked" → "session at 13:11 failed with token mismatch" — has only the current session's Coder.log to go on.

Walk dirname(codeLogPath)'s parent (the logs/ root) for the last N days (the existing LOG_MAX_AGE_MS = 3 days filter applies cleanly) and include every */window*/exthost/coder.coder-remote/*.log under vscode-logs/extension/<sessionTimestamp>/window<N>/. Same mtime filter as collectDirFiles. Same for Remote-SSH's output_logging_* directories — bundle every recent one, not just the active session's.

4. Prompt to create a bundle on Reload Window

Right now the user-facing failure is a VS Code-native "Reload Window" dialog, and by the time they think to capture diagnostics, the failing session's volatile state is often already gone (PIDs, sockets, the ephemeral coder ssh zombie). Detect failure conditions proactively and offer the bundle in a notification before the user reloads.

Signals available without new APIs:

  • SshProcessMonitor already polls 1-Remote - SSH.log (sshProcess.ts:407). Extend the tail to grep for failure markers: "permanent error", "will give up now", "Unknown reconnection token", "socketFactory.connect() failed or timed out", "Time limit reached", "UnparsableOutput". Any match → fire a new _onConnectionFailed event.
  • _onPidChange.fire(undefined) already covers process-lost; pair it with a time-since-last-pid-recovery threshold (>30 s) to avoid noisy fires during normal reconnect cycles.

UX:

vscode.window.showWarningMessage(
  "Coder: remote session failed and won't auto-reconnect. Capture diagnostics now?",
  "Create support bundle", "Dismiss"
)

Clicking "Create support bundle" runs the existing coder.supportBundle command. Suggested behavior: include a "remember dismiss for this session" toggle so users who reload through many failures aren't pestered.

Acceptance criteria

  • vscode-logs/settings.json with allowlist + redaction.
  • vscode-logs/remote-ssh/1-Remote - SSH.log is the actual extension log; proxy log no longer collides with that path.
  • vscode-logs/extension/<session>/window<N>/Coder.log captures rotated logs ≤3 days old; Remote-SSH output_logging_* siblings same.
  • Failure markers in 1-Remote - SSH.log trigger a notification offering coder.supportBundle.
  • Tests: redaction list; multi-session log walking with mtime filter; failure-marker matcher per known string.

Related

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions