
bug(audio): cpal DeviceNotAvailable mid-stream causes channel-count flip-flop #175

@InstaZDLL

Symptom

Mid-stream, cpal's stream error callback reported DeviceNotAvailable without any user-driven device switch. The engine reopened the output, and the new stream came up with a different sample rate and channel count than the original, then flipped back a second later, interrupting the same track three times in ~14 seconds:

20:26:30  cpal output stream opened sample_rate=44100 channels=8 sample_format=F32
…
20:29:01  WARN cpal stream error err=DeviceNotAvailable
20:29:13  cpal output stream opened sample_rate=48000 channels=2 sample_format=F32   ← different SR + channels
20:29:13  play_track interrupted finished_track_id=215 listened_ms=113557
20:29:13  decoder picked up new ring producer
20:29:14  cpal output stream opened sample_rate=44100 channels=8 sample_format=F32   ← flipped back
20:29:14  play_track interrupted finished_track_id=215 listened_ms=280               ← interrupted again
20:29:14  decoder picked up new ring producer

The user reported no manual device change; the Windows audio session appears to have reset on its own.

Root cause hypothesis

Three layers:

  1. Trigger: Windows can issue a session-invalidation event when a default device changes / a driver restarts / a Bluetooth source flaps. cpal surfaces this as DeviceNotAvailable on the stream's error callback.

  2. Response: audio/output.rs handles the error by rebuilding the stream. The rebuild path re-queries the host default device, but neither the user-pinned device name (if any) nor the previous channel layout is preserved across the rebuild. When the rebuild lands on a different default than before, the channel count and sample rate change unexpectedly.

  3. Second flip: about a second later (20:29:13 → 20:29:14 in the log above), something rebuilds again. Possibly the engine's own set_output_device path fires because the persisted audio.output_device setting points at the original device, which has come back online. That second LoadAndPlay interrupts the track barely 280 ms in.

Fix sketch

In the cpal error handler in audio/output.rs:

  1. Capture the original (device_name_pinned, sample_rate, channels) triple at first stream open
  2. On DeviceNotAvailable:
    • Wait a short backoff (250-500 ms) to give the OS time to settle
    • Attempt to reopen the same pinned device first; only fall back to "OS default" after N retries
    • Compare the new format triple to the original; if it changed, log a WARN (so the user sees the layout flip) but proceed
  3. Debounce concurrent rebuild requests so two consecutive DeviceNotAvailable events within ~1 s don't fire two LoadAndPlay interruptions
  4. Don't credit the listened_ms as a real play_event when the interruption is engine-internal (the 280 ms second flip showed up as a partial scrobble candidate)
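
Steps 1 to 3 could be sketched roughly as below. This is a minimal, std-only illustration and not the current code in audio/output.rs: the names StreamFormat, ReopenTarget, reopen_target, format_change and RebuildDebouncer are all hypothetical, and the actual cpal device lookup and stream build are deliberately left out. It assumes the retry count and debounce window are plain tunables.

```rust
use std::time::{Duration, Instant};

/// Format triple captured at first stream open (step 1). Hypothetical type.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct StreamFormat {
    /// `None` means no device is pinned and the OS default is followed.
    pub device_name: Option<String>,
    pub sample_rate: u32,
    pub channels: u16,
}

/// Which device the next reopen attempt should target.
#[derive(Debug, PartialEq, Eq)]
pub enum ReopenTarget {
    Pinned(String),
    OsDefault,
}

/// Step 2: retry the pinned device for the first `max_pinned_retries`
/// attempts (0-based), then fall back to the OS default.
pub fn reopen_target(
    original: &StreamFormat,
    attempt: u32,
    max_pinned_retries: u32,
) -> ReopenTarget {
    match &original.device_name {
        Some(name) if attempt < max_pinned_retries => ReopenTarget::Pinned(name.clone()),
        _ => ReopenTarget::OsDefault,
    }
}

/// Step 2: describe a sample-rate or channel-count change for a WARN log.
/// Returns `None` when the reopened stream matches the original format.
pub fn format_change(original: &StreamFormat, reopened: &StreamFormat) -> Option<String> {
    if original.sample_rate == reopened.sample_rate && original.channels == reopened.channels {
        return None;
    }
    Some(format!(
        "output format changed: {} Hz / {} ch -> {} Hz / {} ch",
        original.sample_rate, original.channels, reopened.sample_rate, reopened.channels
    ))
}

/// Step 3: collapse rebuild requests that arrive within `window` of the
/// last accepted one, so back-to-back errors trigger a single rebuild.
pub struct RebuildDebouncer {
    window: Duration,
    last_accepted: Option<Instant>,
}

impl RebuildDebouncer {
    pub fn new(window: Duration) -> Self {
        Self { window, last_accepted: None }
    }

    /// Returns true if a rebuild requested at `now` should proceed.
    pub fn should_rebuild(&mut self, now: Instant) -> bool {
        match self.last_accepted {
            Some(prev) if now.saturating_duration_since(prev) < self.window => false,
            _ => {
                self.last_accepted = Some(now);
                true
            }
        }
    }
}
```

Passing `now` in explicitly keeps the debouncer testable without sleeping. The log timestamps above only have one-second resolution, so the real gap between the two rebuilds is unknown; the ~1 s window would need tuning against finer-grained logs.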

Acceptance criteria

  • Simulated DeviceNotAvailable (unplug + replug a USB DAC) restores playback on the same device with the same format within ~1 s, no track interruption credited
  • If the OS reset truly changes the default device, the engine settles on one stream (no flip-flop) within ~2 s
  • No double play_event insert for a single user-driven listening session
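
One way the play_event criterion might be met is to tag each interruption with a reason and only emit an event for non-engine reasons. The sketch below is an assumption about how that could look, not the existing scrobble code: InterruptReason, ListenSession and on_interrupted are hypothetical names, and carrying listened time across engine rebuilds is a design choice this issue does not prescribe.

```rust
/// Why playback of a track stopped. Variant names are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum InterruptReason {
    /// Skip, stop, or a manual device switch.
    UserAction,
    /// The track played to its end.
    TrackFinished,
    /// The output stream was reopened after DeviceNotAvailable.
    EngineRebuild,
}

/// Accumulates listened time for one track across engine-internal rebuilds.
#[derive(Debug)]
pub struct ListenSession {
    pub track_id: u64,
    pub listened_ms: u64,
}

impl ListenSession {
    pub fn new(track_id: u64) -> Self {
        Self { track_id, listened_ms: 0 }
    }

    /// Adds the just-finished segment to the running total.
    /// Returns `Some(total_ms)` when a play_event should be recorded,
    /// or `None` when the interruption was engine-internal.
    pub fn on_interrupted(&mut self, segment_ms: u64, reason: InterruptReason) -> Option<u64> {
        self.listened_ms += segment_ms;
        match reason {
            InterruptReason::EngineRebuild => None,
            // Emit the accumulated total and reset for the next session.
            _ => Some(std::mem::take(&mut self.listened_ms)),
        }
    }
}
```

With the values from the log, the 113557 ms and 280 ms segments for track 215 would both return `None`, and a single event would be produced only when the user or end-of-track actually ends the session.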

Notes

Phase 1.a did not touch the audio engine. Pre-existing on main, surfaced through diagnostic logs from a user reporting #TBD seek pop. Both issues should probably be fixed in the same audio-engine pass.

Labels

bug, rust, scope: backend
