feat(audio): lazy stream close for bluetooth mic latency #747

Merged
cjpais merged 8 commits into cjpais:main from VirenMohindra:vm/lazy-stream-close
Mar 19, 2026

Conversation

@VirenMohindra
Contributor

@VirenMohindra VirenMohindra commented Feb 9, 2026

Summary

  • adds opt-in lazy_stream_close setting (off by default, under experimental) that keeps the mic stream open for 30s after recording stops, reducing latency for back-to-back transcriptions
  • when disabled (default), mic closes immediately after recording, preserving current behavior
  • uses a generation counter to cancel pending lazy closes when a new recording starts or mode switches to always-on
  • includes toggle in experimental settings section with translations for all 17 languages
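
The generation-counter cancellation described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `MicStream`, `StreamState`, and the thread-based timer are hypothetical stand-ins, and only the names `close_generation` and `schedule_lazy_close` come from the commit messages in this thread.

```rust
use std::sync::{Arc, Mutex};
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Hypothetical stand-in for the recorder's stream state.
#[derive(Default)]
struct StreamState {
    open: bool,
    /// Bumped on every event that should invalidate a pending close.
    close_generation: u64,
}

/// Hypothetical handle around the shared state (not the project's type).
#[derive(Clone, Default)]
pub struct MicStream {
    state: Arc<Mutex<StreamState>>,
}

impl MicStream {
    /// Opens the stream and cancels any pending lazy close by bumping
    /// the generation.
    pub fn start_recording(&self) {
        let mut s = self.state.lock().unwrap();
        s.close_generation += 1;
        s.open = true;
    }

    /// Leaves the stream open and closes it after `delay`, unless the
    /// generation has changed in the meantime.
    pub fn schedule_lazy_close(&self, delay: Duration) -> JoinHandle<()> {
        // Remember which generation this timer belongs to.
        let scheduled = {
            let mut s = self.state.lock().unwrap();
            s.close_generation += 1;
            s.close_generation
        };
        let state = Arc::clone(&self.state);
        thread::spawn(move || {
            thread::sleep(delay);
            let mut s = state.lock().unwrap();
            // A newer recording or mode switch bumped the counter:
            // this timer is stale and must not close the stream.
            if s.close_generation == scheduled {
                s.open = false;
            }
        })
    }

    pub fn is_open(&self) -> bool {
        self.state.lock().unwrap().open
    }
}
```

The check and the close happen under the same lock, so a recording that starts while the timer is waking up either bumps the counter first (the timer does nothing) or reopens the stream afterwards.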

Context

per discussion in PR comments: the original unconditional 30s lazy close degraded bluetooth audio quality (macOS forces HFP/SCO profile while input stream is active). making it opt-in under experimental lets power users who transcribe rapidly enable it while keeping the safe default for everyone else.

Test plan

  • enable experimental features → verify "Keep Mic Open Between Transcriptions" toggle appears
  • toggle on → record → stop → verify mic stays open (check system audio indicator)
  • record again within 30s → verify no reconnection delay
  • wait >30s → verify mic closes automatically
  • toggle off → record → stop → verify mic closes immediately
  • with bluetooth mic: toggle on → verify audio quality degrades during 30s window (expected)
  • with bluetooth mic: toggle off → verify no audio quality impact

AI Assistance

  • AI was used (please describe below)

If AI was used:

  • Tools used: claude code
  • How extensively: reworked from unconditional lazy close to opt-in experimental setting, added setting/command/toggle/i18n across all languages

@VirenMohindra VirenMohindra marked this pull request as ready for review February 9, 2026 04:49
@cjpais
Owner

cjpais commented Feb 10, 2026

Does this cause audio issues with other applications? That's my main concern. We're holding a resource that we're not actively using, and I suspect it could conflict with other applications that try to acquire it, for example during video calls. I know audio on macOS is pretty weird. We might ultimately need to write a better set of drivers for macOS to solve this at the root, rather than patching the application.

DylanBricar added a commit to DylanBricar/Phonara that referenced this pull request Mar 11, 2026
…LLM env URL

- PR cjpais#477: Graceful error handling in audio recorder worker thread instead of
  panicking when no microphone config is available or stream fails to build
- PR cjpais#747: Lazy stream close for bluetooth mic latency - keeps mic stream open
  for 30s after recording stops in OnDemand mode, eliminating BT activation delay
- PR cjpais#633: Support PHONARA_CUSTOM_LLM_BASE_URL env var to override LLM base URL
  at runtime (useful for local LLM services on dynamic ports)
- PR cjpais#872: Bump macOS minimum system version from 10.13 to 10.15
- PR cjpais#976: Add tests confirming long repeating word stutter collapse works

Co-Authored-By: Claude Opus 4.6 <[email protected]>
@VirenMohindra
Contributor Author

but it's more nuanced than just holding a resource because during the 30s idle window, the mic isn't just sitting there dormant. the cpal AudioUnit callback keeps firing every ~30ms, the hardware is actively capturing audio, and run_consumer receives every chunk and it just discards them since recording = false. so there's real CPU / power cost even when idle
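
to make the idle cost concrete, here is a rough sketch of a consumer loop that keeps receiving chunks and drops them while `recording` is false. it is an assumption-laden illustration: `run_consumer_sketch`, `ConsumerStats`, and `idle_wakeups` are hypothetical names, and the real `run_consumer` may be structured differently.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::Receiver;
use std::sync::Arc;

/// Counts what the consumer loop did with the chunks it received.
#[derive(Debug, Default, PartialEq)]
pub struct ConsumerStats {
    pub kept_samples: usize,
    pub discarded_chunks: usize,
}

/// Hypothetical consumer loop: drains audio chunks until the sender
/// hangs up. Every chunk wakes the loop, even if it is thrown away.
pub fn run_consumer_sketch(
    rx: Receiver<Vec<f32>>,
    recording: Arc<AtomicBool>,
) -> ConsumerStats {
    let mut stats = ConsumerStats::default();
    for chunk in rx {
        if recording.load(Ordering::Relaxed) {
            stats.kept_samples += chunk.len();
        } else {
            // Idle window: the hardware captured this audio for nothing.
            stats.discarded_chunks += 1;
        }
    }
    stats
}

/// Rough number of callback wake-ups during an idle window.
pub fn idle_wakeups(window_ms: u64, callback_interval_ms: u64) -> u64 {
    window_ms / callback_interval_ms
}
```

with a callback roughly every 30ms, a 30s idle window works out to about 1000 wake-ups (`idle_wakeups(30_000, 30)`).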

for built-in and wired mics, this is mostly fine. CoreAudio shares input access between processes, so zoom / facetime won't conflict. the only downside is unnecessary wake-ups

for bluetooth though (airpods, etc), it's a bigger deal. an active input stream forces macOS into the HFP / SCO profile, which is low quality 16kHz mono for BOTH input and output on that device. so for the full 30 seconds, the user's music or podcast audio quality tanks even though handy isn't actually recording, which isn't ideal

so the tradeoff is 30 seconds of active mic capture + degraded bluetooth audio vs ~1-2 seconds of bluetooth reconnection latency on back-to-back transcriptions. worth it for power users who transcribe rapidly, not worth it for everyone else

some ideas to make this smarter:

  1. detect bluetooth via kAudioDevicePropertyTransportType and use a much shorter timeout (5s) vs wired (30s)
  2. make it opt-in: off by default, users who hit the bluetooth latency issue can enable it
  3. just use a shorter default across the board (10s vs 30s)
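
a rough sketch of how options 1 and 2 could combine, assuming the transport type has already been read from the device (for example via `kAudioDevicePropertyTransportType`). the `Transport` enum and `lazy_close_timeout` are hypothetical names for illustration, and treating an unknown transport like bluetooth is my assumption, not something decided in this thread.

```rust
use std::time::Duration;

/// Hypothetical, simplified view of an input device's transport.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Transport {
    BuiltIn,
    Usb,
    Bluetooth,
    Unknown,
}

/// Returns how long to keep the stream open after recording stops,
/// or `None` to close immediately.
pub fn lazy_close_timeout(transport: Transport, opted_in: bool) -> Option<Duration> {
    // Option 2: off by default, immediate close unless the user opts in.
    if !opted_in {
        return None;
    }
    // Option 1: shorter window where an open input stream is costly.
    Some(match transport {
        Transport::Bluetooth | Transport::Unknown => Duration::from_secs(5),
        Transport::BuiltIn | Transport::Usb => Duration::from_secs(30),
    })
}
```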

i'm leaning toward option 2 since the current behavior (immediate close) is the safe default for all hardware. what do you think

@cjpais
Owner

cjpais commented Mar 16, 2026

2 is okay. I think we can maybe do this in experimental for now.

I'm also wondering if we should, to potentially fix other issues, just use the device's default sample rate and always downsample. Our pipeline handles this well already, I believe, and it may improve things more broadly.

This change could be done in another pr
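
As a rough illustration of the "capture at the device rate, then downsample" idea, here is a naive linear-interpolation resampler. It is only a sketch: `resample_linear` is a hypothetical helper, not the project's pipeline, and a real implementation would low-pass filter before downsampling to avoid aliasing.

```rust
/// Naive linear-interpolation resampler for mono f32 audio.
/// Hypothetical helper: no anti-aliasing filter is applied.
pub fn resample_linear(input: &[f32], in_rate: u32, out_rate: u32) -> Vec<f32> {
    if input.is_empty() || in_rate == 0 || out_rate == 0 {
        return Vec::new();
    }
    let last = input.len() - 1;
    // Output length scales with the ratio of the two sample rates.
    let out_len = (input.len() as u64 * out_rate as u64 / in_rate as u64) as usize;
    // Distance in input samples between consecutive output samples.
    let step = in_rate as f64 / out_rate as f64;
    (0..out_len)
        .map(|i| {
            let pos = i as f64 * step;
            let idx = (pos.floor() as usize).min(last);
            let frac = (pos - idx as f64) as f32;
            let a = input[idx];
            let b = input[(idx + 1).min(last)];
            // Interpolate between the two neighbouring input samples.
            a + (b - a) * frac
        })
        .collect()
}
```

For example, 480 samples captured at 48 kHz become 160 samples at 16 kHz.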

keep the microphone stream open for 30s after recording stops to
reduce latency on back-to-back transcriptions. gated behind an
experimental setting (off by default) since it keeps the mic actively
capturing while idle — degrading bluetooth audio quality on macOS.

adds lazy_stream_close setting, tauri command, toggle in experimental
settings section, and i18n strings.

cancel_recording was unconditionally calling schedule_lazy_close,
keeping the mic open for 30s even when the setting was disabled.
also bump close_generation on AlwaysOn→OnDemand switch to cancel
any stale lazy close timers.
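
the fix described above amounts to routing both the stop and cancel paths through the same setting check. a minimal sketch, with hypothetical names (`MicMode`, `CloseAction`, `Settings`, `action_after_recording`); only `lazy_stream_close` and the AlwaysOn/OnDemand modes are taken from this PR.

```rust
/// Microphone modes mentioned in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MicMode {
    AlwaysOn,
    OnDemand,
}

/// What to do with the stream once a recording stops or is cancelled.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CloseAction {
    KeepOpen,
    CloseNow,
    CloseAfterDelay,
}

/// Hypothetical settings struct; `derive(Default)` makes the flag
/// false, matching the "off by default" behaviour.
#[derive(Debug, Clone, Copy, Default)]
pub struct Settings {
    pub lazy_stream_close: bool,
}

/// Shared decision for both the stop and cancel paths, so neither can
/// schedule a lazy close while the setting is disabled.
pub fn action_after_recording(mode: MicMode, settings: Settings) -> CloseAction {
    match (mode, settings.lazy_stream_close) {
        (MicMode::AlwaysOn, _) => CloseAction::KeepOpen,
        (MicMode::OnDemand, true) => CloseAction::CloseAfterDelay,
        (MicMode::OnDemand, false) => CloseAction::CloseNow,
    }
}
```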
@VirenMohindra
Contributor Author

VirenMohindra commented Mar 17, 2026

> 2 is okay. I think we can maybe do this in experimental for now.

sounds good, reworked this to be opt-in under experimental (off by default). added a "keep mic open between transcriptions" toggle that only shows when experimental features are enabled. default behavior (immediate close) is unchanged

> I'm also wondering if we should, to potentially fix other issues, just use the device's default sample rate and always downsample. Our pipeline handles this well already, I believe, and it may improve things more broadly.
>
> This change could be done in another pr

for the default sample rate idea, i opened a separate PR: #1084

cjpais and others added 4 commits March 19, 2026 13:39
@cjpais cjpais merged commit cb32d35 into cjpais:main Mar 19, 2026
4 checks passed
@VirenMohindra VirenMohindra deleted the vm/lazy-stream-close branch March 19, 2026 06:19
2 participants