[BUG] [Linux] Audio samples lost during recording — tail of speech consistently truncated #709
Bug Description
When recording audio, the number of samples captured is consistently lower than the wall-clock time predicts. At 16 kHz mono, sample_count / 16000 should approximately equal the elapsed time between TranscribeAction::start and TranscribeAction::stop. In practice, roughly 1 to 4 seconds of audio are routinely lost from the end of the recording, so the transcription misses the last few words spoken before the key is released.
Evidence from logs
All times are taken from Handy's own debug log. Audio duration is samples / 16000, and lost time is wall time minus audio duration:
| Wall time (s) | Samples | Audio duration (s) | Lost (s) |
|---|---|---|---|
| 5 | 57,600 | 3.60 | 1.4 |
| 9 | 127,680 | 7.98 | 1.0 |
| 12 | 130,080 | 8.13 | 3.9 |
| 15 | 199,200 | 12.45 | 2.6 |
| 6 | 74,400 | 4.65 | 1.4 |
| 13 | 188,160 | 11.76 | 1.2 |
| 7 | 67,200 | 4.20 | 2.8 |
The loss varies but is almost always positive, and it affects the tail of the recording: the last words spoken before releasing the push-to-talk key are missing from the transcription.
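The arithmetic behind the table can be reproduced with a small sketch. The function names are hypothetical and not taken from Handy's code; only the 16 kHz rate and the sample counts come from the log above.

```rust
/// Sample rate stated in this report (16 kHz mono).
const SAMPLE_RATE_HZ: u32 = 16_000;

/// Duration of the captured audio, in seconds, for a given sample count.
fn audio_duration_secs(sample_count: u64) -> f64 {
    sample_count as f64 / SAMPLE_RATE_HZ as f64
}

/// Seconds of audio missing relative to the wall-clock recording time.
/// A positive result means fewer samples were captured than expected.
fn lost_secs(wall_time_secs: f64, sample_count: u64) -> f64 {
    wall_time_secs - audio_duration_secs(sample_count)
}
```

For the first row, `lost_secs(5.0, 57_600)` gives 5 - 3.60 = 1.40 seconds.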
Suspected cause
Audio samples that are in-flight between the last PipeWire buffer delivery and the moment TranscribeAction::stop halts the microphone stream are being discarded. The stream appears to stop immediately rather than draining remaining buffered audio.
Reducing PipeWire's default.clock.quantum from 1024 to 128 mitigates the issue (smaller buffers = less data in-flight at stop time) but does not fully resolve it.
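For scale, the time covered by one quantum is quantum / graph rate. The graph rate is not stated in this report; the figures below assume 48 kHz, which I believe is PipeWire's usual default but have not confirmed on this machine. The helper name is hypothetical.

```rust
/// Time covered by a single PipeWire quantum, in milliseconds.
/// `graph_rate_hz` is an assumption here; the report does not state it.
fn quantum_duration_ms(quantum_frames: u32, graph_rate_hz: u32) -> f64 {
    quantum_frames as f64 * 1000.0 / graph_rate_hz as f64
}
```

Under the 48 kHz assumption, a quantum of 1024 covers about 21.3 ms and a quantum of 128 about 2.7 ms, which may help when comparing a single buffer against the losses in the table.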
Expected behavior
The stop operation should drain any remaining buffered audio before finalizing the sample buffer that is passed to the transcription model.
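A minimal sketch of what "drain before finalizing" could look like, using only the Rust standard library. This is not Handy's actual recorder code: `spawn_capture` is a hypothetical stand-in for the audio callback, and `stop_and_drain` is an illustrative name. The idea is that stop signals the capture side, then keeps receiving until the capture side flushes and closes the channel (or a timeout expires) instead of dropping whatever is still queued.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::mpsc::{self, Receiver, RecvTimeoutError};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical capture thread: sends fixed-size chunks until `stop` is set,
/// then flushes one final partial chunk and closes the channel.
/// Returns the total number of samples it sent.
fn spawn_capture(
    stop: Arc<AtomicBool>,
    chunk_len: usize,
) -> (Receiver<Vec<f32>>, thread::JoinHandle<usize>) {
    let (tx, rx) = mpsc::channel();
    let handle = thread::spawn(move || {
        let mut sent = 0;
        while !stop.load(Ordering::Acquire) {
            if tx.send(vec![0.0f32; chunk_len]).is_err() {
                return sent; // receiver went away; nothing more to do
            }
            sent += chunk_len;
            thread::sleep(Duration::from_millis(1));
        }
        // Flush the audio that was still buffered when stop was requested.
        if tx.send(vec![0.0f32; chunk_len / 2]).is_ok() {
            sent += chunk_len / 2;
        }
        sent
        // `tx` is dropped here, which closes the channel.
    });
    (rx, handle)
}

/// Request stop, then keep collecting until the capture side closes the
/// channel or no chunk arrives within `timeout`.
fn stop_and_drain(
    stop: &AtomicBool,
    rx: &Receiver<Vec<f32>>,
    samples: &mut Vec<f32>,
    timeout: Duration,
) {
    stop.store(true, Ordering::Release);
    loop {
        match rx.recv_timeout(timeout) {
            Ok(chunk) => samples.extend(chunk),
            Err(RecvTimeoutError::Disconnected) | Err(RecvTimeoutError::Timeout) => break,
        }
    }
}
```

The timeout bounds how long stop can block if the capture side never closes the channel; a real fix would depend on how the audio backend reports that its last buffer has been delivered.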
System Information
App Version: v0.7.1 (handy-bin AUR)
Operating System: Arch Linux, kernel 6.18.7-arch1-1
Audio: PipeWire 1.4.9, AMD ACP6.3 DMIC (Lenovo ThinkPad T14 Gen 5)
CPU: AMD (8 cores)
Model: Parakeet v3 (also reproduced with Whisper Turbo)