fix: transcription lock-up race condition & add small debounce #824
Merged
cjpais merged 7 commits into cjpais:main on Feb 16, 2026
Author (Contributor) commented:
Note: refactored to use an actor, which is more robust against future race conditions and much easier to understand conceptually. The original state machine commit is still here.
Owner commented:
From a quick once-over, this seems like probably the best change. I'll take a further look.
Edit: I tested this on macOS and it seems to work great for me.
mceachen added a commit to mceachen/Handy that referenced this pull request on Feb 20, 2026:
Resolve merge conflicts with main's race condition fix (cjpais#824) and structured outputs (cjpais#706), keeping both FinishGuard/TranscriptionCoordinator and the new streaming VAD modes.
RohanMuppa pushed a commit to RohanMuppa/Handy that referenced this pull request on Feb 21, 2026:
…s#824)
* fix: add TranscriptionState to prevent race conditions & add debounce
* refactor(transcription): replace state machine with actor coordinator
* some cleanup
* format
* minor cleanup
* format
Co-authored-by: CJ Pais <[email protected]>
Before Submitting This PR
Please confirm you have done the following:
Human Written Description
Part 1 - Race condition fix
This PR started out using an atomic state machine, but that still felt pretty brittle. I realized the correct approach is a "coordinator" (aka the actor pattern): it receives simple "notifications" from the rest of the system and handles the incoming commands sequentially, which naturally brokers access to the transcription state.
Note: this fix doesn't rely on the 30ms debounce; the debounce is purely a UX improvement, as described below.
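To make the coordinator idea concrete, here is a minimal sketch using only the Rust standard library. It is not the code from this PR: the `Command` and `Stage` names, the transition table, and `run_demo` are all hypothetical, chosen only to show how a single thread that owns the state can process notifications one at a time.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

// Hypothetical notifications; the PR's real message set may differ.
#[derive(Debug)]
enum Command {
    Press,            // shortcut pressed or signal received
    Release,          // shortcut released
    PipelineFinished, // transcription pipeline reported completion
}

// Hypothetical stages of a single transcription.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Stage {
    Idle,
    Recording,
    Processing,
}

/// Pure transition function: the next stage, or `None` to ignore the command.
fn step(stage: Stage, cmd: &Command) -> Option<Stage> {
    match (stage, cmd) {
        (Stage::Idle, Command::Press) => Some(Stage::Recording),
        (Stage::Recording, Command::Release) => Some(Stage::Processing),
        (Stage::Processing, Command::PipelineFinished) => Some(Stage::Idle),
        _ => None, // e.g. a second Press while recording or processing
    }
}

/// Spawns the coordinator thread. It is the only owner of `stage`,
/// so no lock or atomic is needed around the state itself.
fn spawn_coordinator() -> (Sender<Command>, JoinHandle<Vec<Stage>>) {
    let (tx, rx) = mpsc::channel::<Command>();
    let handle = thread::spawn(move || {
        let mut stage = Stage::Idle;
        let mut history = vec![stage];
        // The loop ends once every sender has been dropped.
        for cmd in rx {
            if let Some(next) = step(stage, &cmd) {
                stage = next;
                history.push(stage);
            }
        }
        history
    });
    (tx, handle)
}

/// Sends a burst of commands, including duplicates, and returns the stages visited.
fn run_demo() -> Vec<Stage> {
    let (tx, handle) = spawn_coordinator();
    let burst = [
        Command::Press,
        Command::Press, // ignored: already recording
        Command::Release,
        Command::Press, // ignored: still processing
        Command::PipelineFinished,
        Command::Press,
    ];
    for cmd in burst {
        tx.send(cmd).unwrap();
    }
    drop(tx);
    handle.join().unwrap()
}
```

Because every command passes through one channel and one loop, a duplicate press cannot start a second pipeline: it reaches the coordinator while the stage is `Recording` or `Processing` and is simply dropped.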
Other attempts to fix this used only an AtomicBool. That looks good on the surface but doesn't truly fix the race condition, because it either 1) still causes a lock-up in some edge cases (a conservative lock) or 2) lets multiple pipelines run at the same time, potentially pasting interleaved text. If you only guard the keypress, multiple push-to-talk keypresses can accidentally queue up multiple transcriptions.
Basically, a whole class of edge-case race conditions gets categorically eliminated by using an actor here :)
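As an illustration of the first failure mode, the sketch below shows one way a flag-based guard can lock up: an early return that skips the reset. This is a hypothetical reconstruction, not the code from the earlier attempts or from the linked issues; `try_begin`, `run_manual`, `ReleaseOnDrop`, and `run_guarded` are invented names.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Tries to claim the pipeline; only the caller that flips false -> true wins.
fn try_begin(busy: &AtomicBool) -> bool {
    busy.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
        .is_ok()
}

/// Hypothetical pipeline with a manual release. The early return skips the reset.
fn run_manual(busy: &AtomicBool, fail: bool) -> Result<(), &'static str> {
    if !try_begin(busy) {
        return Err("busy");
    }
    if fail {
        // The flag stays true, so every later call reports "busy".
        return Err("transcription failed");
    }
    busy.store(false, Ordering::Release);
    Ok(())
}

/// Clears the flag when dropped, so every exit path releases it.
struct ReleaseOnDrop<'a>(&'a AtomicBool);

impl Drop for ReleaseOnDrop<'_> {
    fn drop(&mut self) {
        self.0.store(false, Ordering::Release);
    }
}

/// Same pipeline, but the release is tied to scope exit instead of a manual store.
fn run_guarded(busy: &AtomicBool, fail: bool) -> Result<(), &'static str> {
    if !try_begin(busy) {
        return Err("busy");
    }
    let _guard = ReleaseOnDrop(busy);
    if fail {
        return Err("transcription failed");
    }
    Ok(())
}
```

A drop guard closes this particular hole, but it only protects one flag. It does not order press, release, and completion events relative to each other, which is the part the coordinator handles.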
Part 2 - Debounce for UX
Second, this PR adds a 30ms debounce, which helps avoid sound-effect spam and needless repeated processing (but again, this is NOT necessary for the correctness guarantee).
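A debounce of this kind can be sketched as follows. This is an assumption about the general shape, not the PR's implementation: the `Debouncer` type is hypothetical, and it uses a leading-edge policy (accept the first event, drop anything that arrives within the window after it), which may differ from what the PR does. Only the 30ms figure comes from the description above.

```rust
use std::time::{Duration, Instant};

// The 30ms window from the PR description.
const DEBOUNCE: Duration = Duration::from_millis(30);

/// Hypothetical leading-edge debouncer.
struct Debouncer {
    window: Duration,
    last_accepted: Option<Instant>,
}

impl Debouncer {
    fn new(window: Duration) -> Self {
        Debouncer {
            window,
            last_accepted: None,
        }
    }

    /// Returns true if the event at `now` should be handled.
    /// Taking `now` as a parameter keeps the logic testable without sleeping.
    fn accept(&mut self, now: Instant) -> bool {
        match self.last_accepted {
            // Too soon after the last accepted event: drop it.
            Some(prev) if now.saturating_duration_since(prev) < self.window => false,
            _ => {
                self.last_accepted = Some(now);
                true
            }
        }
    }
}
```

In real use the caller would pass `Instant::now()` for each incoming trigger and skip the sound effect and processing whenever `accept` returns false.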
Related Issues/Discussions
Fixes #64
Fixes #707
Fixes #462 (closed, but not truly fixed)
Related attempts:
Previous related PR which attempted to mitigate this but caused the lock up accidentally:
Community Feedback
Issues linked above.
Testing
Tested on Linux + Wayland using SIGUSR2.
I have not tested this on macOS or with push-to-talk keybinds in the app. If someone else is able to test in those environments, that'd be great; otherwise we can merge and see how it works out. I don't think this would cause issues.
Screenshots/Videos (if applicable)
AI Assistance
If AI was used: