fix: prevent idle watcher from unloading model during recording#1085

Merged
cjpais merged 4 commits into cjpais:main from VirenMohindra:vm/fix-model-idle-unload
Mar 19, 2026

Conversation

Contributor
@VirenMohindra VirenMohindra commented Mar 18, 2026

Before Submitting This PR

Human Written Description

when model_unload_timeout is set to Immediately, the idle watcher thread unloads the model mid-recording. this kills transcription for any sentence longer than ~10 seconds. bisected to d1da935.

Immediately returns Some(0) from to_seconds(), so the idle watcher (which checks every 10s) sees idle_ms > 0, which is always true, and unloads the model on its next tick, even while the user is still speaking. the model gets pulled out from under the transcription pipeline.

the Immediately variant is already correctly handled by maybe_unload_immediately() which fires after each transcription completes. the idle watcher just needs to skip it.
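a minimal sketch of the bug and the fix as described above (the enum, `to_seconds()`, and the watcher check are reconstructed from this PR's text, not copied from the actual Handy source; the real code differs):

```rust
// Hypothetical reconstruction based on the PR description. The names
// ModelUnloadTimeout and to_seconds() are assumptions, not the real source.
#[derive(Clone, Copy, PartialEq)]
enum ModelUnloadTimeout {
    Immediately,
    Min2,
    Min5,
}

impl ModelUnloadTimeout {
    // Immediately maps to Some(0) seconds, which is what tripped the watcher:
    // idle_ms > 0 is true on every 10s tick.
    fn to_seconds(self) -> Option<u64> {
        match self {
            ModelUnloadTimeout::Immediately => Some(0),
            ModelUnloadTimeout::Min2 => Some(2 * 60),
            ModelUnloadTimeout::Min5 => Some(5 * 60),
        }
    }
}

// The fix: the idle watcher skips Immediately entirely, since
// maybe_unload_immediately() already handles that variant after each
// transcription completes.
fn watcher_should_unload(timeout: ModelUnloadTimeout, idle_ms: u64) -> bool {
    if timeout == ModelUnloadTimeout::Immediately {
        return false;
    }
    match timeout.to_seconds() {
        Some(secs) => idle_ms > secs * 1000,
        None => false, // "never unload"
    }
}

fn main() {
    // Before the fix, Immediately => Some(0) made the check fire every tick.
    assert!(!watcher_should_unload(ModelUnloadTimeout::Immediately, 10_000));
    // Other timeout values still unload only after genuine idle time.
    assert!(!watcher_should_unload(ModelUnloadTimeout::Min2, 60_000));
    assert!(watcher_should_unload(ModelUnloadTimeout::Min2, 121_000));
}
```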

Related Issues/Discussions

regression introduced in d1da935 (#1051)

Testing

  • bisected: confirmed transcription works on 789a439 (before d1da935), fails on d1da935 (main)
  • with fix: long transcription completes successfully with Immediately setting
  • maybe_unload_immediately still fires after transcription ("Immediately unloading model after transcription" in logs)
  • verify other timeout values (Min2, Min5, etc.) still unload correctly after idle

AI Assistance

  • AI was used (please describe below)

If AI was used:

  • Tools used: claude code
  • How extensively: bisected the regression, investigated root cause, wrote the one-line fix

VirenMohindra and others added 2 commits March 17, 2026 20:06
the idle watcher treats ModelUnloadTimeout::Immediately as a 0s
timeout, causing idle_ms > 0 to always be true. this unloads the
model on the next 10s tick — even while the user is still recording.

the Immediately variant is already handled correctly by
maybe_unload_immediately() which fires after each transcription
completes. the idle watcher should skip it entirely.

fixes regression introduced in d1da935.
Owner

cjpais commented Mar 18, 2026

I think the fix was incomplete. I added some changes which should help.

Basically there was still a race: if you have it set to 5 seconds or 2 minutes, you could feasibly hit the same issue.

Instead we need to check whether we are currently recording. If we are, we cannot unload the model. Also, to make sure we don't have another race, we reset the idle_ms time every time the watcher tick fires while we are recording. The debug setting has been changed to 15 seconds, above the 10-second tick rate of the watcher thread, so there is no chance of a race on that setting either.

Please let me know how the code looks and what you think
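a rough sketch of the guard described above (field and method names are hypothetical; this is not the merged code, just an illustration of the tick-time recording check and idle reset):

```rust
// Hypothetical sketch of the follow-up guard; the real watcher state in
// Handy is not shown in this PR thread.
struct IdleWatcher {
    idle_ms: u64,
}

impl IdleWatcher {
    // Called on each watcher tick (every 10s). Returns true when the model
    // should be unloaded.
    fn tick(&mut self, is_recording: bool, tick_ms: u64, timeout_ms: u64) -> bool {
        if is_recording {
            // Never unload mid-recording, and reset the idle clock so a long
            // recording can't trip the timeout on the very next tick.
            self.idle_ms = 0;
            return false;
        }
        self.idle_ms += tick_ms;
        self.idle_ms > timeout_ms
    }
}

fn main() {
    let mut w = IdleWatcher { idle_ms: 0 };
    // A tick during a recording never unloads and keeps the idle clock at zero.
    assert!(!w.tick(true, 10_000, 15_000));
    assert_eq!(w.idle_ms, 0);
    // Once recording stops, idle time accumulates normally: with the 15s
    // debug timeout, the second idle tick (20s accumulated) unloads.
    assert!(!w.tick(false, 10_000, 15_000));
    assert!(w.tick(false, 10_000, 15_000));
}
```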

@VirenMohindra
Contributor Author

thanks for the fixes, the guard was super helpful

> Please let me know how the code looks and what you think

looks good to me, perf is great. tested locally on this branch and ran several transcriptions, including longer sentences (~20s of speech), with Immediately set. all completed correctly, and "Immediately unloading model after transcription" fires each time

memory still drops from ~700MB to ~84MB physical footprint after each unload. no mid-recording unloads, no errors

lets 🚢 🔥

Owner

cjpais commented Mar 19, 2026

Sweet, thanks for the confirmation. I also had no issues when testing :)

@cjpais cjpais merged commit 095f4ac into cjpais:main Mar 19, 2026
4 checks passed
@VirenMohindra VirenMohindra deleted the vm/fix-model-idle-unload branch March 19, 2026 05:45