fix: prevent idle watcher from unloading model during recording#1085
fix: prevent idle watcher from unloading model during recording#1085cjpais merged 4 commits intocjpais:mainfrom
Conversation
the idle watcher treats ModelUnloadTimeout::Immediately as a 0s timeout, causing idle_ms > 0 to always be true. this unloads the model on the next 10s tick — even while the user is still recording. the Immediately variant is already handled correctly by maybe_unload_immediately() which fires after each transcription completes. the idle watcher should skip it entirely. fixes regression introduced in d1da935.
|
I think the fix was incomplete. I added some changes which should help. Basically there was still a race that existed, if you have it set to 5 seconds, or 2 minutes, you could feasibly have the same issue. Instead we need to see if we are recording currently. If we are, we cannot unload the model. Also to make sure we don't have another race, we reset the idle_ms time every time the watcher tick fires and we are recording. The debug setting has been changed to 15 seconds, above the 10 second tick rate of the watcher thread. This is so there is no chance of a race on that setting. Please let me know how the code looks and what you think |
|
thanks for the fixes, guard was super helpful
looks good to me, perf is great. tested locally on for this branch and ran several transcriptions including longer sentences (~20s of speech) with immediately set. all completed correctly, immediately unloading model after transcription fires each time memory still drops from ~700MB to ~84MB physical footprint after each unload. no mid-recording unloads, no errors lets 🚢 🔥 |
|
Sweet thanks for the confirmation I also had no issues when testing :) |
Before Submitting This PR
Human Written Description
when
model_unload_timeoutis set toImmediately, the idle watcher thread unloads the model mid-recording. this kills transcription for any sentence longer than ~10 seconds. bisected to d1da935.ImmediatelyreturnsSome(0)fromto_seconds(), so the idle watcher (which checks every 10s) seesidle_ms > 0, which is always true, and unloads the model on its next tick, even while the user is still speaking. the model gets pulled out from under the transcription pipeline.the
Immediatelyvariant is already correctly handled bymaybe_unload_immediately()which fires after each transcription completes. the idle watcher just needs to skip it.Related Issues/Discussions
regression introduced in d1da935 (#1051)
Testing
Immediatelysettingmaybe_unload_immediatelystill fires after transcription ("Immediately unloading model after transcription" in logs)AI Assistance
If AI was used: