fix: auto-unload model after idle timeout to reduce memory #1051
cjpais merged 2 commits into cjpais:main
I noticed that on the latest build the timeout was not working at all. Have you experienced this? At least in the case of changing models from the menubar.
think i found the bug.
so the model loads, then gets unloaded on the very next watcher cycle. it looks like the timeout "doesn't work", but really the timer was just starting from the wrong point. pushed a fix.
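To make the "timer starting from the wrong point" failure concrete, here is a minimal sketch of the idle bookkeeping. `IdleTracker`, `touch` and `should_unload` are hypothetical names for illustration, not the actual code in this PR; only `last_activity` comes from the discussion.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the manager's idle bookkeeping.
struct IdleTracker {
    last_activity: Instant,
    timeout: Duration,
}

impl IdleTracker {
    fn new(now: Instant, timeout: Duration) -> Self {
        Self { last_activity: now, timeout }
    }

    /// Record activity. The point of the fix is that a model load
    /// counts as activity too, not only a transcription.
    fn touch(&mut self, now: Instant) {
        self.last_activity = now;
    }

    /// What the watcher would ask on each cycle.
    fn should_unload(&self, now: Instant) -> bool {
        now.duration_since(self.last_activity) > self.timeout
    }
}
```

If a model is loaded long after the last transcription and `touch` is not called on load, the next watcher cycle sees a stale `last_activity` and unloads the model right away. Calling `touch` at load time restarts the countdown from the load.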
9ff999c to 417f576
0c14750 to 417f576
the changes as they are don't work for me. I don't see prints in the console showing that model unloading is happening. I am also not seeing memory being freed, nor is the UI updating to show the model is unloaded. I think the most critical paths to verify are:
I think realistically we might need a small state machine or something... or at least very clear logic for what happens when transitioning between states. It may be worth making a diagram at least, posting that here for verification, and then ideally having an LLM go implement and verify it. would be great to have it in a testable unit.
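As a starting point for that discussion, a small state machine could look roughly like the sketch below. The states, events and the `transition` function are all assumptions made up for illustration; none of them exist in the codebase as far as this thread shows.

```rust
/// Hypothetical model lifecycle states.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelState {
    Unloaded,
    Loading,
    Loaded,
    Unloading,
}

/// Hypothetical events that drive the lifecycle.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Event {
    LoadRequested,
    LoadFinished,
    LoadFailed,
    IdleTimeout,
    UnloadFinished,
}

/// Pure transition function: returns the next state, or `None` when the
/// event is not valid in the current state (for example an idle timeout
/// that fires while a load is still in progress).
fn transition(state: ModelState, event: Event) -> Option<ModelState> {
    use Event::*;
    use ModelState::*;
    match (state, event) {
        (Unloaded, LoadRequested) => Some(Loading),
        (Loading, LoadFinished) => Some(Loaded),
        (Loading, LoadFailed) => Some(Unloaded),
        (Loaded, IdleTimeout) => Some(Unloading),
        // Switching models from the menubar while one is loaded.
        (Loaded, LoadRequested) => Some(Loading),
        (Unloading, UnloadFinished) => Some(Unloaded),
        _ => None,
    }
}
```

Keeping the transition logic in a pure function like this makes it a testable unit: every (state, event) pair can be asserted without loading a real model or spawning the watcher thread.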
3a2639e to 7006902
7006902 to 838857a
tested all 10 steps - everything works now. the main issue was a pre-existing bug in script here
root cause:
fix: added an
test results (physical footprint, matches activity monitor):
full lifecycle: 46 MB -> 1.1 GB -> 660 MB -> 330 MB -> 88 MB
                (idle)   (loaded)  (unloading... OS reclaiming pages)

two back-to-back transcriptions, both auto-unloaded, watcher survived the entire session:

04:17:23 Transcription #1
04:17:35 Model idle 12s > 5s -> unloaded (44ms)
04:18:58 Transcription #2
04:19:05 Model idle 7s > 5s -> unloaded (33ms)

also upgraded unload logging from
8e845e2 to 62a3153
- Default model_unload_timeout from Never to Min5
- Fix Drop impl: use take() on watcher handle so clones from initiate_model_load() don't kill the watcher thread
- Reset last_activity on model load to prevent immediate unload
- Upgrade watcher logging from debug to info level
- Remove duplicate "unloaded" event (unload_model already emits it)
62a3153 to 6523980
if Arc::strong_count(&self.engine) > 1 {
    return;
}
Arc::strong_count docs warn it shouldn't be used for synchronization and that the count can change between the check and the shutdown_signal.store. in practice this only runs during app teardown so it's fine today, but adding this comment so future clone paths don't break this silently. alternatively, an explicit AtomicUsize refcount we control would be more robust, but YAGNI for now since i doubt we will have more clone paths
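For readers following along, here is a minimal, self-contained sketch of the guard being discussed. `Manager` and its `new` constructor are hypothetical stand-ins for `TranscriptionManager`, and the `engine` payload is a placeholder; only the `Arc::strong_count` check and `shutdown_signal` mirror the diff.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for the real manager; every clone shares the
/// same `engine` and `shutdown_signal`.
#[derive(Clone)]
struct Manager {
    engine: Arc<Mutex<Option<String>>>, // placeholder for the loaded model
    shutdown_signal: Arc<AtomicBool>,
}

impl Manager {
    fn new() -> Self {
        Self {
            engine: Arc::new(Mutex::new(None)),
            shutdown_signal: Arc::new(AtomicBool::new(false)),
        }
    }
}

impl Drop for Manager {
    fn drop(&mut self) {
        // Other clones still hold `engine`, so this is not the last owner
        // and the watcher must keep running.
        // NOTE: the count can change between this check and the store
        // below; that is tolerable only if drops happen at teardown.
        if Arc::strong_count(&self.engine) > 1 {
            return;
        }
        self.shutdown_signal.store(true, Ordering::SeqCst);
    }
}
```

Dropping a short-lived clone leaves `shutdown_signal` untouched; only the drop of the last remaining `Manager` sets it. The guard counts owners of `engine`, so an extra `Arc` clone of `engine` held elsewhere would also keep the watcher alive, which is one reason an explicit refcount would be the more robust option if more clone paths appear.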
impl Default for ModelUnloadTimeout {
    fn default() -> Self {
-       ModelUnloadTimeout::Never
+       ModelUnloadTimeout::Min5
this only affects fresh installs right? prob worth confirming existing users who never touched this setting won't suddenly get auto-unload behavior after upgrading
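One way to frame the question: whether an upgrade changes behavior depends on whether the old value was ever written to disk. The sketch below assumes a stored setting is modeled as an `Option`; `effective_timeout` is a hypothetical helper, the enum is trimmed to two variants, and how handy actually persists settings is not shown in this thread.

```rust
/// Trimmed-down version of the enum from the diff (only two variants).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ModelUnloadTimeout {
    Never,
    Min5,
}

impl Default for ModelUnloadTimeout {
    fn default() -> Self {
        ModelUnloadTimeout::Min5
    }
}

/// Hypothetical helper: `stored` is `None` when the settings file has no
/// entry for the timeout, `Some(..)` when a value was persisted.
fn effective_timeout(stored: Option<ModelUnloadTimeout>) -> ModelUnloadTimeout {
    stored.unwrap_or_default()
}
```

Under this assumption, a user whose settings file already contains `Never` keeps it, while a user with no stored entry picks up `Min5` after upgrading. Which of the two cases applies to existing installs is exactly what needs confirming.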
- Default model_unload_timeout from Never to Min5
- Fix Drop impl: use take() on watcher handle so clones from initiate_model_load() don't kill the watcher thread
- Reset last_activity on model load to prevent immediate unload
- Upgrade watcher logging from debug to info level
- Remove duplicate "unloaded" event (unload_model already emits it)

Co-authored-by: CJ Pais <[email protected]>
summary
handy holds the transcription model in memory indefinitely (default: Never unload). on a 16 GB machine, a parakeet model uses ~1 GB, even when the user hasn't transcribed in hours.

this PR adds automatic model unloading after a configurable timeout, reducing idle memory from ~1.1 GB to ~80 MB.
changes
- default Never -> Min5: model auto-unloads after 5 minutes of inactivity
- fix Drop on clones: initiate_model_load() clones TranscriptionManager, and when the clone drops, Drop::drop sets shutdown_signal = true, killing the watcher. fixed with an Arc::strong_count guard so only the last clone shuts down the watcher
- reset last_activity on model load: without this, switching models after the timeout elapsed would immediately unload the just-loaded model
- watcher logging at info!() level: was debug!(), invisible at default log level
- remove duplicate "unloaded" event: unload_model() already emits it
- match instead of if let Ok so failures are logged

test results (macOS, parakeet v2, physical footprint = activity monitor)
actual time to reach 88 MB is ~30 seconds
full lifecycle:
test plan