Memory increase after v0.3.3 #582
Closed
Description
Log images: v0.3.3 and v0.3.2, each with gemma-4-26B-A4B-it-mlx-mxfp8, gemma-4-e4b-it-4bit, and gemma-3-4b-it-qat-4bit.
| Model | Memory (v0.3.2) | Memory (v0.3.3) | Increase |
|---|---|---|---|
| gemma-4-26B-A4B-it-mlx-mxfp8 | 26.18 GB | 50.03 GB | +23.85 GB (91.1%) |
| gemma-4-e4b-it-4bit | 6.37 GB | 9.65 GB | +3.28 GB (51.5%) |
| gemma-3-4b-it-qat-4bit | 3.64 GB | 5.63 GB | +1.99 GB (54.7%) |
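
For anyone who wants to re-check the Increase column or add their own measurements, here is a minimal Python sketch of the arithmetic. The figures are copied from the table above; the helper name `memory_increase` is only illustrative and is not part of the project.

```python
# Memory figures in GB copied from the table above: (v0.3.2, v0.3.3).
measurements = {
    "gemma-4-26B-A4B-it-mlx-mxfp8": (26.18, 50.03),
    "gemma-4-e4b-it-4bit": (6.37, 9.65),
    "gemma-3-4b-it-qat-4bit": (3.64, 5.63),
}


def memory_increase(before_gb, after_gb):
    """Return (absolute increase in GB, relative increase in percent)."""
    delta = after_gb - before_gb
    return round(delta, 2), round(100 * delta / before_gb, 1)


for model, (before, after) in measurements.items():
    delta, pct = memory_increase(before, after)
    print(f"{model}: +{delta:.2f} GB ({pct:.1f}%)")
```

The percentage is relative to the v0.3.2 value, which matches how the Increase column is expressed.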
This behavior seems to happen with or without TurboQuant enabled, and with or without a small context window.
Resetting defaults for the model configuration does not seem to affect this.
Could this be related to e301d66?
Update: the issue only seems to happen when the model type is VLM. With the model type set to LLM, the memory problem does not occur and usage is as expected. As a temporary workaround, set the model type to LLM in the model settings.
Device:
MacBook Pro, M4 Pro, 48 GB RAM
26.4 Beta (25E5233c)