Memory increase after v0.3.3 #582

@semvis123

Log images, per version and model:

## V0.3.3

**gemma-4-26B-A4B-it-mlx-mxfp8** [2 images]

**gemma-4-e4b-it-4bit** [1 image]

**gemma-3-4b-it-qat-4bit** [1 image]

## V0.3.2

**gemma-4-26B-A4B-it-mlx-mxfp8** [2 images]

**gemma-4-e4b-it-4bit** [1 image]

**gemma-3-4b-it-qat-4bit** [1 image]

| Model | v0.3.2 | v0.3.3 | Increase |
| --- | --- | --- | --- |
| gemma-4-26B-A4B-it-mlx-mxfp8 | 26.18 GB | 50.03 GB | +23.85 GB (91.1%) |
| gemma-4-e4b-it-4bit | 6.37 GB | 9.65 GB | +3.28 GB (51.5%) |
| gemma-3-4b-it-qat-4bit | 3.64 GB | 5.63 GB | +1.99 GB (54.7%) |
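
For anyone who wants to add their own measurements, here is a minimal sketch of how the increase figures can be recomputed. The `increase` helper and the `measurements` dictionary are hypothetical names used only for illustration. The numbers are the ones reported in this issue.

```python
def increase(before_gb: float, after_gb: float) -> tuple[float, float]:
    """Return (absolute increase in GB, relative increase in percent)."""
    delta = after_gb - before_gb
    percent = delta / before_gb * 100
    return round(delta, 2), round(percent, 1)


# Memory usage reported in this issue: model -> (v0.3.2, v0.3.3), in GB.
measurements = {
    "gemma-4-26B-A4B-it-mlx-mxfp8": (26.18, 50.03),
    "gemma-4-e4b-it-4bit": (6.37, 9.65),
    "gemma-3-4b-it-qat-4bit": (3.64, 5.63),
}

for model, (old, new) in measurements.items():
    delta, percent = increase(old, new)
    print(f"{model}: +{delta:.2f} GB ({percent:.1f}%)")
```

The percentage is taken relative to the v0.3.2 value, so the largest model nearly doubles its footprint while the two smaller ones grow by roughly half.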

This behavior seems to happen with or without TurboQuant enabled, and with or without a small context window.
Resetting defaults for the model configuration does not seem to affect this.

Could this be related to e301d66?

Update: the issue only seems to occur when the model type is VLM. With the model type set to LLM, memory usage is as expected, so as a temporary workaround you can set the model type to LLM in the model settings.

Device:
MacBook Pro, M4 Pro, 48 GB RAM
macOS 26.4 Beta (25E5233c)
