Memory increase after v0.3.3

<details>
<summary>Log images (Click to expand)</summary>

## V0.3.3
**gemma-4-26B-A4B-it-mlx-mxfp8**

<img width="745" height="281" alt="Image" src="https://github.com/user-attachments/assets/9edb4315-d211-449b-b7aa-b01082d87466" />

<img width="1216" height="241" alt="Image" src="https://github.com/user-attachments/assets/e5684ebc-6ae9-4525-a420-a2c85a3354b9" />

**gemma-4-e4b-it-4bit**

<img width="827" height="188" alt="Image" src="https://github.com/user-attachments/assets/4cbb198d-c1ea-44a5-9d3e-c5e1844bff74" />

**gemma-3-4b-it-qat-4bit**

<img width="699" height="235" alt="Image" src="https://github.com/user-attachments/assets/471f99ea-5fb2-4da3-b243-d7e740d86341" />

## V0.3.2

**gemma-4-26B-A4B-it-mlx-mxfp8**

<img width="750" height="319" alt="Image" src="https://github.com/user-attachments/assets/4abe56bc-66dd-4cf6-93f2-f7fa59b43ceb" />

<img width="1198" height="85" alt="Image" src="https://github.com/user-attachments/assets/81f00c3b-4a5c-49f4-97d9-026f16c3c993" />

**gemma-4-e4b-it-4bit**

<img width="668" height="170" alt="Image" src="https://github.com/user-attachments/assets/2cfe48a9-6b08-4937-b111-394bb57a250d" />

**gemma-3-4b-it-qat-4bit**

<img width="880" height="223" alt="Image" src="https://github.com/user-attachments/assets/26cba9f2-ada7-4f0b-90c5-4c960d3108b9" />

</details>

| model                        | v0.3.2   | v0.3.3   | Increase              |
| ---------------------------- | -------- | -------- | --------------------- |
| gemma-4-26B-A4B-it-mlx-mxfp8 | 26.18 GB | 50.03 GB | **+23.85 GB (91.1%)** |
| gemma-4-e4b-it-4bit          | 6.37 GB  | 9.65 GB  | **+3.28 GB (51.5%)**  |
| gemma-3-4b-it-qat-4bit       | 3.64 GB  | 5.63 GB  | **+1.99 GB (54.7%)**  |
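For anyone who wants to check the "Increase" column or add their own measurements, here is a minimal sketch of how the figures were derived. The numbers are copied from the table above; the helper name `increase` is only illustrative and is not part of oMLX.

```python
# Memory usage in GB, copied from the table above: (v0.3.2, v0.3.3)
measurements = {
    "gemma-4-26B-A4B-it-mlx-mxfp8": (26.18, 50.03),
    "gemma-4-e4b-it-4bit": (6.37, 9.65),
    "gemma-3-4b-it-qat-4bit": (3.64, 5.63),
}


def increase(before_gb, after_gb):
    """Return (absolute increase in GB, relative increase in percent)."""
    delta = after_gb - before_gb
    percent = delta / before_gb * 100
    return round(delta, 2), round(percent, 1)


for model, (before, after) in measurements.items():
    delta, percent = increase(before, after)
    print(f"{model}: +{delta:.2f} GB ({percent:.1f}%)")
```

The percentage is relative to the v0.3.2 value, so 91.1% means the largest model uses nearly twice as much memory as before.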


The increase seems to occur whether or not TurboQuant is enabled, and regardless of whether the context window is small or large.
Resetting the model configuration to its defaults does not seem to change it.

Could this be related to https://github.com/jundot/omlx/commit/e301d6677eb0e5006415d0ff082a3165850e09e6?

**Update**: The issue only seems to occur when the model type is VLM. With the model type set to LLM, the memory increase does not occur and usage stays at the expected level. As a temporary workaround, you can set the model type to LLM in the model settings.

Device:
MacBook Pro, M4 Pro, 48 GB RAM
macOS 26.4 Beta (25E5233c)

