Summary
Inner model classes store their ModelArgs under inconsistent attribute names: some use self.args, others use self.config. This breaks downstream code that needs to read model configuration programmatically (e.g., hidden_size for activation collection).
The Problem
The outer Model class consistently uses self.args across all implementations. However, inner model classes (e.g., LlamaModel, Gemma4TextModel) are split:
- self.args: used by ~103 model files (llama, mistral, qwen, phi3, etc.)
- self.config: used by ~15 model files (gemma4_text, gemma3n, deepseek_v2/v3, plamo2, etc.)
Concrete example
from mlx_lm import load
model, tokenizer = load("google/gemma-4-26b-a4b-it")
inner = model.language_model.model # Gemma4TextModel
inner.args.hidden_size # ❌ AttributeError: 'Gemma4TextModel' has no attribute 'args'
inner.config.hidden_size # ✅ works
Compare with Llama:
model, tokenizer = load("meta-llama/Llama-3.1-8B-Instruct")
inner = model.model # LlamaModel
inner.args.hidden_size # ✅ works
inner.config.hidden_size # ❌ AttributeError
Affected models (using self.config in inner classes)
gemma4_text.py, gemma3n.py, deepseek.py, deepseek_v2.py, deepseek_v3.py, deepseek_v32.py, plamo.py, plamo2.py, glm4_moe.py, glm4_moe_lite.py, nemotron_h.py, youtu_llm.py, mimo_v2_flash.py, baichuan_m1.py, recurrent_gemma.py
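A list like the one above can be re-derived with a rough scan of the model sources. The sketch below is only a heuristic: it flags any file that assigns self.config anywhere, without checking which class the assignment sits in, and the helper names (config_attr_names, files_using_config) are made up for illustration.

```python
import re
from pathlib import Path

# Matches statements such as "self.args = args" or "self.config = config".
ASSIGN = re.compile(r"^\s*self\.(args|config)\s*=", re.MULTILINE)


def config_attr_names(source: str) -> set[str]:
    """Return which of 'args' / 'config' are assigned on self in the source."""
    return set(ASSIGN.findall(source))


def files_using_config(models_dir: str) -> list[str]:
    """List the .py files in models_dir that assign self.config somewhere."""
    return sorted(
        path.name
        for path in Path(models_dir).glob("*.py")
        if "config" in config_attr_names(path.read_text(encoding="utf-8"))
    )
```

Calling files_using_config on the directory that holds the model files in a local checkout should give a candidate list; each hit still needs a manual look, since the regex cannot tell an inner model class from any other class in the file.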
Impact
Any code that accesses inner model configuration needs model-specific branching:
# Workaround users currently need
config = getattr(inner, 'args', None) or getattr(inner, 'config', None)
hidden_size = config.hidden_size
This affects use cases like:
- Activation collection for mechanistic interpretability
- Custom forward passes for abliteration / steering
- Model surgery and weight editing
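The workaround in this section can be wrapped in a small helper so the branching lives in one place. This is a minimal sketch, not part of mlx-lm: get_inner_config is a hypothetical name, and the SimpleNamespace objects are stand-ins for real inner model instances, with arbitrary hidden_size values.

```python
from types import SimpleNamespace


def get_inner_config(module, names=("args", "config")):
    """Return the first non-None attribute among `names` found on `module`."""
    for name in names:
        cfg = getattr(module, name, None)
        if cfg is not None:
            return cfg
    raise AttributeError(
        f"{type(module).__name__} has none of the attributes {names}"
    )


# Stand-ins for inner models (assumption: real models expose one of the two names).
llama_like = SimpleNamespace(args=SimpleNamespace(hidden_size=64))
gemma_like = SimpleNamespace(config=SimpleNamespace(hidden_size=128))

for inner in (llama_like, gemma_like):
    print(get_inner_config(inner).hidden_size)
```

Unlike the one-line `or` form, this checks explicitly for None and raises a clear error when neither attribute exists, instead of failing later on None.hidden_size.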
Suggested Fix
Standardize on self.args across all inner model classes, matching the convention already used by the majority (~103) of implementations. In most __init__ signatures the ModelArgs instance is already passed in as args.
A compatibility shim (e.g., a property alias) could ease the transition, but since inner model classes are not part of a documented public API, a direct rename may be sufficient.
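One possible shape for the property alias is sketched below. It uses plain Python classes as stand-ins (InnerModel and this ModelArgs are illustrative, not the mlx-lm definitions), and it has not been checked against mlx.nn.Module, whose attribute handling may need extra care.

```python
from dataclasses import dataclass


@dataclass
class ModelArgs:
    # Illustrative stand-in with a single field.
    hidden_size: int


class InnerModel:
    def __init__(self, args: ModelArgs):
        # Canonical attribute after the rename.
        self.args = args

    @property
    def config(self) -> ModelArgs:
        # Read-only alias so existing code using .config keeps working.
        return self.args


inner = InnerModel(ModelArgs(hidden_size=32))
print(inner.args.hidden_size, inner.config.hidden_size)
```

Because the alias returns the same object, both names always agree; the property could later emit a deprecation warning before being removed.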
Environment
- mlx-lm version: 0.31.2
- Model: google/gemma-4-26b-a4b-it
- Platform: macOS (Apple Silicon)