Inconsistent config attribute naming: .args vs .config across model implementations #1107

@eddieran

Summary

Inner model classes use inconsistent attribute names for storing ModelArgs — some use self.args, others use self.config. This breaks downstream code that needs to access model configuration programmatically (e.g., hidden_size for activation collection).

The Problem

The outer Model class consistently uses self.args across all implementations. However, inner model classes (e.g., LlamaModel, Gemma4TextModel) are split:

  • self.args — used by ~103 model files (llama, mistral, qwen, phi3, etc.)
  • self.config — used by ~15 model files (gemma4_text, gemma3n, deepseek_v2/v3, plamo2, etc.)

Concrete example

from mlx_lm import load

model, tokenizer = load("google/gemma-4-26b-a4b-it")
inner = model.language_model.model  # Gemma4TextModel

inner.args.hidden_size   # ❌ AttributeError: 'Gemma4TextModel' has no attribute 'args'
inner.config.hidden_size # ✅ works

Compare with Llama:

model, tokenizer = load("meta-llama/Llama-3.1-8B-Instruct")
inner = model.model  # LlamaModel

inner.args.hidden_size   # ✅ works
inner.config.hidden_size # ❌ AttributeError

Affected models (using self.config in inner classes)

gemma4_text.py, gemma3n.py, deepseek.py, deepseek_v2.py, deepseek_v3.py, deepseek_v32.py, plamo.py, plamo2.py, glm4_moe.py, glm4_moe_lite.py, nemotron_h.py, youtu_llm.py, mimo_v2_flash.py, baichuan_m1.py, recurrent_gemma.py
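A list like the one above could be regenerated with a small audit script. The following is only a sketch: it assumes the configuration is assigned directly as `self.args = ...` or `self.config = ...`, it does not distinguish inner classes from the outer Model class, and the directory passed to the hypothetical `audit` helper depends on where mlx-lm is installed.

```python
import re
from pathlib import Path

# Matches assignments such as `self.args = args` or `self.config = config`.
# Assumption: the attribute is assigned directly under one of these two names.
ASSIGN_RE = re.compile(r"^\s*self\.(args|config)\s*=(?!=)", re.MULTILINE)


def config_attr_names(source: str) -> set:
    """Return which of {'args', 'config'} are assigned on self in the source text."""
    return set(ASSIGN_RE.findall(source))


def audit(models_dir: str) -> dict:
    """Map each .py file name in models_dir to the attribute names it assigns."""
    return {
        path.name: config_attr_names(path.read_text(encoding="utf-8"))
        for path in sorted(Path(models_dir).glob("*.py"))
    }
```

Files whose result contains "config" are candidates for the rename. Each hit still needs a manual check, since the regex cannot tell which class the assignment belongs to.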

Impact

Any code that accesses inner model configuration needs model-specific branching:

# Workaround users currently need
config = getattr(inner, 'args', None) or getattr(inner, 'config', None)
hidden_size = config.hidden_size

This affects use cases like:

  • Activation collection for mechanistic interpretability
  • Custom forward passes for abliteration / steering
  • Model surgery and weight editing
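Until the naming is unified, the workaround above can be wrapped in a small helper that fails with a clear message when neither attribute exists. This is a sketch, not part of mlx-lm: `get_model_config` is a hypothetical name, and the stand-in objects and their hidden sizes are made up for illustration.

```python
from types import SimpleNamespace


def get_model_config(module):
    """Return the config object stored on an inner model, whichever name it uses."""
    for name in ("args", "config"):
        config = getattr(module, name, None)
        if config is not None:
            return config
    raise AttributeError(
        f"{type(module).__name__} has neither an 'args' nor a 'config' attribute"
    )


# Stand-ins for real inner models; the sizes are arbitrary illustrative values.
llama_like = SimpleNamespace(args=SimpleNamespace(hidden_size=64))
gemma_like = SimpleNamespace(config=SimpleNamespace(hidden_size=128))

for inner in (llama_like, gemma_like):
    print(get_model_config(inner).hidden_size)
```

Unlike the one-line `or` version, this raises a descriptive error at lookup time. The one-liner instead fails later with an attribute error on `None`.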

Suggested Fix

Standardize on self.args across all inner model classes, matching the convention already used by the majority (~103) of implementations. The constructor parameter that receives the ModelArgs dataclass is already named args in most __init__ signatures.

A compatibility shim (e.g., a property alias) could ease the transition, but since inner model classes are not part of a documented public API, a direct rename may be sufficient.
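A property alias of the kind mentioned above might look like the following sketch. `InnerModel` is a plain-Python stand-in and `ModelArgs` is reduced to a single field. The real classes subclass `mlx.nn.Module`, and whether a property behaves identically there has not been verified here.

```python
from dataclasses import dataclass


@dataclass
class ModelArgs:
    # Reduced to one field for illustration; the real dataclass has many more.
    hidden_size: int


class InnerModel:
    """Stand-in for an inner model class after the rename to self.args."""

    def __init__(self, args: ModelArgs):
        self.args = args

    @property
    def config(self) -> ModelArgs:
        # Read-only alias kept so code that still uses the old name keeps working.
        return self.args


inner = InnerModel(ModelArgs(hidden_size=32))
print(inner.args.hidden_size, inner.config.hidden_size)
```

Because the alias returns the same object rather than a copy, both names always see the same values.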

Environment

  • mlx-lm version: 0.31.2
  • Model: google/gemma-4-26b-a4b-it
  • Platform: macOS (Apple Silicon)
