[Mistral3] attn_implementation not applied to vision_tower.config in Mistral3Config due to init order

### System Info

- `transformers` version: 4.55.0
- Platform: Linux-6.8.0-60-generic-x86_64-with-glibc2.35
- Python version: 3.13.6
- Huggingface_hub version: 0.34.3
- Safetensors version: 0.6.1
- Accelerate version: 1.9.0
- Accelerate config:    not found
- DeepSpeed version: 0.17.4
- PyTorch version (accelerator?): 2.7.1+cu126 (CUDA)
- Tensorflow version (GPU?): not installed (NA)
- Flax version (CPU?/GPU?/TPU?): not installed (NA)
- Jax version: not installed
- JaxLib version: not installed
- Using distributed or parallel set-up in script?: No
- Using GPU in script?: Yes
- GPU type: NVIDIA H100 80GB HBM3

### Who can help?

_No response_

### Information

- [ ] The official example scripts
- [ ] My own modified scripts

### Tasks

- [ ] An officially supported task in the `examples` folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)

### Reproduction

```python
from transformers import AutoModelForImageTextToText

model = AutoModelForImageTextToText.from_pretrained(
    "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    torch_dtype="bfloat16",
    attn_implementation="flash_attention_2",
)

print(model.config._attn_implementation)               # 'flash_attention_2'
print(model.vision_tower.config._attn_implementation)  # 'sdpa'
```

### Expected behavior

Both `model.config._attn_implementation` and `model.vision_tower.config._attn_implementation` should match the passed `attn_implementation` argument.

### Cause

In `Mistral3Config`, `super().__init__` is called before `self.vision_config` is initialized.

https://github.com/huggingface/transformers/blob/f4d57f2f0cdff0f63ee74a1f16f442dfaf525231/src/transformers/models/mistral3/configuration_mistral3.py#L88

The `super().__init__` call triggers the `_attn_implementation` setter, which attempts to update the vision config. At that point `self.vision_config` does not exist yet, so the update is skipped.

https://github.com/huggingface/transformers/blob/f4d57f2f0cdff0f63ee74a1f16f442dfaf525231/src/transformers/configuration_utils.py#L419
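To illustrate the ordering problem in isolation, here is a minimal sketch that does not use `transformers` at all. The classes `BaseConfig`, `VisionConfig`, `BuggyComposite`, and `FixedComposite` are hypothetical stand-ins, and the `sdpa` default for the vision config is an assumption chosen to mirror the output in the reproduction above. It only models the pattern described here (a setter in the base `__init__` that propagates to sub-configs), not the actual library code.

```python
class BaseConfig:
    """Toy base config: its setter propagates the value to known sub-configs."""

    sub_configs = ("vision_config",)

    def __init__(self, attn_implementation=None):
        self._attn_implementation_internal = None
        # Assigning here goes through the property setter below.
        self._attn_implementation = attn_implementation

    @property
    def _attn_implementation(self):
        return self._attn_implementation_internal

    @_attn_implementation.setter
    def _attn_implementation(self, value):
        self._attn_implementation_internal = value
        for name in self.sub_configs:
            sub = getattr(self, name, None)
            # If the sub-config attribute is not set yet, it is silently skipped.
            if sub is not None:
                sub._attn_implementation = value


class VisionConfig(BaseConfig):
    sub_configs = ()  # leaf config, nothing to propagate to


class BuggyComposite(BaseConfig):
    def __init__(self, attn_implementation=None):
        # Base __init__ runs first, so the setter cannot see vision_config.
        super().__init__(attn_implementation=attn_implementation)
        self.vision_config = VisionConfig(attn_implementation="sdpa")  # assumed default


class FixedComposite(BaseConfig):
    def __init__(self, attn_implementation=None):
        # Sub-config exists before the base __init__ triggers the setter.
        self.vision_config = VisionConfig(attn_implementation="sdpa")  # assumed default
        super().__init__(attn_implementation=attn_implementation)


buggy = BuggyComposite("flash_attention_2")
fixed = FixedComposite("flash_attention_2")

print(buggy._attn_implementation, buggy.vision_config._attn_implementation)
print(fixed._attn_implementation, fixed.vision_config._attn_implementation)
```

In this toy model, `BuggyComposite` ends up with mismatched values while `FixedComposite` propagates the setting. Moving the sub-config initialization ahead of `super().__init__` is one possible direction for a fix. Whether that is the right change for `Mistral3Config` depends on what else the base initializer expects.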
