fix: update weight initialization method in LinearLoRA class by RayenTian · Pull Request #896 · NVIDIA-NeMo/Automodel

RayenTian · 2025-11-27T13:32:46Z

Summary

This PR fixes the weight initialization in the LinearLoRA class. The xavier initialization path was calling torch.nn.init.uniform_() instead of the proper xavier_normal_() function.

Problem

In the LinearLoRA class in nemo_automodel/components/_peft/lora.py, when init_method="xavier" is specified, the code called torch.nn.init.uniform_(), which with its default arguments samples from U(0, 1) regardless of the layer shape, instead of a Xavier initializer. This mismatch between the option name and the actual behavior could lead to suboptimal initialization and potentially affect model convergence and fine-tuning performance.

Solution

File: nemo_automodel/components/_peft/lora.py

Changed the initialization of lora_A weights from:

torch.nn.init.uniform_(self.lora_A.weight.data)

To the correct Xavier initialization:

nn.init.xavier_normal_(self.lora_A.weight.data)

Impact

Affects: LoRA weight initialization when using xavier method
Scope: Models using LoRA fine-tuning with xavier initialization
Behavior Change: The initialization now follows the Xavier/Glorot normal distribution, which is the expected behavior for xavier initialization, instead of a simple uniform distribution
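
For reviewers who want to see the size of the change, here is a minimal sketch comparing the two initializers on a stand-in for lora_A. The layer sizes (in_features=1024, rank=8) and the standalone nn.Linear are assumptions for illustration only; they are not taken from the LinearLoRA code. The expected standard deviation uses the documented Xavier normal formula, gain * sqrt(2 / (fan_in + fan_out)) with gain=1.

```python
import math

import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
in_features, rank = 1024, 8

# Stand-in for lora_A: weight has shape (rank, in_features).
lora_A = nn.Linear(in_features, rank, bias=False)

# Old behavior: uniform_ with default arguments samples from U(0, 1).
nn.init.uniform_(lora_A.weight.data)
uniform_mean = lora_A.weight.mean().item()
uniform_std = lora_A.weight.std().item()
uniform_min = lora_A.weight.min().item()

# New behavior: zero-mean normal scaled by the layer's fan-in and fan-out.
nn.init.xavier_normal_(lora_A.weight.data)
xavier_mean = lora_A.weight.mean().item()
xavier_std = lora_A.weight.std().item()

# For a Linear weight, fan_in = in_features and fan_out = rank.
expected_std = math.sqrt(2.0 / (in_features + rank))

print(f"uniform_:       mean={uniform_mean:.4f} std={uniform_std:.4f} min={uniform_min:.4f}")
print(f"xavier_normal_: mean={xavier_mean:.4f} std={xavier_std:.4f}")
print(f"expected xavier std: {expected_std:.4f}")
```

With the old call every weight is non-negative and the mean sits near 0.5, while the Xavier call gives zero-centered weights whose scale shrinks as the layer gets wider.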

Signed-off-by: ruit <[email protected]>

copy-pr-bot · 2025-11-27T13:32:49Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

hemildesai · 2025-11-27T15:51:32Z

/ok to test 7ae52ac

fix: update weight initialization method in LinearLoRA class

7ae52ac

Signed-off-by: ruit <[email protected]>

RayenTian requested review from HuiyingLi, adil-a, akoumpa and hemildesai as code owners November 27, 2025 13:32

RayenTian requested a review from joyang-nv November 27, 2025 13:32

hemildesai approved these changes Nov 27, 2025

copy-pr-bot Bot temporarily deployed to nemo-ci November 27, 2025 15:51 Inactive

copy-pr-bot Bot temporarily deployed to test November 27, 2025 15:51 Inactive

copy-pr-bot Bot temporarily deployed to nemo-ci November 27, 2025 15:58 Inactive

copy-pr-bot Bot temporarily deployed to nemo-ci November 27, 2025 16:22 Inactive

hemildesai merged commit 2d20e33 into main Nov 27, 2025
58 checks passed

hemildesai deleted the ruit/lora_init branch November 27, 2025 23:54

RayenTian mentioned this pull request Dec 1, 2025

feat: LoRA SFT support for DTensorV2 path NVIDIA-NeMo/RL#1556

Merged

2 tasks

linnanwang pushed a commit that referenced this pull request Apr 24, 2026

fix: update weight initialization method in LinearLoRA class (#896)

2719ed3
