[None][fix] Fix LoRA support for Qwen3 models by achartier · Pull Request #12785 · NVIDIA/TensorRT-LLM

achartier · 2026-04-06T22:35:33Z

Summary by CodeRabbit

Tests
- Added comprehensive end-to-end test coverage for Qwen3 LoRA functionality across both dense and Mixture-of-Experts model configurations, validating adapter creation, inference execution, and output correctness.
Improvements
- Enhanced internal parameter propagation through model decoder layers for improved extensibility.

Description

Qwen3 Dense: pass layer_idx to gatedMLP constructor call
Qwen3 MoE: ad **kwargs to decoder layer call
Add regression tests for both Qwen3 models

Test Coverage

New tests: tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py

PR Checklist

Please review the following before submitting your PR:

PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

Signed-off-by: Aurelien Chartier <[email protected]>

achartier · 2026-04-06T22:37:39Z

/bot run --disable-fail-fast

coderabbitai · 2026-04-06T22:39:52Z

📝 Walkthrough

Walkthrough

Updates to Qwen3 model initialization to pass layer index information to GatedMLP components and forward additional keyword arguments through decoder layers. Adds comprehensive LoRA sanity test coverage for both dense and MoE Qwen3 configurations, including adapter creation, inference execution with and without LoRA, and output validation.

Changes

Cohort / File(s)	Summary
Qwen3 Model Parameter Passing `tensorrt_llm/_torch/models/modeling_qwen3.py`, `tensorrt_llm/_torch/models/modeling_qwen3_moe.py`	`Qwen3DecoderLayer` now passes `layer_idx` to `GatedMLP` constructor. `Qwen3MoEModel.forward` forwards `**kwargs` to decoder layer calls to enable propagation of additional runtime options.
LoRA Sanity Testing `tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py`	New test module with helper functions for LoRA adapter creation and validation. Implements end-to-end sanity tests for Qwen3 (0.6B) dense and Qwen3-30B-A3B MoE configurations, comparing inference outputs with and without LoRA adapters applied.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main change: fixing LoRA support for Qwen3 models, which aligns with the changeset's focus on LoRA-related fixes and tests.
Description check	✅ Passed	The PR description provides clear bullet points about the changes (layer_idx parameter, kwargs forwarding, test additions) and identifies the new test file, satisfying core requirements.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

🧹 Nitpick comments (1)

tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py (1)

128-149: Minor: Inner loop break doesn't exit outer loop.

When any_differ is set in the inner logprobs comparison loop (line 146), the code breaks out of the inner loop but continues iterating through remaining (lora_out, base_out) pairs unnecessarily. This is functionally correct but slightly inefficient.

♻️ Optional: Use labeled loop pattern or helper

 def _assert_lora_changes_output(out_lora, out_base):
     """Assert that LoRA produces at least one different output (tokens or logprobs)."""
     any_differ = False
     for lora_out, base_out in zip(out_lora, out_base, strict=True):
+        if any_differ:
+            break
         lora_ids = lora_out.outputs[0].token_ids
         base_ids = base_out.outputs[0].token_ids
         if lora_ids != base_ids:
             any_differ = True
             break

         # Even if tokens match, logprobs should differ
         lp_lora = lora_out.outputs[0].logprobs
         lp_base = base_out.outputs[0].logprobs
         if lp_lora and lp_base:
             for lp_w, lp_wo in zip(lp_lora, lp_base, strict=True):
                 val_w = next(iter(lp_w.values())).logprob
                 val_wo = next(iter(lp_wo.values())).logprob
                 if abs(val_w - val_wo) > 1e-6:
                     any_differ = True
                     break
+            if any_differ:
+                break

     assert any_differ, "LoRA outputs identical to base model (same tokens AND same logprobs)"

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py` around
lines 128 - 149, The helper _assert_lora_changes_output currently sets
any_differ inside the inner logprobs loop then breaks only that inner loop,
causing the outer for-loop over (lora_out, base_out) to continue unnecessarily;
to fix, make the function return immediately when a difference is detected (or
set any_differ and break the outer loop explicitly) so we stop processing
further pairs as soon as a differing token or logprob is found—update the inner
loop where val_w vs val_wo triggers any_differ to either call return or break
out of both loops (e.g., by returning from _assert_lora_changes_output) to
short-circuit the check.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py`:
- Around line 128-149: The helper _assert_lora_changes_output currently sets
any_differ inside the inner logprobs loop then breaks only that inner loop,
causing the outer for-loop over (lora_out, base_out) to continue unnecessarily;
to fix, make the function return immediately when a difference is detected (or
set any_differ and break the outer loop explicitly) so we stop processing
further pairs as soon as a differing token or logprob is found—update the inner
loop where val_w vs val_wo triggers any_differ to either call return or break
out of both loops (e.g., by returning from _assert_lora_changes_output) to
short-circuit the check.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 94e6104d-c733-44c0-8865-a8ce78fa615c

📥 Commits

Reviewing files that changed from the base of the PR and between 2b80f8d and e92e1eb.

📒 Files selected for processing (3)

tensorrt_llm/_torch/models/modeling_qwen3.py
tensorrt_llm/_torch/models/modeling_qwen3_moe.py
tests/unittest/_torch/modules/tests_lora_modules/test_qwen3_sanity.py

tensorrt-cicd · 2026-04-06T22:43:56Z

PR_Github #42005 [ run ] triggered by Bot. Commit: e92e1eb Link to invocation

achartier · 2026-04-06T22:44:24Z

@CodeRabbit review

coderabbitai · 2026-04-06T22:44:30Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

tensorrt-cicd · 2026-04-07T02:57:37Z

PR_Github #42005 [ run ] completed with state SUCCESS. Commit: e92e1eb
/LLM/main/L0_MergeRequest_PR pipeline #32852 completed with status: 'SUCCESS'
Pipeline passed with automatic retried tests. Check the rerun report for details.

CI Report

Link to invocation

Fix LoRA support for Qwen3 models

e92e1eb

Signed-off-by: Aurelien Chartier <[email protected]>

achartier requested review from a team as code owners April 6, 2026 22:35

achartier requested review from 2ez4bz and dongjiyingdjy April 6, 2026 22:35

github-actions Bot assigned achartier Apr 6, 2026

coderabbitai Bot reviewed Apr 6, 2026

View reviewed changes

achartier requested a review from byshiue April 7, 2026 15:20

byshiue approved these changes Apr 8, 2026

View reviewed changes

achartier merged commit ae86b91 into NVIDIA:main Apr 8, 2026
8 checks passed

achartier deleted the qwen3-lora branch April 8, 2026 04:33

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[None][fix] Fix LoRA support for Qwen3 models#12785

[None][fix] Fix LoRA support for Qwen3 models#12785
achartier merged 1 commit into
NVIDIA:mainfrom
achartier:qwen3-lora

achartier commented Apr 6, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

achartier commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

tensorrt-cicd commented Apr 6, 2026

Uh oh!

achartier commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026

Uh oh!

tensorrt-cicd commented Apr 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

achartier commented Apr 6, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Description

Test Coverage

PR Checklist

GitHub Bot Help

Uh oh!

achartier commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

tensorrt-cicd commented Apr 6, 2026

Uh oh!

achartier commented Apr 6, 2026

Uh oh!

coderabbitai Bot commented Apr 6, 2026

Uh oh!

tensorrt-cicd commented Apr 7, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

achartier commented Apr 6, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Apr 6, 2026 •

edited

Loading