[https://nvbugs/6141606][fix] Move the layer_types derivation into Qwen3HybridConfig.from_hf (where `pretr#13832

Merged
longlee0622 merged 1 commit into NVIDIA:main from tensorrt-cicd:repair-bot-bug6141606
May 22, 2026

Conversation

@tensorrt-cicd
Collaborator

@tensorrt-cicd tensorrt-cicd commented May 7, 2026

Summary

  • Root cause: set_values_if_none called load_pretrained_config(self.name, ...) where self.name is the bench alias (e.g. qwen3.5_9b_hf), not the checkpoint path, so transformers tried to resolve it as an HF hub repo id and failed with 401/OSError.
  • Fix: Move the layer_types derivation into Qwen3HybridConfig.from_hf (where pretrained_config is already correctly loaded from hf_model_path) and pass the resulting num_attention_layers/num_linear_attention_layers through the constructor, so the validator never needs to re-resolve by name.
  • Automated fix generated by repair-bot
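
The pattern described in the fix can be sketched roughly as follows. This is a minimal illustration, not the code in tensorrt_llm/bench/build/dataclasses.py: `HybridConfigSketch`, `load_config_from_path`, `FAKE_CHECKPOINTS`, the example path, and the `"full_attention"`/`"linear_attention"` strings are all hypothetical stand-ins chosen for the example, and the fallback to the name when no path is given is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical in-memory "checkpoints", keyed by local path. A real loader
# would read the config from disk instead.
FAKE_CHECKPOINTS: Dict[str, Dict[str, List[str]]] = {
    "/models/qwen3_hybrid": {
        "layer_types": ["linear_attention"] * 3 + ["full_attention"],
    },
}


def load_config_from_path(path: str) -> Dict[str, List[str]]:
    """Hypothetical stand-in for loading a pretrained config."""
    if path not in FAKE_CHECKPOINTS:
        # Mirrors the reported failure mode: a bench alias is not a
        # resolvable checkpoint location.
        raise OSError(f"cannot resolve {path!r} as a checkpoint")
    return FAKE_CHECKPOINTS[path]


@dataclass
class HybridConfigSketch:
    name: str  # bench alias, not a path
    num_attention_layers: int
    num_linear_attention_layers: int

    @classmethod
    def from_hf(
        cls, model_hf_name: str, hf_model_path: Optional[str]
    ) -> "HybridConfigSketch":
        # Derive the counts here, where the checkpoint location is known,
        # and hand them to the constructor so nothing is re-resolved by name.
        # Falling back to the name is an assumption made for this sketch.
        pretrained = load_config_from_path(hf_model_path or model_hf_name)
        layer_types = pretrained["layer_types"]
        return cls(
            name=model_hf_name,
            num_attention_layers=layer_types.count("full_attention"),
            num_linear_attention_layers=layer_types.count("linear_attention"),
        )


cfg = HybridConfigSketch.from_hf("qwen3.5_9b_hf", "/models/qwen3_hybrid")
print(cfg.num_attention_layers, cfg.num_linear_attention_layers)  # prints: 1 3
```

Calling `load_config_from_path("qwen3.5_9b_hf")` directly raises `OSError` in this sketch, which is the analogue of the validator resolving by alias.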

Test plan

  • Verify fix on the same GPU type as the original failure
  • Check for regressions in related tests

Links

Summary by CodeRabbit

  • Refactor
    • Streamlined Qwen3 hybrid model configuration initialization by computing attention layer type counts during model loading instead of post-processing. This improves configuration accuracy and efficiency when working with Qwen3 hybrid models.

@coderabbitai
Contributor

coderabbitai Bot commented May 7, 2026

Review Change Stack

📝 Walkthrough

Qwen3HybridConfig moves default-value derivation from a post-initialization validator into the from_hf classmethod. from_hf now loads the pretrained Hugging Face config, computes the hybrid layer type counts, and sets num_attention_layers and num_linear_attention_layers before the instance is created. The previous @model_validator(mode="after") method, set_values_if_none, is removed.

Changes

Qwen3HybridConfig Derivation Migration

Layer | File(s) | Summary
Field Schema | tensorrt_llm/bench/build/dataclasses.py | Class field declarations for linear-attention parameters and mamba_ssm_cache_dtype remain; the @model_validator(mode="after") method set_values_if_none is removed.
Constructor Logic | tensorrt_llm/bench/build/dataclasses.py | from_hf classmethod is updated to load pretrained config, invoke get_qwen3_hybrid_layer_types to compute layer type counts, and set num_attention_layers and num_linear_attention_layers defaults before instantiating the model.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold.
Description check | ❓ Inconclusive | The PR description explains the root cause, fix, and test plan but lacks the structured format requested by the template (sections for Description, Test Coverage, Checklist). | Restructure the description to follow the template sections (Description, Test Coverage, PR Checklist) for consistency and clarity.
✅ Passed checks (3 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly summarizes the main change: moving layer_types derivation logic from a post-init validator into the from_hf constructor.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
tensorrt_llm/bench/build/dataclasses.py (1)

275-276: ⚡ Quick win

Add explicit type annotations to from_hf.

This modified method should annotate inputs and return type for mypy/static checks.

Proposed fix
     @classmethod
-    def from_hf(cls, model_hf_name, hf_model_path):
+    def from_hf(cls, model_hf_name: str, hf_model_path: str | None) -> "Qwen3HybridConfig":

As per coding guidelines, "Always annotate functions; make the return type None if the function does not return anything".
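
A minimal sketch of the suggested annotation style, using a hypothetical `AnnotatedConfigSketch` class rather than the real Qwen3HybridConfig. It uses `Optional[str]` and a quoted class name as the return type, which work on older Python versions; the `str | None` spelling in the proposed fix needs Python 3.10+ when annotations are evaluated at runtime, and `typing.Self` needs Python 3.11+.

```python
from typing import Optional


class AnnotatedConfigSketch:
    """Hypothetical class used only to illustrate the annotation style."""

    def __init__(self, name: str, path: Optional[str]) -> None:
        self.name = name
        self.path = path

    @classmethod
    def from_hf(
        cls, model_hf_name: str, hf_model_path: Optional[str]
    ) -> "AnnotatedConfigSketch":
        # The quoted return type is a forward reference to the enclosing
        # class, which is not yet bound while the class body executes.
        return cls(model_hf_name, hf_model_path)


instance = AnnotatedConfigSketch.from_hf("qwen3.5_9b_hf", None)
```

With annotations in place, a static checker such as mypy can flag callers that pass a non-string name or treat the result as a different type.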

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/bench/build/dataclasses.py` around lines 275 - 276, Annotate the
classmethod signature for from_hf by adding explicit types: change def
from_hf(cls, model_hf_name, hf_model_path): to def from_hf(cls, model_hf_name:
str, hf_model_path: Optional[str]) -> Self (or -> "YourClassName" if not using
Python 3.11), and add the required imports (from typing import Optional and, if
available, Self; otherwise import typing and use typing.Any or the concrete
class name). Ensure the annotation matches other dataclasses in this module and
that mypy/static checks will accept the return type.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 34a2b780-3527-4677-abcb-e9bb987f2219

📥 Commits

Reviewing files that changed from the base of the PR and between c20b192 and 68ad06d.

📒 Files selected for processing (1)
  • tensorrt_llm/bench/build/dataclasses.py

@FrankD412
Collaborator

lgtm!

…aded config

The Qwen3HybridConfig.set_values_if_none validator called
load_pretrained_config(self.name, ...) where self.name is the bench alias
(e.g. qwen3.5_9b_hf), not an actual path. Transformers then tried to
resolve it as a hub repo id and failed with a 401/OSError.

Move the layer_types derivation into from_hf, where the pretrained_config
is already loaded from the correct hf_model_path, and pass the derived
num_attention_layers/num_linear_attention_layers through the constructor.

Signed-off-by: tensorrt-cicd <[email protected]>
@longlee0622 longlee0622 force-pushed the repair-bot-bug6141606 branch from 68ad06d to 353569d Compare May 21, 2026 07:31
@longlee0622 longlee0622 enabled auto-merge (squash) May 21, 2026 07:32
@longlee0622
Collaborator

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator Author

PR_Github #49637 [ run ] triggered by Bot. Commit: 353569d Link to invocation

@tensorrt-cicd
Collaborator Author

PR_Github #49637 [ run ] completed with state SUCCESS. Commit: 353569d
/LLM/main/L0_MergeRequest_PR pipeline #39250 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@longlee0622
Collaborator

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator Author

PR_Github #49817 [ run ] triggered by Bot. Commit: 353569d Link to invocation

@tensorrt-cicd
Collaborator Author

PR_Github #49817 [ run ] completed with state SUCCESS. Commit: 353569d
/LLM/main/L0_MergeRequest_PR pipeline #39403 completed with status: 'SUCCESS'

CI Report

Link to invocation

@longlee0622 longlee0622 merged commit 6dfaf28 into NVIDIA:main May 22, 2026
8 checks passed
KleinBlueC pushed a commit to KleinBlueC/TensorRT-LLM that referenced this pull request May 26, 2026
bmarimuthu-nv pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request May 28, 2026
4 participants