Skip to content

[None][fix] Handle unset attention_dp_relax in ADP routers#14276

Merged
peihu-nv merged 2 commits into
NVIDIA:mainfrom
peihu-nv:fix/adp-relax-none
May 20, 2026
Merged

[None][fix] Handle unset attention_dp_relax in ADP routers#14276
peihu-nv merged 2 commits into
NVIDIA:mainfrom
peihu-nv:fix/adp-relax-none

Conversation

@peihu-nv
Copy link
Copy Markdown
Collaborator

@peihu-nv peihu-nv commented May 18, 2026

Summary by CodeRabbit

  • Bug Fixes

    • Fixed request scheduling to correctly treat unset parameters as permitting relaxed routing mode, ensuring proper request distribution under capacity constraints.
  • Tests

    • Added unit tests verifying request routing behavior with different parameter configurations.

Review Change Stack

Description

Fix an ADP router crash when a request has SchedulingParams, but attention_dp_relax is unset and causing NoneType comparison. Hit this issue when doing perfwork for DS V4 using inferenceX benchmarking script but both main and DS V4 branch will have this issue.

SchedulingParams.attention_dp_relax is typed as Optional[bool] and defaults to None.

The ADP routers were using this field directly as the sort key:

return scheduling_params.attention_dp_relax

This was previously mostly hidden for OpenAI serving because requests often had no py_scheduling_params, so the router took the safe path:

  if scheduling_params is None:
      return True

After a31d650 / #11173, OpenAI chat and harmony paths create SchedulingParams(agent_hierarchy=...) and pass it into generation. That means scheduling_params is present, but attention_dp_relax remains None. With multiple such requests, Python sorting can compare None values and fail with:
TypeError: '<' not supported between instances of 'NoneType' and 'NoneType'

Fix:
Treat only an explicit False as strict and keep None aligned with the existing missing-scheduling-params behavior. Apply the same logic to both DefaultADPRouter and KVCacheAwareADPRouter, with regression coverage for both paths.

Test Coverage

Added regression coverage for both ADP router implementations where one request has attention_dp_relax=None and another has attention_dp_relax=False.

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • If PR introduces API changes, an appropriate PR label is added - either api-compatible or api-breaking. For api-breaking, include BREAKING in the PR title.

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

OpenAI requests can carry a SchedulingParams object while leaving attention_dp_relax unset. The ADP routers used that field directly as the sort key, which makes mixed None and False requests fail when Python tries to compare None with bool values.

Treat only an explicit False as strict and keep None aligned with the existing missing-scheduling-params behavior. Apply the same logic to both DefaultADPRouter and KVCacheAwareADPRouter, with regression coverage for both paths.

Signed-off-by: peihengh <[email protected]>
@peihu-nv peihu-nv requested a review from a team as a code owner May 18, 2026 21:26
@peihu-nv peihu-nv requested a review from joyang-nv May 18, 2026 21:26
@peihu-nv
Copy link
Copy Markdown
Collaborator Author

/bot run

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 18, 2026

📝 Walkthrough

Walkthrough

The PR normalizes the attention_dp_relax scheduling parameter to boolean across two router scheduler classes. Both DefaultADPRouter and KVCacheAwareADPRouter now treat None and any non-False value as "relaxed" for request ordering. New unit tests verify this behavior under capacity constraints.

Changes

Attention DP Relax Boolean Normalization

Layer / File(s) Summary
Relax-value boolean normalization in router schedulers
tensorrt_llm/_torch/pyexecutor/scheduler/adp_router.py
DefaultADPRouter and KVCacheAwareADPRouter both update their get_relax_value methods to return attention_dp_relax is not False instead of the raw value, normalizing the relax flag to boolean.
Test coverage for None relax-value behavior
tests/unittest/_torch/executor/test_adp_router.py
New test cases verify that attention_dp_relax=None is treated as relaxed (non-strict) when scheduling under capacity constraints, for both DefaultADPRouter and KVCacheAwareADPRouter.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and concisely identifies the main change: handling unset (None) attention_dp_relax in ADP routers, which is the core fix in this PR.
Description check ✅ Passed The PR description is comprehensive and complete. It explains the issue, root cause, solution, and test coverage. All required template sections are filled out and the checklist is confirmed.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
tests/unittest/_torch/executor/test_adp_router.py (1)

206-221: QA list updates are unnecessary for this PR.

This change is unit-scope regression coverage plus router logic; no integration/perf QA test-list entry updates are needed.

As per coding guidelines, “If the PR only touches unittest/ or narrow unit scope, say explicitly whether QA list updates are unnecessary or optional.”

Also applies to: 945-956

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/unittest/_torch/executor/test_adp_router.py` around lines 206 - 221,
Add an explicit note to the PR description and commit message stating that QA
list updates are unnecessary because this change only touches unit tests (e.g.,
the test_none_attention_dp_relax_is_relaxed test) and router logic in
DefaultADPRouter/RankState/_make_request_item/route_requests; also apply the
same explicit note to other similar test-only changes in this PR so reviewers
know QA list edits are not required.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@tests/unittest/_torch/executor/test_adp_router.py`:
- Around line 206-221: Add an explicit note to the PR description and commit
message stating that QA list updates are unnecessary because this change only
touches unit tests (e.g., the test_none_attention_dp_relax_is_relaxed test) and
router logic in DefaultADPRouter/RankState/_make_request_item/route_requests;
also apply the same explicit note to other similar test-only changes in this PR
so reviewers know QA list edits are not required.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 7634e2e2-905e-4070-ad26-e1a91c0b4aa7

📥 Commits

Reviewing files that changed from the base of the PR and between 355ba94 and 4fca6c1.

📒 Files selected for processing (2)
  • tensorrt_llm/_torch/pyexecutor/scheduler/adp_router.py
  • tests/unittest/_torch/executor/test_adp_router.py

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #48982 [ run ] triggered by Bot. Commit: 4fca6c1 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #48982 [ run ] completed with state SUCCESS. Commit: 4fca6c1
/LLM/main/L0_MergeRequest_PR pipeline #38727 completed with status: 'SUCCESS'

CI Report

Link to invocation

Comment thread tensorrt_llm/_torch/pyexecutor/scheduler/adp_router.py Outdated
Make the relaxed ADP routing default explicit on SchedulingParams instead of encoding it in router sort-key logic. This keeps requests that only carry agent hierarchy scheduling params aligned with the existing missing-scheduling-params behavior.

Revert the routers to direct attention_dp_relax field reads and update the regression tests to cover the explicit SchedulingParams default for both DefaultADPRouter and KVCacheAwareADPRouter.

Signed-off-by: peihengh <[email protected]>
@peihu-nv
Copy link
Copy Markdown
Collaborator Author

/bot run

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #49242 [ run ] triggered by Bot. Commit: 27a14c4 Link to invocation

@tensorrt-cicd
Copy link
Copy Markdown
Collaborator

PR_Github #49242 [ run ] completed with state SUCCESS. Commit: 27a14c4
/LLM/main/L0_MergeRequest_PR pipeline #38913 completed with status: 'SUCCESS'

CI Report

Link to invocation

@peihu-nv peihu-nv merged commit 3de344d into NVIDIA:main May 20, 2026
7 checks passed
lancelly added a commit that referenced this pull request May 22, 2026
… ADP routers (#14431)

Signed-off-by: peihengh <[email protected]>
Signed-off-by: Lance Liao <[email protected]>
Co-authored-by: peihengh <[email protected]>
xxi-nv pushed a commit to xxi-nv/TensorRT-LLM that referenced this pull request May 22, 2026
bmarimuthu-nv pushed a commit to nv-auto-deploy/TensorRT-LLM that referenced this pull request May 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants