[https://nvbugs/6115832][fix] Fix aiohttp 3.13 streaming ValueError in benchmark client #13952

Merged
chenfeiz0326 merged 6 commits into NVIDIA:main from chenfeiz0326:chenfeiz/fix-one-failed-request on May 12, 2026

@chenfeiz0326
Collaborator

@chenfeiz0326 chenfeiz0326 commented May 10, 2026

Summary

Replace `async for chunk in response_content:` with `async for chunk in response_content.iter_any():` in `_iter_sse_data()` to avoid aiohttp 3.13.3's 128KB `readuntil()` buffer limit, which raises `ValueError("Chunk too big")` on large SSE payloads.

Root Cause

aiohttp 3.13+ added a hard 128KB limit to `StreamReader.readuntil(b'\n')`, which is used internally when iterating `response.content` line by line. Long-context models (Kimi-K2.5, DeepSeek-V3.2) produce SSE data lines exceeding 128KB, triggering the error.
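
The failure mode can be illustrated without aiohttp. The sketch below is an analogy only: it uses the standard library's `asyncio.StreamReader`, which is a different class from aiohttp's `StreamReader` but also enforces a size limit on delimiter-based reads. The helper names and the 200KB payload are assumptions made up for the illustration.

```python
import asyncio

LIMIT = 128 * 1024  # mirrors the 128KB figure quoted above


async def read_line_with_limit(payload: bytes, limit: int = LIMIT) -> bytes:
    """Delimiter-based read: fails when the line is longer than the limit."""
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(payload)
    reader.feed_eof()
    # asyncio's readline() raises ValueError when the newline lies past the limit.
    return await reader.readline()


async def read_without_delimiter(payload: bytes, limit: int = LIMIT) -> bytes:
    """Size-based reads: no delimiter search, so line length does not matter."""
    reader = asyncio.StreamReader(limit=limit)
    reader.feed_data(payload)
    reader.feed_eof()
    parts = []
    while True:
        chunk = await reader.read(64 * 1024)
        if not chunk:  # empty bytes means EOF
            break
        parts.append(chunk)
    return b"".join(parts)


def demo() -> tuple:
    # Hypothetical oversized SSE line (about 200KB), larger than LIMIT.
    big_line = b"data: " + b"x" * (200 * 1024) + b"\n"
    try:
        asyncio.run(read_line_with_limit(big_line))
        line_read_ok = True
    except ValueError:
        line_read_ok = False
    raw = asyncio.run(read_without_delimiter(big_line))
    return line_read_ok, len(raw) == len(big_line)
```

In this stand-in, the line-oriented read rejects the oversized line while the size-based read returns every byte, which is the same trade the PR makes by moving away from line-by-line iteration.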

Fix

`iter_any()` calls `readany()`, which has no size limit and returns bytes as they arrive. The existing `_iter_sse_data` buffering logic correctly reassembles arbitrary byte chunks into complete newline-delimited SSE lines before JSON parsing.
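
A minimal sketch of this kind of buffering, assuming a parser that only cares about `data:` lines. This is not the repository's `_iter_sse_data`; `iter_sse_data`, `fake_iter_any`, and `collect` are hypothetical names, and `fake_iter_any` stands in for `response.content.iter_any()` so the example runs without a server.

```python
import asyncio
import json
from typing import AsyncIterator, List


async def iter_sse_data(chunks: AsyncIterator[bytes]) -> AsyncIterator[str]:
    """Yield the payload of each `data:` line from arbitrarily split chunks."""
    buffer = bytearray()
    async for chunk in chunks:
        buffer.extend(chunk)
        # Emit only complete lines; a trailing partial line stays buffered.
        while (newline := buffer.find(b"\n")) != -1:
            line = bytes(buffer[:newline]).decode("utf-8").strip()
            del buffer[: newline + 1]
            if line.startswith("data:"):
                yield line[len("data:"):].strip()
    # Flush a final line that arrived without a terminating newline.
    tail = bytes(buffer).decode("utf-8").strip()
    if tail.startswith("data:"):
        yield tail[len("data:"):].strip()


async def fake_iter_any(payload: bytes, chunk_size: int) -> AsyncIterator[bytes]:
    """Hypothetical chunk source: slices the payload at arbitrary boundaries."""
    for start in range(0, len(payload), chunk_size):
        yield payload[start:start + chunk_size]


async def collect(payload: bytes, chunk_size: int) -> List[dict]:
    """Parse every JSON event, skipping the conventional [DONE] sentinel."""
    return [
        json.loads(data)
        async for data in iter_sse_data(fake_iter_any(payload, chunk_size))
        if data != "[DONE]"
    ]
```

Because lines are decoded only after a full newline-terminated line is in the buffer, chunk boundaries that fall in the middle of a JSON document or a multi-byte character do not affect the result. With a real aiohttp response, the chunk source would be `response.content.iter_any()`.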

Validation

  • kimi-1k1k (GB200, Lyris): 8/8 runs, 0 failed requests (163,840 total)
  • kimi-8k1k (GB200, Lyris): 3/3 runs, 0 failed requests
  • deepseek-v32-32k4k (GB200, Lyris): 3/3 runs, 0 failed requests
  • kimi-k25-thinking (B200, Prenyx): 1/1 run, 0 failed requests

PR Checklist

Please review the following before submitting your PR:

  • PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.

  • PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.

  • Test cases are provided for new code paths (see test instructions)

  • Any new dependencies have been scanned for license and vulnerabilities

  • CODEOWNERS updated if ownership changes

  • Documentation updated as needed

  • Update tava architecture diagram if there is a significant design change in PR.

  • The reviewers assigned automatically/manually are appropriate for the PR.

  • Please check this after reviewing the above items as appropriate for this PR.

GitHub Bot Help

To see a list of available CI bot commands, please comment /bot help.

@coderabbitai
Contributor

coderabbitai Bot commented May 10, 2026

📝 Walkthrough

The PR updates SSE stream iteration to use iter_any() for consuming response chunks, and strengthens performance test failure detection by raising immediately when any requests fail instead of applying tolerance thresholds.

Changes

Streaming Parser and Performance Validation

| Layer / File(s) | Summary |
|---|---|
| SSE Stream Parsing: `tensorrt_llm/serve/scripts/backend_request_func.py` | SSE byte-stream parser `_iter_sse_data` changed to iterate chunks via `response_content.iter_any()` instead of direct iteration; buffering and payload extraction logic unchanged. |
| Performance Test Validation: `tests/integration/defs/perf/test_perf_sanity.py` | Perf sanity error handling now raises immediately when `Failed requests > 0` instead of comparing against a tolerance threshold; error messages simplified to report failed/successful/total counts without threshold text; minor adjustment to `ServerConfig.to_match_keys()` matching logic. |
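
The stricter check described in the second row could look roughly like the sketch below. It is illustrative only and not the code in `test_perf_sanity.py`: the function name, the exception type, the message format, and the assumption that total equals failed plus successful are all hypothetical.

```python
def check_failed_requests(failed: int, successful: int) -> None:
    """Raise as soon as any request failed; no tolerance threshold is applied."""
    total = failed + successful  # assumption: these two counts cover every request
    if failed > 0:
        raise RuntimeError(
            "Benchmark has failed requests: "
            f"failed={failed}, successful={successful}, total={total}"
        )
```

Under this rule a single failed request fails the run, so an error such as the streaming `ValueError` surfaces in CI instead of being absorbed by a threshold.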

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly identifies the main fix: replacing streaming iteration in aiohttp 3.13 to resolve ValueError issues in the benchmark client, directly matching the changeset's primary modification. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 80.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description check | ✅ Passed | The PR description provides a clear summary of the changes, root cause analysis, the fix applied, and validation results across multiple benchmarks. |

Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
tensorrt_llm/serve/scripts/backend_request_func.py (1)

1-3: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add the required NVIDIA copyright header for this modified Python file.

This file was modified, but it does not include the required NVIDIA copyright header with the current modification year (2026).

As per coding guidelines "Include NVIDIA copyright header on all new files; update year on modified files" and "All C++, Python, and other source files must contain NVIDIA copyright header with current modification year".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tensorrt_llm/serve/scripts/backend_request_func.py` around lines 1 - 3, This
file tensorrt_llm/serve/scripts/backend_request_func.py is missing the required
NVIDIA copyright header for modified files; add the official NVIDIA copyright
header at the top of backend_request_func.py and update the modification year to
2026, ensuring the header follows the project’s standard format for Python
source files and appears before any code or comments (preserve the existing
Adopted from comment below the header).
tests/integration/defs/perf/test_perf_sanity.py (1)

1-1: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the NVIDIA copyright year for this modified file.

Line 1 still ends at 2025, but this file is modified in this PR and should include 2026.

As per coding guidelines: "Include NVIDIA copyright header on all new files; update year on modified files" and "All C++, Python, and other source files must contain NVIDIA copyright header with current modification year".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/integration/defs/perf/test_perf_sanity.py` at line 1, Update the
copyright header year in the file by changing the top SPDX header line that
currently reads "Copyright (c) 2022-2025 NVIDIA CORPORATION & AFFILIATES. All
rights reserved." to include 2026 (e.g., "2022-2026") so the file's
modified-year matches the current PR; ensure the first line/header in
tests/integration/defs/perf/test_perf_sanity.py is updated accordingly.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 00d2b128-a6e3-426f-a799-516fc44b9d55

📥 Commits

Reviewing files that changed from the base of the PR and between a31d650 and 94e7634.

📒 Files selected for processing (2)
  • tensorrt_llm/serve/scripts/backend_request_func.py
  • tests/integration/defs/perf/test_perf_sanity.py

@chenfeiz0326 chenfeiz0326 requested a review from ruodil May 10, 2026 05:07
@chenfeiz0326
Collaborator Author

/bot run --disable-fail-fast --stage-list "*GB200*PerfSanity*"

@chenfeiz0326 chenfeiz0326 requested a review from longlee0622 May 10, 2026 05:10
@tensorrt-cicd
Collaborator

PR_Github #47560 [ run ] triggered by Bot. Commit: 0b8abb1 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47560 [ run ] completed with state FAILURE. Commit: 0b8abb1

Link to invocation

@chenfeiz0326
Collaborator Author

/bot run --disable-fail-fast --stage-list "*GB200*PerfSanity*"

@tensorrt-cicd
Collaborator

PR_Github #47582 [ run ] triggered by Bot. Commit: 0b8abb1 Link to invocation

@chenfeiz0326 chenfeiz0326 force-pushed the chenfeiz/fix-one-failed-request branch from 0b8abb1 to 9bd46fe Compare May 10, 2026 12:38
@chenfeiz0326
Collaborator Author

/bot run --disable-fail-fast --stage-list "*GB200*PerfSanity*"

@tensorrt-cicd
Collaborator

PR_Github #47601 [ run ] triggered by Bot. Commit: 9bd46fe Link to invocation

@chenfeiz0326 chenfeiz0326 requested a review from Superjomn May 11, 2026 02:54
Collaborator

@Superjomn Superjomn left a comment

LGTM

Signed-off-by: Chenfei Zhang <[email protected]>
@chenfeiz0326 chenfeiz0326 force-pushed the chenfeiz/fix-one-failed-request branch from 9bd46fe to f65ecb8 Compare May 11, 2026 03:30
Signed-off-by: Chenfei Zhang <[email protected]>
@chenfeiz0326
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #47654 [ run ] triggered by Bot. Commit: 43a1ae9 Link to invocation

@chenfeiz0326 chenfeiz0326 enabled auto-merge (squash) May 11, 2026 13:45
@tensorrt-cicd
Collaborator

PR_Github #47654 [ run ] completed with state SUCCESS. Commit: 43a1ae9
/LLM/main/L0_MergeRequest_PR pipeline #37558 completed with status: 'FAILURE'

CI Report

⚠️ Action Required:

  • Please check the failed tests and fix your PR
  • If you cannot view the failures, ask the CI triggerer to share details
  • Once fixed, request an NVIDIA team member to trigger CI again

CI Agent Failure Analysis

Link to invocation

@chenfeiz0326
Collaborator Author

/bot run --disable-fail-fast

@tensorrt-cicd
Collaborator

PR_Github #47837 [ run ] triggered by Bot. Commit: 43a1ae9 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47837 [ run ] completed with state SUCCESS. Commit: 43a1ae9
/LLM/main/L0_MergeRequest_PR pipeline #37719 completed with status: 'SUCCESS'

CI Report

Link to invocation

@chenfeiz0326
Collaborator Author

/bot skip --comment "Pre-merge has passed. Only resolve waives.txt conflict, no need to run the whole CI pipeline again"

@tensorrt-cicd
Collaborator

PR_Github #47901 [ skip ] triggered by Bot. Commit: c770ec8 Link to invocation

@chenfeiz0326
Collaborator Author

/bot skip --comment "Pre-merge has passed. Only resolve waives.txt conflict, no need to run the whole CI pipeline again"

@tensorrt-cicd
Collaborator

PR_Github #47901 [ skip ] completed with state SUCCESS. Commit: c770ec8
Skipping testing for commit c770ec8

Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47908 [ skip ] triggered by Bot. Commit: b8f5d16 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #47908 [ skip ] completed with state SUCCESS. Commit: b8f5d16
Skipping testing for commit b8f5d16

Link to invocation

@chenfeiz0326 chenfeiz0326 merged commit dd64912 into NVIDIA:main May 12, 2026
6 checks passed
yufeiwu-nv pushed a commit to yufeiwu-nv/TensorRT-LLM that referenced this pull request May 19, 2026
6 participants