[None][chore] Better Empty File Error Handling for trtllm-bench#12552
Conversation
📝 WalkthroughWalkthroughA validation check was added to the Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~5 minutes 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tensorrt_llm/bench/utils/data.py (1)
1-1:⚠️ Potential issue | 🟠 MajorAdd/update the NVIDIA source header in this modified Python file.
This file was modified, but there is no visible NVIDIA copyright header at the top. Please add/update the standard project header with the latest meaningful modification year.
As per coding guidelines "
**/*.{cpp,cc,cxx,h,hpp,py}: Add NVIDIA copyright header on ALL new files, and update year on modified files."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/bench/utils/data.py` at line 1, This file is missing the required NVIDIA source header; update the top of tensorrt_llm.bench.utils.data (module data.py) by inserting the project's standard NVIDIA copyright/header block (matching other Python files in the repo) and ensure the copyright year is updated to the latest meaningful modification year; place the header before any imports (above the existing "import json") and follow the same formatting used in other .py files in the project.
🧹 Nitpick comments (1)
tensorrt_llm/bench/utils/data.py (1)
124-130: Add a regression test for the empty-stream path.Please add a unit/integration test that passes an empty dataset stream and asserts the failure path/message introduced at Lines 125-130.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tensorrt_llm/bench/utils/data.py` around lines 124 - 130, Add a regression test that simulates an empty dataset stream and asserts the ValueError path when no prompts are read: import the reader function from tensorrt_llm.bench.utils.data (the function that consumes a dataset stream and populates the local variable prompts), call it with an empty io.StringIO or empty bytes stream, and use pytest.raises(ValueError, match="No data was read from the dataset stream") to verify the exact failure message produced by the len(prompts) == 0 branch; ensure the test name clearly indicates it covers the empty-stream/error path and runs as part of the test suite.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tensorrt_llm/bench/utils/data.py`:
- Around line 124-130: The new ValueError raised when no prompts are read from
create_dataset_from_stream will surface as a raw traceback because callers in
benchmark modules don't catch it; replace this direct ValueError with a
specific, catchable exception (e.g., create a DatasetLoadError or
DatasetFormatError class) and raise that instead from
create_dataset_from_stream, then update the benchmark entry points that call
create_dataset_from_stream to catch DatasetLoadError and translate it to the
CLI-friendly error handling already used (or move the create_dataset_from_stream
call inside the existing try/except blocks); reference
create_dataset_from_stream and the new DatasetLoadError/DatasetFormatError so
the caller changes can locate and handle the error type.
---
Outside diff comments:
In `@tensorrt_llm/bench/utils/data.py`:
- Line 1: This file is missing the required NVIDIA source header; update the top
of tensorrt_llm.bench.utils.data (module data.py) by inserting the project's
standard NVIDIA copyright/header block (matching other Python files in the repo)
and ensure the copyright year is updated to the latest meaningful modification
year; place the header before any imports (above the existing "import json") and
follow the same formatting used in other .py files in the project.
---
Nitpick comments:
In `@tensorrt_llm/bench/utils/data.py`:
- Around line 124-130: Add a regression test that simulates an empty dataset
stream and asserts the ValueError path when no prompts are read: import the
reader function from tensorrt_llm.bench.utils.data (the function that consumes a
dataset stream and populates the local variable prompts), call it with an empty
io.StringIO or empty bytes stream, and use pytest.raises(ValueError, match="No
data was read from the dataset stream") to verify the exact failure message
produced by the len(prompts) == 0 branch; ensure the test name clearly indicates
it covers the empty-stream/error path and runs as part of the test suite.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2cf7ab9f-fbd5-4f65-a126-7af8d963ef41
📒 Files selected for processing (1)
tensorrt_llm/bench/utils/data.py
|
/bot run |
|
PR_Github #40796 [ run ] triggered by Bot. Commit: |
|
PR_Github #40796 [ run ] completed with state
|
|
/bot run |
|
PR_Github #41014 [ run ] triggered by Bot. Commit: |
|
PR_Github #41014 [ run ] completed with state
|
|
/bot run |
|
PR_Github #41038 [ run ] triggered by Bot. Commit: |
ccfeccc to
1402d60
Compare
|
PR_Github #41038 [ run ] completed with state
|
1402d60 to
93e0248
Compare
|
/bot run |
|
PR_Github #41231 [ run ] triggered by Bot. Commit: |
|
PR_Github #41231 [ run ] completed with state |
93e0248 to
150268d
Compare
|
/bot run |
|
PR_Github #41479 [ run ] triggered by Bot. Commit: |
|
PR_Github #41479 [ run ] completed with state
|
b7d4ba7 to
4376d79
Compare
|
/bot run |
|
PR_Github #41518 [ run ] triggered by Bot. Commit: |
|
PR_Github #41518 [ run ] completed with state
|
|
/bot run |
307609a to
8df6174
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #43820 [ run ] triggered by Bot. Commit: |
|
PR_Github #43820 [ run ] completed with state
|
8df6174 to
3f04260
Compare
|
/bot run |
|
PR_Github #44775 [ run ] triggered by Bot. Commit: |
3f04260 to
7c35b45
Compare
|
/bot run |
|
PR_Github #44782 [ run ] triggered by Bot. Commit: |
|
PR_Github #44782 [ run ] completed with state
|
|
/bot run |
Signed-off-by: Yijing Li <[email protected]>
…ror handling - Add NVIDIA copyright header to bench/utils/data.py - Replace ValueError with DatasetFormatError (subclass of ValueError) for clearer, catchable exception semantics - Wrap create_dataset_from_stream calls in low_latency.py and throughput.py to catch DatasetFormatError and re-raise as click.UsageError for clean CLI error output instead of raw tracebacks - Add regression test for the empty-stream path in test_bench_data.py Signed-off-by: Yijing Li <[email protected]> Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: Yijing Li <[email protected]> Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: Yijing Li <[email protected]>
Add NVIDIA Apache-2.0 license headers to files modified in this PR that were missing them, to fix the OSS vulnerability scan. Signed-off-by: Yijing Li <[email protected]>
7c35b45 to
553c302
Compare
|
PR_Github #44823 [ run ] triggered by Bot. Commit: |
|
/bot run |
|
PR_Github #44827 [ run ] triggered by Bot. Commit: |
|
PR_Github #44827 [ run ] completed with state
|
|
/bot run |
|
PR_Github #44992 [ run ] triggered by Bot. Commit: |
|
PR_Github #44992 [ run ] completed with state
|
|
/bot run |
|
PR_Github #45038 [ run ] triggered by Bot. Commit: |
|
PR_Github #45038 [ run ] completed with state |
…IA#12552) Signed-off-by: Yijing Li <[email protected]> Co-authored-by: Claude Sonnet 4.6 <[email protected]>
Description
If the input dataset file happens to be empty due to process killed, trtllm-bench will throw a seemingly confusing error. To help user debug, add an early empty dataset check and throw a meaningful error in case the input dataset is empty.
Edit:
During CI debugging, Claude found a unit test failure at
tests/unittest/bindings/test_transfer_agent_bindings.pyand fixed the test by using the new agent binding interface. For simplicity, the fix is incorporated into this PR as well.Test Coverage
No new test added.
PR Checklist
Please review the following before submitting your PR:
PR description clearly explains what and why. If using CodeRabbit's summary, please make sure it makes sense.
PR Follows TRT-LLM CODING GUIDELINES to the best of your knowledge.
Test cases are provided for new code paths (see test instructions)
Any new dependencies have been scanned for license and vulnerabilities
CODEOWNERS updated if ownership changes
Documentation updated as needed
Update tava architecture diagram if there is a significant design change in PR.
The reviewers assigned automatically/manually are appropriate for the PR.
Please check this after reviewing the above items as appropriate for this PR.
GitHub Bot Help
To see a list of available CI bot commands, please comment
/bot help.Summary by CodeRabbit
Release Notes