[None][doc] fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11 #12338
📝 Walkthrough
Documentation updates across six blog posts standardizing CLI configuration handling and correcting references.
Actionable comments posted: 6
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md (2)
624-636: ⚠️ Potential issue | 🔴 Critical
CRITICAL: Coding guideline violation - prefer `--config` over `--extra_llm_api_options`.
This EPLB benchmark command also uses `--extra_llm_api_options` with `trtllm-bench`, violating the same coding guideline.
As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like `trtllm-serve`, `trtllm-bench`, `trtllm-eval`, prefer using `--config` over `--extra_llm_api_options` for specifying configuration files."
📋 Proposed fix to align with coding guidelines

    -cat > ./extra_llm_api_options_eplb.yaml <<EOF
    +cat > ./config_eplb.yaml <<EOF
     enable_attention_dp: true
     moe_config:
       load_balancer: ./moe_load_balancer.yaml
     EOF
     trtllm-llmapi-launch \
     trtllm-bench --model ${MODEL_NAME} \
         --model_path ${MODEL_PATH} \
         throughput \
         --tp 36 \
         --ep 36 \
    -    --extra_llm_api_options ./extra_llm_api_options_eplb.yaml \
    +    --config ./config_eplb.yaml \
         --kv_cache_free_gpu_mem_fraction 0.75 \
         --backend pytorch \
         --dataset ./dataset.json \
         --warmup 0 \
         --eos_id -1

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md` around lines 624 - 636, Replace the use of the deprecated flag in the documented benchmark invocation: in the trtllm-bench command that currently passes --extra_llm_api_options ./extra_llm_api_options_eplb.yaml, change it to use --config ./extra_llm_api_options_eplb.yaml so the invocation follows the coding guideline; update the surrounding example lines where trtllm-bench and the extra_llm_api_options_eplb.yaml token appear to reflect the --config flag consistently.
544-554: ⚠️ Potential issue | 🔴 Critical
CRITICAL: Coding guideline violation - prefer `--config` over `--extra_llm_api_options`.
This baseline benchmark command uses `--extra_llm_api_options` with `trtllm-bench`, violating the coding guidelines.
As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like `trtllm-serve`, `trtllm-bench`, `trtllm-eval`, prefer using `--config` over `--extra_llm_api_options` for specifying configuration files."
📋 Proposed fix to align with coding guidelines

    -cat > ./extra_llm_api_options.yaml <<EOF
    +cat > ./config.yaml <<EOF
     enable_attention_dp: true
     EOF
     trtllm-llmapi-launch \
     trtllm-bench --model ${MODEL_NAME} \
         --model_path ${MODEL_PATH} \
         throughput \
         --tp 32 \
         --ep 32 \
    -    --extra_llm_api_options ./extra_llm_api_options.yaml \
    +    --config ./config.yaml \
         --kv_cache_free_gpu_mem_fraction 0.75 \
         --backend pytorch \
         --dataset ./dataset.json \
         --warmup 0 \
         --eos_id -1

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md` around lines 544 - 554, The CLI example uses the deprecated flag --extra_llm_api_options with trtllm-bench; change it to use the canonical --config flag and update the referenced YAML filename accordingly (e.g., create a config file like extra_llm_api_options.yaml or rename it to extra_llm_config.yaml) and pass it to trtllm-bench as --config <file>, ensuring the surrounding invocation (trtllm-llmapi-launch and trtllm-bench --model ${MODEL_NAME} --model_path ${MODEL_PATH} throughput --tp 32 --ep 32) remains unchanged except for replacing --extra_llm_api_options ./extra_llm_api_options.yaml with --config ./<updated_filename>.yaml so the example conforms to the coding guideline.
docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md (2)
125-145: ⚠️ Potential issue | 🔴 Critical
CRITICAL: Coding guideline violation - prefer `--config` over `--extra_llm_api_options`.
This section uses `--extra_llm_api_options` with `trtllm-bench`, but the repository coding guidelines explicitly state: "When documenting CLI commands for TensorRT-LLM tools like `trtllm-serve`, `trtllm-bench`, `trtllm-eval`, prefer using `--config` over `--extra_llm_api_options` for specifying configuration files."
The YAML filename and flag should be reverted to use `--config` and `config.yml` respectively.
As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like `trtllm-serve`, `trtllm-bench`, `trtllm-eval`, prefer using `--config` over `--extra_llm_api_options` for specifying configuration files."
📋 Proposed fix to align with coding guidelines

    -cat >./extra-llm-api-config.yml<<EOF
    +cat >./config.yml<<EOF
     cuda_graph_config: {}
     moe_config:
       backend: TRTLLM
     speculative_config:
       decoding_type: MTP
       num_nextn_predict_layers: 3
     EOF
     export TRTLLM_ENABLE_PDL=1
     trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
         throughput \
         --dataset $YOUR_DATA_PATH \
         --backend pytorch \
         --num_requests 10 \
         --concurrency 1 \
         --max_batch_size 1 \
         --tp 8 \
         --ep 2 \
    -    --extra_llm_api_options ./extra-llm-api-config.yml
    +    --config ./config.yml

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md` around lines 125 - 145, The CLI example uses the deprecated flag and filename; replace the --extra_llm_api_options flag and extra-llm-api-config.yml with the canonical --config flag and config.yml: create the YAML as config.yml (previously extra-llm-api-config.yml) and update the trtllm-bench invocation to use --config ./config.yml; locate the trtllm-bench command and the mention of --extra_llm_api_options in the file and change those symbols accordingly.
181-204: ⚠️ Potential issue | 🔴 Critical
CRITICAL: Coding guideline violation - prefer `--config` over `--extra_llm_api_options`.
This section also uses `--extra_llm_api_options` with `trtllm-bench`, violating the same coding guideline as the previous section.
As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like `trtllm-serve`, `trtllm-bench`, `trtllm-eval`, prefer using `--config` over `--extra_llm_api_options` for specifying configuration files."
📋 Proposed fix to align with coding guidelines

    -cat >./extra-llm-api-config.yml<<EOF
    +cat >./config.yml<<EOF
     cuda_graph_config: {}
     moe_config:
       backend: TRTLLM
     speculative_config:
       decoding_type: MTP
       num_nextn_predict_layers: 3
       use_relaxed_acceptance_for_thinking: true
       relaxed_topk: 10
       relaxed_delta: 0.6
     EOF
     export TRTLLM_ENABLE_PDL=1
     trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
         throughput \
         --dataset $YOUR_DATA_PATH \
         --backend pytorch \
         --num_requests 10 \
         --concurrency 1 \
         --max_batch_size 1 \
         --tp 8 \
         --ep 2 \
    -    --extra_llm_api_options ./extra-llm-api-config.yml
    +    --config ./config.yml

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md` around lines 181 - 204, Replace the --extra_llm_api_options usage with the canonical --config flag in the trtllm-bench invocation: keep the same config file (extra-llm-api-config.yml) and environment setup (TRTLLM_ENABLE_PDL) and change the command argument from --extra_llm_api_options ./extra-llm-api-config.yml to --config ./extra-llm-api-config.yml so the trtllm-bench call (the trtllm-bench throughput command) follows the project CLI guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md`:
- Line 109: The documented trtllm-serve invocation uses the discouraged flag
`--extra_llm_api_options`; update the command to use the preferred `--config`
flag instead (e.g., replace `--extra_llm_api_options
/config/models/eagle/eagle.yaml` with `--config
/config/models/eagle/eagle.yaml`) while keeping the rest of the invocation
(TRTLLM_ENABLE_PDL, trtllm-serve, model path, host/port, batching and token
options, and trust_remote_code) unchanged so it follows the TensorRT-LLM CLI
coding guideline.
In
`@docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md`:
- Line 32: Update the GitHub repository reference in the blog text: replace the
URL and display name "NVIDIA/TensorRT-Model-Optimizer" with the canonical
"NVIDIA/Model-Optimizer" wherever it appears (the sentence referencing the Model
Optimizer in the DeepSeek checkpoint description) so the link points to and
displays the correct repository name.
In `@docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md`:
- Line 89: The trtllm-bench command in the doc uses the deprecated flag
--extra_llm_api_options; update the CLI example to use --config instead (i.e.,
replace --extra_llm_api_options low_latency.yaml with --config low_latency.yaml)
so the documented command for trtllm-bench conforms to the coding guideline;
ensure any mention of the old flag is removed and the file low_latency.yaml
remains the referenced config.
- Line 204: The documented CLI uses the disallowed flag --extra_llm_api_options
in the trtllm-serve example; change that invocation to use --config instead,
pointing it at the same YAML (max_throughput.yaml) so the command follows the
coding guideline; update any mention of --extra_llm_api_options in the
trtllm-serve example to --config and ensure the surrounding text still describes
loading the same configuration file.
- Line 152: The doc uses the deprecated flag --extra_llm_api_options in the
trtllm-bench max-throughput command; update the command sample to use --config
instead (replace the --extra_llm_api_options max_throughput.yaml occurrence with
--config max_throughput.yaml) so it follows the coding guideline for
TensorRT-LLM CLI tools (affecting the max-throughput trtllm-bench example in the
blog9_Deploying_GPT_OSS_on_TRTLLM.md file).
- Line 187: The command snippet uses the deprecated/forbidden flag
--extra_llm_api_options; update the documented trtllm-serve invocation to use
--config instead and point it to the same YAML file (e.g., replace the
--extra_llm_api_options max_throughput.yaml token with --config
max_throughput.yaml) so the documentation aligns with the coding guideline
preferring --config for TensorRT-LLM CLI configuration files.
📒 Files selected for processing (6)
docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md
docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md
docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md
docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md
docs/source/blogs/tech_blog/blog8_Scaling_Expert_Parallelism_in_TensorRT-LLM_part2.md
docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md
9adca18 to d8b143b
/bot skip --comment "Doc change"
PR_Github #39639 [ skip ] triggered by Bot. Commit:
PR_Github #39639 [ skip ] completed with state
Automated review of tech blog accuracy against current codebase found the following issues, now corrected:
blog2 (MTP):
- Fix CLI flag: --spec_decode_nextn -> --spec_decode_max_draft_len (x2)
- Fix config file flag: --config -> --extra_llm_api_options (x2)
- Rename config file examples to extra-llm-api-config.yml for clarity
blog3 (Throughput):
- Fix broken API link: pyexecutor/config.py -> llmapi/llm_args.py#L102 (CudaGraphConfig moved to llm_args.py)
- Fix NVIDIA Model Optimizer GitHub URL
blog4 (WideEP Part 1):
- Fix two broken feat/large-ep branch links -> main branch
- Fix all ep_load_balancer/ script paths -> wide_ep/ep_load_balancer/
- Fix --config flag -> --extra_llm_api_options in benchmark commands
blog8 (WideEP Part 2):
- Note that examples/wide_ep/slurm_scripts is not yet public
- Fix perf-analysis doc path
blog9 (GPT-OSS):
- Fix container image tag text to match actual docker command (1.1.0rc1)
- Fix --config flag -> --extra_llm_api_options throughout
blog11 (GPT-OSS Eagle3):
- Replace pinned RC image tag with NGC catalog reference
- Fix use_torch_sampler -> sampler_type: TorchSampler (current API)
- Fix decoding_type: Eagle3 -> decoding_type: Eagle
- Fix --config -> --extra_llm_api_options in serve command
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>
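A check for the renames listed in this commit message could look roughly like the sketch below. It is only an illustration: `find_stale_refs` is a hypothetical helper name, the pattern list is copied from the bullets above, and nothing here is a script from the TensorRT-LLM repository.

```shell
# Sketch: report Markdown lines that still mention identifiers the commit
# message above lists as outdated. find_stale_refs is a hypothetical helper,
# not a script from the TensorRT-LLM repository.
find_stale_refs() {
  # $1: directory to scan, e.g. docs/source/blogs/tech_blog
  for pattern in \
    '--spec_decode_nextn' \
    'use_torch_sampler' \
    'decoding_type: Eagle3' \
    'feat/large-ep'
  do
    # -r recurse, -n show line numbers, -F match fixed strings;
    # -e keeps patterns that start with "--" from being read as options.
    grep -rnF --include='*.md' -e "$pattern" "$1" || true
  done
}
```

Running `find_stale_refs docs/source/blogs/tech_blog` from a checkout would print `file:line:text` for each remaining hit and nothing when the blogs are clean. The `ep_load_balancer/` path rename is left out on purpose, because the new path contains the old one and a fixed-string match cannot tell them apart.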
--config is a valid alias for --extra_llm_api_options in trtllm-bench, trtllm-serve, and trtllm-eval. Both are accepted by the CLI and neither is deprecated. Revert the unnecessary alias substitutions introduced in the previous commit, keeping only the substantively correct fixes.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>
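To make the alias claim concrete, here is a minimal sketch of a wrapper-side parser that treats the two spellings as the same option. `config_path_from_args` is a hypothetical function written for illustration; it does not reproduce how the real CLI parses its arguments.

```shell
# Sketch: treat --config and --extra_llm_api_options as two spellings of one
# option. config_path_from_args is a hypothetical helper, not the CLI's own
# parser; it prints the path given to whichever spelling appears last.
config_path_from_args() {
  cfg_path=""
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --config|--extra_llm_api_options)
        # Consume the value only if one actually follows the flag.
        if [ "$#" -ge 2 ]; then
          cfg_path="$2"
          shift
        fi
        ;;
    esac
    shift
  done
  printf '%s\n' "$cfg_path"
}
```

Under this sketch, `config_path_from_args throughput --tp 8 --config ./config.yml` and the same call with `--extra_llm_api_options` print the same path, which is why swapping one spelling for the other in the blogs changes nothing functionally.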
NVIDIA/TensorRT-Model-Optimizer is a redirect; the canonical repository name is NVIDIA/Model-Optimizer. Restore the original URL from the blog.
Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>
3b371f5 to 970e9ab
Signed-off-by: Sharan Chetlur <[email protected]>
/bot skip --comment "doc change"
PR_Github #39771 [ skip ] triggered by Bot. Commit:
PR_Github #39771 [ skip ] completed with state
… 11 (NVIDIA#12338)
Signed-off-by: schetlur <[email protected]>
Signed-off-by: Sharan Chetlur <[email protected]>
Co-authored-by: Claude Sonnet 4.6 <[email protected]>