[None][doc] fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11 by schetlur-nv · Pull Request #12338 · NVIDIA/TensorRT-LLM

schetlur-nv · 2026-03-19T04:14:16Z

Automated review of tech blog accuracy against current codebase found the following issues, now corrected:

blog2 (MTP):

Fix CLI flag: --spec_decode_nextn -> --spec_decode_max_draft_len (x2)
Rename config file examples to extra-llm-api-config.yml for clarity

blog3 (Throughput):

Fix broken API link: pyexecutor/config.py -> llmapi/llm_args.py#L102 (CudaGraphConfig moved to llm_args.py)
Fix NVIDIA Model Optimizer GitHub URL

blog4 (WideEP Part 1):

Fix two broken feat/large-ep branch links -> main branch
Fix all ep_load_balancer/ script paths -> wide_ep/ep_load_balancer/

blog8 (WideEP Part 2):

Note that examples/wide_ep/slurm_scripts is not yet public
Fix perf-analysis doc path

blog9 (GPT-OSS):

Fix container image tag text to match actual docker command (1.1.0rc1)

blog11 (GPT-OSS Eagle3):

Replace pinned RC image tag with NGC catalog reference
Fix use_torch_sampler -> sampler_type: TorchSampler (current API)
Fix decoding_type: Eagle3 -> decoding_type: Eagle
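
The path and flag substitutions listed above lend themselves to a mechanical check. The following is a minimal sketch, not part of TensorRT-LLM: the names `OUTDATED_PATTERNS` and `find_outdated` are hypothetical, and the patterns only cover old/new spellings stated in this description.

```python
import re

# (pattern for the outdated spelling, replacement suggested in the PR description).
# This table and the helper below are illustrative assumptions, not project tooling.
OUTDATED_PATTERNS = [
    (re.compile(r"--spec_decode_nextn\b"), "--spec_decode_max_draft_len"),
    (re.compile(r"feat/large-ep\b"), "main"),
    # Match ep_load_balancer/ only when it is not already under wide_ep/.
    (re.compile(r"(?<!wide_ep/)ep_load_balancer/"), "wide_ep/ep_load_balancer/"),
    (re.compile(r"pyexecutor/config\.py"), "llmapi/llm_args.py"),
]


def find_outdated(text):
    """Return (line_number, matched_text, suggestion) for each outdated reference."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for pattern, suggestion in OUTDATED_PATTERNS:
            match = pattern.search(line)
            if match:
                findings.append((lineno, match.group(0), suggestion))
    return findings


# Placeholder input; the command name is made up for the example.
sample = "\n".join([
    "some_command --spec_decode_nextn 3",
    "cd examples/wide_ep/ep_load_balancer/",
    "cd examples/ep_load_balancer/",
    "see tree/feat/large-ep/examples",
])

for lineno, old, new in find_outdated(sample):
    print(f"line {lineno}: {old!r} -> {new!r}")
```

The negative lookbehind keeps already-migrated `wide_ep/ep_load_balancer/` paths from being flagged a second time.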

coderabbitai · 2026-03-19T04:18:44Z

📝 Walkthrough

Documentation updates across six blog posts standardizing CLI configuration handling and correcting references. Key changes include replacing --config with --extra_llm_api_options flags, updating container image tags and documentation links, renaming configuration files, and migrating script references from development to main branch paths.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| CLI Configuration Flag Migration: `blog11_GPT_OSS_Eagle3.md`, `blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md`, `blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`, `blog9_Deploying_GPT_OSS_on_TRTLLM.md` | Replaced `--config` with `--extra_llm_api_options` in server and benchmark commands; updated corresponding config file naming conventions (e.g., `./config.yml` → `./extra-llm-api-config.yml`). |
| Container Image & Release Updates: `blog11_GPT_OSS_Eagle3.md`, `blog9_Deploying_GPT_OSS_on_TRTLLM.md` | Updated TensorRT-LLM NGC container image tags and removed fixed version pinning; changed Eagle configuration from `Eagle3` to `Eagle` with updated sampler type notation. |
| Documentation & Script Path References: `blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md`, `blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`, `blog8_Scaling_Expert_Parallelism_in_TensorRT-LLM_part2.md`, `blog9_Deploying_GPT_OSS_on_TRTLLM.md` | Corrected documentation links, updated script paths from `feat/large-ep` branch to `main` branch (e.g., `examples/ep_load_balancer` → `examples/wide_ep/ep_load_balancer`), and fixed API reference documentation URLs. |
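
The two blog11 configuration renames in the summary above (`use_torch_sampler` → `sampler_type: TorchSampler` and `decoding_type: Eagle3` → `decoding_type: Eagle`) can be pictured as a small transformation on an already-parsed config. This is a sketch under assumptions: `migrate_eagle_config` is a hypothetical helper, `use_torch_sampler` is assumed to sit at the top level of the config, and only a truthy `use_torch_sampler` is mapped because the PR does not say what a false value should become.

```python
import copy


def migrate_eagle_config(config):
    """Return a copy of `config` with the two blog11 renames applied.

    Hypothetical helper for illustration; not part of TensorRT-LLM.
    """
    migrated = copy.deepcopy(config)

    # use_torch_sampler: true  ->  sampler_type: TorchSampler
    # The old key is dropped either way; a false value has no stated mapping.
    if migrated.pop("use_torch_sampler", False):
        migrated["sampler_type"] = "TorchSampler"

    # speculative_config.decoding_type: Eagle3  ->  Eagle
    spec = migrated.get("speculative_config")
    if isinstance(spec, dict) and spec.get("decoding_type") == "Eagle3":
        spec["decoding_type"] = "Eagle"

    return migrated


old_config = {
    "use_torch_sampler": True,
    "speculative_config": {"decoding_type": "Eagle3"},
}
new_config = migrate_eagle_config(old_config)
print(new_config)
```

Working on a deep copy leaves the original mapping untouched, which makes it easy to diff the old and new configs before writing anything back to disk.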

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing outdated code references in multiple tech blog documentation files. |
| Description check | ✅ Passed | The PR description clearly explains all changes made across multiple tech blogs with specific details about code references, API updates, and CLI flag corrections. |


coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)

docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md (2)
624-636: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This EPLB benchmark command also uses --extra_llm_api_options with trtllm-bench, violating the same coding guideline.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."
📋 Proposed fix to align with coding guidelines
-cat > ./extra_llm_api_options_eplb.yaml <<EOF
+cat > ./config_eplb.yaml <<EOF
 enable_attention_dp: true
 moe_config:
   load_balancer: ./moe_load_balancer.yaml
 EOF

 trtllm-llmapi-launch \
 trtllm-bench --model ${MODEL_NAME} \
     --model_path ${MODEL_PATH} \
     throughput \
     --tp 36 \
     --ep 36 \
-    --extra_llm_api_options ./extra_llm_api_options_eplb.yaml \
+    --config ./config_eplb.yaml \
     --kv_cache_free_gpu_mem_fraction 0.75 \
     --backend pytorch \
     --dataset ./dataset.json \
     --warmup 0 \
     --eos_id -1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`
around lines 624 - 636, Replace the use of the deprecated flag in the documented
benchmark invocation: in the trtllm-bench command that currently passes
--extra_llm_api_options ./extra_llm_api_options_eplb.yaml, change it to use
--config ./extra_llm_api_options_eplb.yaml so the invocation follows the coding
guideline; update the surrounding example lines where trtllm-bench and the
extra_llm_api_options_eplb.yaml token appear to reflect the --config flag
consistently.
544-554: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This baseline benchmark command uses --extra_llm_api_options with trtllm-bench, violating the coding guidelines.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."
📋 Proposed fix to align with coding guidelines
-cat > ./extra_llm_api_options.yaml <<EOF
+cat > ./config.yaml <<EOF
 enable_attention_dp: true
 EOF

 trtllm-llmapi-launch \
 trtllm-bench --model ${MODEL_NAME} \
     --model_path ${MODEL_PATH} \
     throughput \
     --tp 32 \
     --ep 32 \
-    --extra_llm_api_options ./extra_llm_api_options.yaml \
+    --config ./config.yaml \
     --kv_cache_free_gpu_mem_fraction 0.75 \
     --backend pytorch \
     --dataset ./dataset.json \
     --warmup 0 \
     --eos_id -1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`
around lines 544 - 554, The CLI example uses the deprecated flag
--extra_llm_api_options with trtllm-bench; change it to use the canonical
--config flag and update the referenced YAML filename accordingly (e.g., create
a config file like extra_llm_api_options.yaml or rename it to
extra_llm_config.yaml) and pass it to trtllm-bench as --config <file>, ensuring
the surrounding invocation (trtllm-llmapi-launch and trtllm-bench --model
${MODEL_NAME} --model_path ${MODEL_PATH} throughput --tp 32 --ep 32) remains
unchanged except for replacing --extra_llm_api_options
./extra_llm_api_options.yaml with --config ./<updated_filename>.yaml so the
example conforms to the coding guideline.
docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md (2)
125-145: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This section uses --extra_llm_api_options with trtllm-bench, but the repository coding guidelines explicitly state: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

The YAML filename and flag should be reverted to use --config and config.yml respectively.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."
📋 Proposed fix to align with coding guidelines
-cat >./extra-llm-api-config.yml<<EOF
+cat >./config.yml<<EOF
 cuda_graph_config: {}
 moe_config:
   backend: TRTLLM
 speculative_config:
     decoding_type: MTP
     num_nextn_predict_layers: 3
 EOF

 export TRTLLM_ENABLE_PDL=1

 trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
     throughput \
     --dataset $YOUR_DATA_PATH \
     --backend pytorch \
     --num_requests 10 \
     --concurrency 1 \
     --max_batch_size 1 \
     --tp 8 \
     --ep 2 \
-    --extra_llm_api_options ./extra-llm-api-config.yml
+    --config ./config.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md`
around lines 125 - 145, The CLI example uses the deprecated flag and filename;
replace the --extra_llm_api_options flag and extra-llm-api-config.yml with the
canonical --config flag and config.yml: create the YAML as config.yml
(previously extra-llm-api-config.yml) and update the trtllm-bench invocation to
use --config ./config.yml; locate the trtllm-bench command and the mention of
--extra_llm_api_options in the file and change those symbols accordingly.
181-204: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This section also uses --extra_llm_api_options with trtllm-bench, violating the same coding guideline as the previous section.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."
📋 Proposed fix to align with coding guidelines
-cat >./extra-llm-api-config.yml<<EOF
+cat >./config.yml<<EOF
 cuda_graph_config: {}
 moe_config:
   backend: TRTLLM
 speculative_config:
     decoding_type: MTP
     num_nextn_predict_layers: 3
     use_relaxed_acceptance_for_thinking: true
     relaxed_topk: 10
     relaxed_delta: 0.6
 EOF

 export TRTLLM_ENABLE_PDL=1

 trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
     throughput \
     --dataset $YOUR_DATA_PATH \
     --backend pytorch \
     --num_requests 10 \
     --concurrency 1 \
     --max_batch_size 1 \
     --tp 8 \
     --ep 2 \
-    --extra_llm_api_options ./extra-llm-api-config.yml
+    --config ./config.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md`
around lines 181 - 204, Replace the --extra_llm_api_options usage with the
canonical --config flag in the trtllm-bench invocation: keep the same config
file (extra-llm-api-config.yml) and environment setup (TRTLLM_ENABLE_PDL) and
change the command argument from --extra_llm_api_options
./extra-llm-api-config.yml to --config ./extra-llm-api-config.yml so the
trtllm-bench call (the trtllm-bench throughput command) follows the project CLI
guideline.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md`:
- Line 109: The documented trtllm-serve invocation uses the discouraged flag
`--extra_llm_api_options`; update the command to use the preferred `--config`
flag instead (e.g., replace `--extra_llm_api_options
/config/models/eagle/eagle.yaml` with `--config
/config/models/eagle/eagle.yaml`) while keeping the rest of the invocation
(TRTLLM_ENABLE_PDL, trtllm-serve, model path, host/port, batching and token
options, and trust_remote_code) unchanged so it follows the TensorRT-LLM CLI
coding guideline.

In
`@docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md`:
- Line 32: Update the GitHub repository reference in the blog text: replace the
URL and display name "NVIDIA/TensorRT-Model-Optimizer" with the canonical
"NVIDIA/Model-Optimizer" wherever it appears (the sentence referencing the Model
Optimizer in the DeepSeek checkpoint description) so the link points to and
displays the correct repository name.

In `@docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md`:
- Line 89: The trtllm-bench command in the doc uses the deprecated flag
--extra_llm_api_options; update the CLI example to use --config instead (i.e.,
replace --extra_llm_api_options low_latency.yaml with --config low_latency.yaml)
so the documented command for trtllm-bench conforms to the coding guideline;
ensure any mention of the old flag is removed and the file low_latency.yaml
remains the referenced config.
- Line 204: The documented CLI uses the disallowed flag --extra_llm_api_options
in the trtllm-serve example; change that invocation to use --config instead,
pointing it at the same YAML (max_throughput.yaml) so the command follows the
coding guideline; update any mention of --extra_llm_api_options in the
trtllm-serve example to --config and ensure the surrounding text still describes
loading the same configuration file.
- Line 152: The doc uses the deprecated flag --extra_llm_api_options in the
trtllm-bench max-throughput command; update the command sample to use --config
instead (replace the --extra_llm_api_options max_throughput.yaml occurrence with
--config max_throughput.yaml) so it follows the coding guideline for
TensorRT-LLM CLI tools (affecting the max-throughput trtllm-bench example in the
blog9_Deploying_GPT_OSS_on_TRTLLM.md file).
- Line 187: The command snippet uses the deprecated/forbidden flag
--extra_llm_api_options; update the documented trtllm-serve invocation to use
--config instead and point it to the same YAML file (e.g., replace the
--extra_llm_api_options max_throughput.yaml token with --config
max_throughput.yaml) so the documentation aligns with the coding guideline
preferring --config for TensorRT-LLM CLI configuration files.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fbd180dd-1532-497c-958f-18a693f8c179

📥 Commits

Reviewing files that changed from the base of the PR and between e940e58 and 9adca18.

📒 Files selected for processing (6)

docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md
docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md
docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md
docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md
docs/source/blogs/tech_blog/blog8_Scaling_Expert_Parallelism_in_TensorRT-LLM_part2.md
docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md

schetlur-nv · 2026-03-19T21:56:25Z

/bot skip --comment "Doc change"

tensorrt-cicd · 2026-03-19T22:02:36Z

PR_Github #39639 [ skip ] triggered by Bot. Commit: 3b371f5 Link to invocation

tensorrt-cicd · 2026-03-19T22:14:59Z

PR_Github #39639 [ skip ] completed with state SUCCESS. Commit: 3b371f5
Skipping testing for commit 3b371f5


Automated review of tech blog accuracy against current codebase found the following issues, now corrected:

blog2 (MTP):
- Fix CLI flag: --spec_decode_nextn -> --spec_decode_max_draft_len (x2)
- Fix config file flag: --config -> --extra_llm_api_options (x2)
- Rename config file examples to extra-llm-api-config.yml for clarity

blog3 (Throughput):
- Fix broken API link: pyexecutor/config.py -> llmapi/llm_args.py#L102 (CudaGraphConfig moved to llm_args.py)
- Fix NVIDIA Model Optimizer GitHub URL

blog4 (WideEP Part 1):
- Fix two broken feat/large-ep branch links -> main branch
- Fix all ep_load_balancer/ script paths -> wide_ep/ep_load_balancer/
- Fix --config flag -> --extra_llm_api_options in benchmark commands

blog8 (WideEP Part 2):
- Note that examples/wide_ep/slurm_scripts is not yet public
- Fix perf-analysis doc path

blog9 (GPT-OSS):
- Fix container image tag text to match actual docker command (1.1.0rc1)
- Fix --config flag -> --extra_llm_api_options throughout

blog11 (GPT-OSS Eagle3):
- Replace pinned RC image tag with NGC catalog reference
- Fix use_torch_sampler -> sampler_type: TorchSampler (current API)
- Fix decoding_type: Eagle3 -> decoding_type: Eagle
- Fix --config -> --extra_llm_api_options in serve command

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>

--config is a valid alias for --extra_llm_api_options in trtllm-bench, trtllm-serve, and trtllm-eval. Both are accepted by the CLI and neither is deprecated. Revert the unnecessary alias substitutions introduced in the previous commit, keeping only the substantively correct fixes.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>
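
To make "alias" concrete: a command-line parser can register two option strings that write to the same destination, so either spelling yields the same parsed value. The sketch below uses the standard-library `argparse` with a made-up program name `demo-cli`. It is only an illustration of the idea and makes no claim about how the TensorRT-LLM CLIs implement their option handling.

```python
import argparse


def build_parser():
    """Build a toy parser in which two flag spellings share one destination."""
    parser = argparse.ArgumentParser(prog="demo-cli")
    # Both option strings populate args.config, so neither spelling is special.
    parser.add_argument(
        "--config",
        "--extra_llm_api_options",
        dest="config",
        default=None,
        help="Path to a YAML file with extra LLM API options.",
    )
    return parser


args_short = build_parser().parse_args(["--config", "./config.yml"])
args_long = build_parser().parse_args(["--extra_llm_api_options", "./config.yml"])
print(args_short.config, args_long.config)
```

With this kind of setup, swapping one spelling for the other in documentation changes nothing about behavior, which is why the substitutions were reverted as unnecessary.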

NVIDIA/TensorRT-Model-Optimizer is a redirect; the canonical repository name is NVIDIA/Model-Optimizer. Restore the original URL from the blog.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>

Signed-off-by: Sharan Chetlur <[email protected]>

schetlur-nv · 2026-03-20T21:23:13Z

/bot skip --comment "doc change"

tensorrt-cicd · 2026-03-20T21:29:59Z

PR_Github #39771 [ skip ] triggered by Bot. Commit: e536e37 Link to invocation

tensorrt-cicd · 2026-03-20T21:40:50Z

PR_Github #39771 [ skip ] completed with state SUCCESS. Commit: e536e37
Skipping testing for commit e536e37


… 11 (NVIDIA#12338) Signed-off-by: schetlur <[email protected]> Signed-off-by: Sharan Chetlur <[email protected]> Co-authored-by: Claude Sonnet 4.6 <[email protected]>

schetlur-nv requested a review from a team as a code owner March 19, 2026 04:14

schetlur-nv requested review from QiJune and arysef March 19, 2026 04:14

github-actions Bot assigned schetlur-nv Mar 19, 2026

schetlur-nv requested review from dongxuy04, kaiyux and mikeiovine March 19, 2026 04:14

schetlur-nv changed the title ~~1docs: fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11~~ [None][doc] fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11 Mar 19, 2026

coderabbitai Bot reviewed Mar 19, 2026

schetlur-nv force-pushed the fix/tech-blog-code-accuracy branch from 9adca18 to d8b143b Compare March 19, 2026 04:23

kaiyux approved these changes Mar 19, 2026

mikeiovine approved these changes Mar 19, 2026

Comment thread docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md Outdated

schetlur-nv and others added 3 commits March 20, 2026 14:19

schetlur-nv force-pushed the fix/tech-blog-code-accuracy branch from 3b371f5 to 970e9ab Compare March 20, 2026 21:19

Update blog11_GPT_OSS_Eagle3.md

e536e37

Signed-off-by: Sharan Chetlur <[email protected]>

schetlur-nv merged commit 90c1cb7 into NVIDIA:main Mar 23, 2026
5 checks passed

4 participants