[None][doc] fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11 #12338

Merged
schetlur-nv merged 4 commits into NVIDIA:main from schetlur-nv:fix/tech-blog-code-accuracy on Mar 23, 2026

@schetlur-nv
Collaborator

@schetlur-nv schetlur-nv commented Mar 19, 2026

Automated review of tech blog accuracy against current codebase found the following issues, now corrected:

blog2 (MTP):

  • Fix CLI flag: --spec_decode_nextn -> --spec_decode_max_draft_len (x2)
  • Rename config file examples to extra-llm-api-config.yml for clarity

blog3 (Throughput):

  • Fix broken API link: pyexecutor/config.py -> llmapi/llm_args.py#L102 (CudaGraphConfig moved to llm_args.py)
  • Fix NVIDIA Model Optimizer GitHub URL

blog4 (WideEP Part 1):

  • Fix two broken feat/large-ep branch links -> main branch
  • Fix all ep_load_balancer/ script paths -> wide_ep/ep_load_balancer/

blog8 (WideEP Part 2):

  • Note that examples/wide_ep/slurm_scripts is not yet public
  • Fix perf-analysis doc path

blog9 (GPT-OSS):

  • Fix container image tag text to match actual docker command (1.1.0rc1)

blog11 (GPT-OSS Eagle3):

  • Replace pinned RC image tag with NGC catalog reference
  • Fix use_torch_sampler -> sampler_type: TorchSampler (current API)
  • Fix decoding_type: Eagle3 -> decoding_type: Eagle

@schetlur-nv schetlur-nv requested a review from a team as a code owner March 19, 2026 04:14
@schetlur-nv schetlur-nv requested review from QiJune and arysef March 19, 2026 04:14
@schetlur-nv schetlur-nv changed the title from "1docs: fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11" to "[None][doc] fix outdated code references in tech blogs 2, 3, 4, 8, 9, 11" Mar 19, 2026
@coderabbitai
Contributor

coderabbitai Bot commented Mar 19, 2026

📝 Walkthrough

Documentation updates across six blog posts standardizing CLI configuration handling and correcting references. Key changes include replacing --config with --extra_llm_api_options flags, updating container image tags and documentation links, renaming configuration files, and migrating script references from development to main branch paths.

Changes

| Cohort / File(s) | Summary |
|---|---|
| CLI Configuration Flag Migration: blog11_GPT_OSS_Eagle3.md, blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md, blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md, blog9_Deploying_GPT_OSS_on_TRTLLM.md | Replaced --config with --extra_llm_api_options in server and benchmark commands; updated corresponding config file naming conventions (e.g., ./config.yml -> ./extra-llm-api-config.yml). |
| Container Image & Release Updates: blog11_GPT_OSS_Eagle3.md, blog9_Deploying_GPT_OSS_on_TRTLLM.md | Updated TensorRT-LLM NGC container image tags and removed fixed version pinning; changed Eagle configuration from Eagle3 to Eagle with updated sampler type notation. |
| Documentation & Script Path References: blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md, blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md, blog8_Scaling_Expert_Parallelism_in_TensorRT-LLM_part2.md, blog9_Deploying_GPT_OSS_on_TRTLLM.md | Corrected documentation links, updated script paths from feat/large-ep branch to main branch (e.g., examples/ep_load_balancer -> examples/wide_ep/ep_load_balancer), and fixed API reference documentation URLs. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Title check | ✅ Passed | The title accurately summarizes the main change: fixing outdated code references in multiple tech blog documentation files. |
| Description check | ✅ Passed | The PR description clearly explains all changes made across multiple tech blogs with specific details about code references, API updates, and CLI flag corrections. |

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md (2)

624-636: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This EPLB benchmark command also uses --extra_llm_api_options with trtllm-bench, violating the same coding guideline.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

📋 Proposed fix to align with coding guidelines
-cat > ./extra_llm_api_options_eplb.yaml <<EOF
+cat > ./config_eplb.yaml <<EOF
 enable_attention_dp: true
 moe_config:
   load_balancer: ./moe_load_balancer.yaml
 EOF

 trtllm-llmapi-launch \
 trtllm-bench --model ${MODEL_NAME} \
     --model_path ${MODEL_PATH} \
     throughput \
     --tp 36 \
     --ep 36 \
-    --extra_llm_api_options ./extra_llm_api_options_eplb.yaml \
+    --config ./config_eplb.yaml \
     --kv_cache_free_gpu_mem_fraction 0.75 \
     --backend pytorch \
     --dataset ./dataset.json \
     --warmup 0 \
     --eos_id -1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`
around lines 624 - 636, Replace the use of the deprecated flag in the documented
benchmark invocation: in the trtllm-bench command that currently passes
--extra_llm_api_options ./extra_llm_api_options_eplb.yaml, change it to use
--config ./extra_llm_api_options_eplb.yaml so the invocation follows the coding
guideline; update the surrounding example lines where trtllm-bench and the
extra_llm_api_options_eplb.yaml token appear to reflect the --config flag
consistently.

544-554: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This baseline benchmark command uses --extra_llm_api_options with trtllm-bench, violating the coding guidelines.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

📋 Proposed fix to align with coding guidelines
-cat > ./extra_llm_api_options.yaml <<EOF
+cat > ./config.yaml <<EOF
 enable_attention_dp: true
 EOF

 trtllm-llmapi-launch \
 trtllm-bench --model ${MODEL_NAME} \
     --model_path ${MODEL_PATH} \
     throughput \
     --tp 32 \
     --ep 32 \
-    --extra_llm_api_options ./extra_llm_api_options.yaml \
+    --config ./config.yaml \
     --kv_cache_free_gpu_mem_fraction 0.75 \
     --backend pytorch \
     --dataset ./dataset.json \
     --warmup 0 \
     --eos_id -1
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md`
around lines 544 - 554, The CLI example uses the deprecated flag
--extra_llm_api_options with trtllm-bench; change it to use the canonical
--config flag and update the referenced YAML filename accordingly (e.g., create
a config file like extra_llm_api_options.yaml or rename it to
extra_llm_config.yaml) and pass it to trtllm-bench as --config <file>, ensuring
the surrounding invocation (trtllm-llmapi-launch and trtllm-bench --model
${MODEL_NAME} --model_path ${MODEL_PATH} throughput --tp 32 --ep 32) remains
unchanged except for replacing --extra_llm_api_options
./extra_llm_api_options.yaml with --config ./<updated_filename>.yaml so the
example conforms to the coding guideline.
docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md (2)

125-145: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This section uses --extra_llm_api_options with trtllm-bench, but the repository coding guidelines explicitly state: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

The YAML filename and flag should be reverted to use --config and config.yml respectively.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

📋 Proposed fix to align with coding guidelines
-cat >./extra-llm-api-config.yml<<EOF
+cat >./config.yml<<EOF
 cuda_graph_config: {}
 moe_config:
   backend: TRTLLM
 speculative_config:
     decoding_type: MTP
     num_nextn_predict_layers: 3
 EOF

 export TRTLLM_ENABLE_PDL=1

 trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
     throughput \
     --dataset $YOUR_DATA_PATH \
     --backend pytorch \
     --num_requests 10 \
     --concurrency 1 \
     --max_batch_size 1 \
     --tp 8 \
     --ep 2 \
-    --extra_llm_api_options ./extra-llm-api-config.yml
+    --config ./config.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md`
around lines 125 - 145, The CLI example uses the deprecated flag and filename;
replace the --extra_llm_api_options flag and extra-llm-api-config.yml with the
canonical --config flag and config.yml: create the YAML as config.yml
(previously extra-llm-api-config.yml) and update the trtllm-bench invocation to
use --config ./config.yml; locate the trtllm-bench command and the mention of
--extra_llm_api_options in the file and change those symbols accordingly.

181-204: ⚠️ Potential issue | 🔴 Critical

CRITICAL: Coding guideline violation - prefer --config over --extra_llm_api_options.

This section also uses --extra_llm_api_options with trtllm-bench, violating the same coding guideline as the previous section.

As per coding guidelines: "When documenting CLI commands for TensorRT-LLM tools like trtllm-serve, trtllm-bench, trtllm-eval, prefer using --config over --extra_llm_api_options for specifying configuration files."

📋 Proposed fix to align with coding guidelines
-cat >./extra-llm-api-config.yml<<EOF
+cat >./config.yml<<EOF
 cuda_graph_config: {}
 moe_config:
   backend: TRTLLM
 speculative_config:
     decoding_type: MTP
     num_nextn_predict_layers: 3
     use_relaxed_acceptance_for_thinking: true
     relaxed_topk: 10
     relaxed_delta: 0.6
 EOF

 export TRTLLM_ENABLE_PDL=1

 trtllm-bench --model nvidia/DeepSeek-R1-FP4 \
     throughput \
     --dataset $YOUR_DATA_PATH \
     --backend pytorch \
     --num_requests 10 \
     --concurrency 1 \
     --max_batch_size 1 \
     --tp 8 \
     --ep 2 \
-    --extra_llm_api_options ./extra-llm-api-config.yml
+    --config ./config.yml
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md`
around lines 181 - 204, Replace the --extra_llm_api_options usage with the
canonical --config flag in the trtllm-bench invocation: keep the same config
file (extra-llm-api-config.yml) and environment setup (TRTLLM_ENABLE_PDL) and
change the command argument from --extra_llm_api_options
./extra-llm-api-config.yml to --config ./extra-llm-api-config.yml so the
trtllm-bench call (the trtllm-bench throughput command) follows the project CLI
guideline.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md`:
- Line 109: The documented trtllm-serve invocation uses the discouraged flag
`--extra_llm_api_options`; update the command to use the preferred `--config`
flag instead (e.g., replace `--extra_llm_api_options
/config/models/eagle/eagle.yaml` with `--config
/config/models/eagle/eagle.yaml`) while keeping the rest of the invocation
(TRTLLM_ENABLE_PDL, trtllm-serve, model path, host/port, batching and token
options, and trust_remote_code) unchanged so it follows the TensorRT-LLM CLI
coding guideline.

In
`@docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md`:
- Line 32: Update the GitHub repository reference in the blog text: replace the
URL and display name "NVIDIA/TensorRT-Model-Optimizer" with the canonical
"NVIDIA/Model-Optimizer" wherever it appears (the sentence referencing the Model
Optimizer in the DeepSeek checkpoint description) so the link points to and
displays the correct repository name.

In `@docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md`:
- Line 89: The trtllm-bench command in the doc uses the deprecated flag
--extra_llm_api_options; update the CLI example to use --config instead (i.e.,
replace --extra_llm_api_options low_latency.yaml with --config low_latency.yaml)
so the documented command for trtllm-bench conforms to the coding guideline;
ensure any mention of the old flag is removed and the file low_latency.yaml
remains the referenced config.
- Line 204: The documented CLI uses the disallowed flag --extra_llm_api_options
in the trtllm-serve example; change that invocation to use --config instead,
pointing it at the same YAML (max_throughput.yaml) so the command follows the
coding guideline; update any mention of --extra_llm_api_options in the
trtllm-serve example to --config and ensure the surrounding text still describes
loading the same configuration file.
- Line 152: The doc uses the deprecated flag --extra_llm_api_options in the
trtllm-bench max-throughput command; update the command sample to use --config
instead (replace the --extra_llm_api_options max_throughput.yaml occurrence with
--config max_throughput.yaml) so it follows the coding guideline for
TensorRT-LLM CLI tools (affecting the max-throughput trtllm-bench example in the
blog9_Deploying_GPT_OSS_on_TRTLLM.md file).
- Line 187: The command snippet uses the deprecated/forbidden flag
--extra_llm_api_options; update the documented trtllm-serve invocation to use
--config instead and point it to the same YAML file (e.g., replace the
--extra_llm_api_options max_throughput.yaml token with --config
max_throughput.yaml) so the documentation aligns with the coding guideline
preferring --config for TensorRT-LLM CLI configuration files.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fbd180dd-1532-497c-958f-18a693f8c179

📥 Commits

Reviewing files that changed from the base of the PR and between e940e58 and 9adca18.

📒 Files selected for processing (6)
  • docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md
  • docs/source/blogs/tech_blog/blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md
  • docs/source/blogs/tech_blog/blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md
  • docs/source/blogs/tech_blog/blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md
  • docs/source/blogs/tech_blog/blog8_Scaling_Expert_Parallelism_in_TensorRT-LLM_part2.md
  • docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md

Comment thread docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md Outdated
Comment thread docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md Outdated
Comment thread docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md Outdated
Comment thread docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md Outdated
Comment thread docs/source/blogs/tech_blog/blog9_Deploying_GPT_OSS_on_TRTLLM.md Outdated
@schetlur-nv schetlur-nv force-pushed the fix/tech-blog-code-accuracy branch from 9adca18 to d8b143b Compare March 19, 2026 04:23
Comment thread docs/source/blogs/tech_blog/blog11_GPT_OSS_Eagle3.md Outdated
@schetlur-nv
Collaborator Author

/bot skip --comment "Doc change"

@tensorrt-cicd
Collaborator

PR_Github #39639 [ skip ] triggered by Bot. Commit: 3b371f5 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #39639 [ skip ] completed with state SUCCESS. Commit: 3b371f5
Skipping testing for commit 3b371f5

Link to invocation

schetlur-nv and others added 3 commits March 20, 2026 14:19

Automated review of tech blog accuracy against current codebase found
the following issues, now corrected:

blog2 (MTP):
- Fix CLI flag: --spec_decode_nextn -> --spec_decode_max_draft_len (x2)
- Fix config file flag: --config -> --extra_llm_api_options (x2)
- Rename config file examples to extra-llm-api-config.yml for clarity

blog3 (Throughput):
- Fix broken API link: pyexecutor/config.py -> llmapi/llm_args.py#L102
  (CudaGraphConfig moved to llm_args.py)
- Fix NVIDIA Model Optimizer GitHub URL

blog4 (WideEP Part 1):
- Fix two broken feat/large-ep branch links -> main branch
- Fix all ep_load_balancer/ script paths -> wide_ep/ep_load_balancer/
- Fix --config flag -> --extra_llm_api_options in benchmark commands

blog8 (WideEP Part 2):
- Note that examples/wide_ep/slurm_scripts is not yet public
- Fix perf-analysis doc path

blog9 (GPT-OSS):
- Fix container image tag text to match actual docker command (1.1.0rc1)
- Fix --config flag -> --extra_llm_api_options throughout

blog11 (GPT-OSS Eagle3):
- Replace pinned RC image tag with NGC catalog reference
- Fix use_torch_sampler -> sampler_type: TorchSampler (current API)
- Fix decoding_type: Eagle3 -> decoding_type: Eagle
- Fix --config -> --extra_llm_api_options in serve command

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>

--config is a valid alias for --extra_llm_api_options in trtllm-bench,
trtllm-serve, and trtllm-eval. Both are accepted by the CLI and neither
is deprecated. Revert the unnecessary alias substitutions introduced in
the previous commit, keeping only the substantively correct fixes.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>

NVIDIA/TensorRT-Model-Optimizer is a redirect; the canonical repository
name is NVIDIA/Model-Optimizer. Restore the original URL from the blog.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: schetlur <[email protected]>
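
One of the commit messages above states that `--config` and `--extra_llm_api_options` are aliases accepted by the same CLIs. As an illustration of what a flag alias means in general, here is a toy parser built with Python's `argparse`. The program name `demo-bench` and the destination name `config` are hypothetical, and this sketch makes no claim about how the TensorRT-LLM tools implement the alias internally.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser in which two flag spellings share one destination."""
    parser = argparse.ArgumentParser(prog="demo-bench")  # hypothetical tool name
    parser.add_argument(
        "--config",
        "--extra_llm_api_options",
        dest="config",  # both spellings store their value here
        default=None,
        help="Path to a YAML file with extra options.",
    )
    return parser


if __name__ == "__main__":
    parser = build_parser()
    via_config = parser.parse_args(["--config", "cfg.yml"])
    via_alias = parser.parse_args(["--extra_llm_api_options", "cfg.yml"])
    # Both invocations produce the same parsed namespace.
    print(via_config == via_alias)
```

Under this model, swapping one spelling for the other in a documented command changes nothing about what the tool receives, which is consistent with the commit's decision to revert the substitutions.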
@schetlur-nv schetlur-nv force-pushed the fix/tech-blog-code-accuracy branch from 3b371f5 to 970e9ab Compare March 20, 2026 21:19
Signed-off-by: Sharan Chetlur <[email protected]>
@schetlur-nv
Collaborator Author

/bot skip --comment "doc change"

@tensorrt-cicd
Collaborator

PR_Github #39771 [ skip ] triggered by Bot. Commit: e536e37 Link to invocation

@tensorrt-cicd
Collaborator

PR_Github #39771 [ skip ] completed with state SUCCESS. Commit: e536e37
Skipping testing for commit e536e37

Link to invocation

@schetlur-nv schetlur-nv merged commit 90c1cb7 into NVIDIA:main Mar 23, 2026
5 checks passed
longcheng-nv pushed a commit to longcheng-nv/TensorRT-LLM that referenced this pull request Mar 31, 2026
… 11 (NVIDIA#12338)

Signed-off-by: schetlur <[email protected]>
Signed-off-by: Sharan Chetlur <[email protected]>
Co-authored-by: Claude Sonnet 4.6 <[email protected]>
4 participants