[Executor] Default use CUDAGraph #3594

gongshaotian · 2025-08-25T08:59:28Z

Summary

Starting with this PR, CUDAGraph is enabled by default in some scenarios.
CUDAGraph is automatically disabled for features that are not yet compatible with it (speculative decoding, RL training, multimodal models). This PR also adds startup checks for GraphOptConfig-related parameters:
- ❌ CUDAGraph and Speculative Decoding
  - Follow PR: [FDConfig]Turn on the CUDAGraph + Speculative Decoding switch #4511
- ❌ CUDAGraph and multimodal models
  - Follow PR: [FDConfig]Turn on the CUDAGraph + MultiModel switch #4512
- ❌ CUDAGraph and RL training
  - Follow PR: [FDConfig]Turn on the CUDAGraph + RL switch #4508
- ❌ CUDAGraph and PD Disaggregation
  - Follow PR: [FDConfig]Turn on the CUDAGraph + PD Disaggregation switch #4530
- ❌ Static Graph and RL training
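
The checks listed above could be sketched roughly as follows. This is a minimal illustration, not the code in fastdeploy/config.py: the `StartupOptions` class, its field names, and `resolve_graph_options` are all hypothetical, and whether the real check silently disables a feature or rejects the configuration is an assumption here.

```python
from dataclasses import dataclass


@dataclass
class StartupOptions:
    # Hypothetical field names, chosen for illustration only.
    use_cudagraph: bool = True  # on by default, as this PR describes
    use_static_graph: bool = False
    speculative_decoding: bool = False
    multimodal: bool = False
    rl_training: bool = False
    pd_disaggregation: bool = False


def resolve_graph_options(opts: StartupOptions) -> list[str]:
    """Turn off unsupported combinations in place and return the reasons."""
    notes = []

    # Features the PR description marks as incompatible with CUDAGraph.
    blockers = {
        "speculative decoding": opts.speculative_decoding,
        "multimodal model": opts.multimodal,
        "RL training": opts.rl_training,
        "PD disaggregation": opts.pd_disaggregation,
    }
    active = [name for name, enabled in blockers.items() if enabled]
    if opts.use_cudagraph and active:
        opts.use_cudagraph = False
        notes.append("CUDAGraph disabled: incompatible with " + ", ".join(active))

    # Static graph plus RL training is also listed as unsupported.
    if opts.use_static_graph and opts.rl_training:
        opts.use_static_graph = False
        notes.append("Static graph disabled: incompatible with RL training")

    return notes
```

With default options the function returns an empty list and leaves `use_cudagraph` on; enabling any of the listed features turns it off and records why.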

User interface changes

The --use-cudagraph startup parameter has been removed. To control CUDAGraph manually, set --graph-optimization-config instead:

--graph-optimization-config '{"use_cudagraph":false}'

For a detailed description, see graph_optimization.md.
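
When the flag is generated from a script, building the JSON value programmatically avoids quoting mistakes. The helper below is a small sketch using only the Python standard library; `graph_opt_flag` is a hypothetical name, and `use_cudagraph` is the only key taken from this PR.

```python
import json
import shlex


def graph_opt_flag(use_cudagraph: bool) -> str:
    """Return the CLI fragment that sets use_cudagraph explicitly."""
    # Compact separators reproduce the form shown in the PR description.
    payload = json.dumps({"use_cudagraph": use_cudagraph}, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes so a POSIX shell passes it intact.
    return "--graph-optimization-config " + shlex.quote(payload)


print(graph_opt_flag(False))
# prints: --graph-optimization-config '{"use_cudagraph":false}'
```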

paddle-bot · 2025-08-25T08:59:36Z

Thanks for your contribution!

gzy19990617

LGTM

codecov-commenter · 2025-08-25T10:27:00Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@43d5bd6). Learn more about missing BASE report.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #3594   +/-   ##
==========================================
  Coverage           ?   62.50%           
==========================================
  Files              ?        2           
  Lines              ?       16           
  Branches           ?        7           
==========================================
  Hits               ?       10           
  Misses             ?        4           
  Partials           ?        2

Flag	Coverage Δ
diff	`62.50% <ø> (?)`


tests/ce/deploy/21b_sot.yaml

docs/parameters.md

docs/zh/features/graph_optimization.md

fastdeploy/config.py

…into start_intercept

yuanlehome

LGTM👍

gongshaotian · 2025-10-21T06:16:44Z

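fastdeploy/model_executor/graph_optimization/utils.py: adds a GPU memory debugging tool; no unit test is needed.
fastdeploy/worker/gpu_model_runner.py and fastdeploy/config.py: add defensive code that cannot be covered unless an error occurs.
Requesting a coverage exemption @Jiang-Jia-Jun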

chang-wenbin · 2025-10-21T06:32:11Z

benchmarks/yaml/eb45-32k-wint4-a800-tp4.yaml

 max_model_len: 32768
 max_num_seqs: 96
-gpu_memory_utilization: 0.9
+gpu_memory_utilization: 0.85


Won't changing gpu_memory_utilization cause a performance drop?

Judging from the test results, performance actually improved.

add start intercept

3812e96

gongshaotian changed the title ~~[Config] Add GraphOptConfig start intercept~~ [Config] Add GraphOptConfig start interception Aug 25, 2025

gzy19990617 reviewed Aug 25, 2025

YuanRisheng previously approved these changes Aug 25, 2025

Adjustment GraphOptConfig

7f46585

gongshaotian dismissed YuanRisheng’s stale review via 7f46585 August 25, 2025 09:42

pre-commit

5d2c42a

default use cudagraph

2591fad

gongshaotian changed the title ~~[Config] Add GraphOptConfig start interception~~ [Executor] Default use CUDAGraph Aug 25, 2025

gongshaotian added 2 commits August 27, 2025 20:14

set default value

f3873c3

merge develop

77dcc2c