Collaborator

@gongshaotian gongshaotian commented Aug 25, 2025

Summary

  1. Starting with this PR, CUDAGraph is enabled by default in some scenarios.
  2. CUDAGraph is automatically disabled for features that are not compatible with it (speculative decoding, RL training, multimodal models). Startup checks for GraphOptConfig-related parameters have also been added.
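
The automatic fallback described in item 2 could look roughly like the sketch below. It is only an illustration: `LaunchOptions`, its fields, and `resolve_cudagraph` are hypothetical names invented here, not FastDeploy's actual classes or checks.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LaunchOptions:
    """Hypothetical stand-in for the startup options; not FastDeploy's real config class."""

    use_cudagraph: bool = True  # on by default, mirroring the PR summary
    speculative_decoding: bool = False
    rl_training: bool = False
    multimodal: bool = False


def resolve_cudagraph(opts: LaunchOptions) -> List[str]:
    """Turn CUDAGraph off if an incompatible feature is enabled; return the reasons."""
    if not opts.use_cudagraph:
        return []  # already off, nothing to intercept
    reasons = [
        name
        for name, enabled in (
            ("speculative decoding", opts.speculative_decoding),
            ("RL training", opts.rl_training),
            ("multimodal model", opts.multimodal),
        )
        if enabled
    ]
    if reasons:
        opts.use_cudagraph = False
    return reasons


opts = LaunchOptions(speculative_decoding=True)
print(resolve_cudagraph(opts), opts.use_cudagraph)
```

Returning the list of reasons lets the caller log why CUDAGraph was switched off instead of silently changing the setting.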

User interface changes

The --use-cudagraph startup parameter has been removed. To control CUDAGraph manually, set --graph-optimization-config instead:

--graph-optimization-config '{"use_cudagraph":false}'
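
When the flag is assembled by a launch script, the JSON value can be generated rather than typed by hand. A minimal sketch, assuming only the `use_cudagraph` key shown above; the helper name `graph_opt_flag` is made up for illustration:

```python
import json
import shlex


def graph_opt_flag(use_cudagraph: bool) -> str:
    """Build the --graph-optimization-config fragment for a shell command line."""
    # Compact separators keep the JSON free of spaces, matching the example above.
    payload = json.dumps({"use_cudagraph": use_cudagraph}, separators=(",", ":"))
    # shlex.quote wraps the JSON in single quotes so the shell passes it as one argument.
    return f"--graph-optimization-config {shlex.quote(payload)}"


print(graph_opt_flag(False))
```

Serializing with `json.dumps` avoids the common mistake of writing Python's `False` instead of JSON's `false`.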

For a detailed description, see graph_optimization.md.

paddle-bot bot commented Aug 25, 2025

Thanks for your contribution!

@gongshaotian gongshaotian changed the title [Config] Add GraphOptConfig start intercept [Config] Add GraphOptConfig start interception Aug 25, 2025
Collaborator

@gzy19990617 gzy19990617 left a comment

LGTM

YuanRisheng previously approved these changes Aug 25, 2025

codecov-commenter commented Aug 25, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (develop@43d5bd6). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #3594   +/-   ##
==========================================
  Coverage           ?   62.50%           
==========================================
  Files              ?        2           
  Lines              ?       16           
  Branches           ?        7           
==========================================
  Hits               ?       10           
  Misses             ?        4           
  Partials           ?        2           
Flag Coverage Δ
diff 62.50% <ø> (?)

@gongshaotian gongshaotian changed the title [Config] Add GraphOptConfig start interception [Executor] Default use CUDAGraph Aug 25, 2025
@gongshaotian gongshaotian self-assigned this Aug 28, 2025
Collaborator

@yuanlehome yuanlehome left a comment

LGTM👍

Collaborator Author

gongshaotian commented Oct 21, 2025

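fastdeploy/model_executor/graph_optimization/utils.py: adds a GPU memory debugging tool; no unit test is needed.
fastdeploy/worker/gpu_model_runner.py and fastdeploy/config.py: add defensive code that cannot be covered unless an error occurs.
Requesting a coverage exemption. @Jiang-Jia-Jun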

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 775edcc into PaddlePaddle:develop Oct 21, 2025
23 of 28 checks passed

  max_model_len: 32768
  max_num_seqs: 96
- gpu_memory_utilization: 0.9
+ gpu_memory_utilization: 0.85
Collaborator

Won't the gpu_memory_utilization change cause a performance drop?

Collaborator Author

> Won't the gpu_memory_utilization change cause a performance drop?

Judging from the test results, performance actually improves.

7 participants