[Executor] Default use CUDAGraph #3594
Thanks for your contribution!
gzy19990617
left a comment
LGTM
Codecov Report
✅ All modified and coverable lines are covered by tests.

Additional details and impacted files

@@            Coverage Diff             @@
##             develop    #3594   +/-   ##
==========================================
  Coverage           ?   62.50%
==========================================
  Files              ?        2
  Lines              ?       16
  Branches           ?        7
==========================================
  Hits               ?       10
  Misses             ?        4
  Partials           ?        2

Flags with carried forward coverage won't be shown.
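As a quick sanity check on the report above, the 62.50% figure can be reproduced from the hit, miss, and partial counts. This is a minimal sketch that assumes the ratio counts only full hits in the numerator and all three categories in the denominator; the variable names are illustrative and not part of the PR.

```python
# Counts taken from the Codecov table above.
hits, misses, partials = 10, 4, 2

# Assumption: total lines = hits + misses + partials,
# and partially covered lines do not count as hits.
lines = hits + misses + partials
coverage = 100 * hits / lines

print(f"{lines} lines, {coverage:.2f}% covered")
```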
404008c to 8c764a8
be3b980 to ce6a539
yuanlehome
left a comment
LGTM👍
fastdeploy/model_executor/graph_optimization/utils.py: adds a GPU memory debugging tool; no unit test is required.
  max_model_len: 32768
  max_num_seqs: 96
- gpu_memory_utilization: 0.9
+ gpu_memory_utilization: 0.85
Won't changing gpu_memory_utilization cause a performance drop as well?
> Won't changing gpu_memory_utilization cause a performance drop as well?

Judging from the test results, performance actually improves.
Summary
- CUDAGraph and Speculate Decoding
- CUDAGraph and MultiModel
- CUDAGraph and RL training
- CUDAGraph and PD Disaggregation
- Static Graph and RL training

User interface changes
The --use-cudagraph startup parameter has been removed. CUDAGraph can still be controlled manually by setting --graph-optimization-config. For a detailed description, see graph_optimization.md.
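To illustrate the manual control mentioned above, here is a minimal sketch of how a launcher script might build the --graph-optimization-config argument. It assumes the flag takes a JSON object and that the relevant key is named use_cudagraph; neither is confirmed by this PR page, so check graph_optimization.md for the actual schema. The helper name build_graph_opt_arg is hypothetical.

```python
import json
import shlex


def build_graph_opt_arg(use_cudagraph: bool) -> list[str]:
    """Return CLI arguments that toggle CUDAGraph explicitly.

    Assumption: the flag accepts a JSON object and the key is
    "use_cudagraph"; verify against graph_optimization.md.
    """
    config = {"use_cudagraph": use_cudagraph}
    return ["--graph-optimization-config", json.dumps(config)]


# Example: opt out of CUDAGraph now that it is the default.
args = build_graph_opt_arg(False)

# shlex.join quotes the JSON so it survives shell parsing.
print(shlex.join(args))
```

Passing the configuration as a single JSON argument keeps the toggle in one place, which is presumably why the standalone --use-cudagraph switch could be dropped.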