Add getter APIs for TP/PP/DP ranks in DeepSpeedEngine #7427
@sfc-gh-truwase
@delock can you please help with the xpu CI? Thanks!
I've also reached out to @Liangliang-Ma on this. The test is currently skipped on this PR, so it should be able to merge now.
Thanks a lot for merging the previous PR! I really appreciate the review and guidance throughout the process!
Thanks again for the opportunity to contribute to this community!
This PR addresses issue #7423.

1) Motivation
To improve compatibility with low-level profiling tools (e.g., NVIDIA CUPTI or DCGM), it can be useful to expose the parallelism-specific ranks (tensor/pipeline/data) at the engine level.

2) Changes
I added three getter methods to DeepSpeedEngine:
- get_tensor_parallel_rank()
- get_pipeline_parallel_rank()
- get_data_parallel_rank()

Thank you for reviewing this contribution!

Signed-off-by: WoosungMyung <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Signed-off-by: lym <[email protected]>
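
For illustration, the three getters this PR adds (get_tensor_parallel_rank(), get_pipeline_parallel_rank(), get_data_parallel_rank()) might be used to tag profiler output with a process's position in each parallel dimension. The sketch below is only an assumption about usage: `FakeEngine` is a hypothetical stand-in so the example runs without DeepSpeed, the integer return type is assumed, and the `tp/pp/dp` label format and the `parallel_rank_label` helper are invented for this example rather than taken from the PR.

```python
from dataclasses import dataclass


@dataclass
class FakeEngine:
    """Hypothetical stand-in for DeepSpeedEngine (not part of DeepSpeed).

    It only mirrors the three getter names named in this PR; the integer
    return type is an assumption made for this sketch.
    """
    tp_rank: int = 0
    pp_rank: int = 0
    dp_rank: int = 0

    def get_tensor_parallel_rank(self) -> int:
        return self.tp_rank

    def get_pipeline_parallel_rank(self) -> int:
        return self.pp_rank

    def get_data_parallel_rank(self) -> int:
        return self.dp_rank


def parallel_rank_label(engine) -> str:
    """Build a label from the three getters; the label format is an assumption."""
    return "tp{}-pp{}-dp{}".format(
        engine.get_tensor_parallel_rank(),
        engine.get_pipeline_parallel_rank(),
        engine.get_data_parallel_rank(),
    )


if __name__ == "__main__":
    # In real use, `engine` would be the object returned by DeepSpeed's
    # initialization; here the stand-in is constructed directly.
    engine = FakeEngine(tp_rank=0, pp_rank=1, dp_rank=3)
    print(parallel_rank_label(engine))
```

A label like this could be attached to trace ranges or metric names so that per-GPU profiler data can be grouped by tensor, pipeline, or data-parallel rank.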