[Gloo, MPI] DDP communication hook: getFuture() #42048

@sinannasir

The DDP communication hook implementation (#39272) provides an API called get_future that retrieves a future associated with the completion of c10d.ProcessGroupNCCL.work. The get_future API is currently supported only by NCCL (#41596), because most of the intensive models, where our gradient compression effort might be more useful, use NCCL.

To support Gloo and MPI, we can modify some of the core ProcessGroupGloo and ProcessGroupMPI code to ensure we can do something similar there.
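
To make the idea concrete, here is a minimal, standard-library-only sketch of a work handle that exposes get_future, and of a hook that chains on that future. It is not PyTorch code: SketchWork, fake_allreduce, and allreduce_hook are hypothetical names invented for illustration, and the "collective" is just a sum over in-memory lists run on a background thread. It only shows the shape of what a CPU backend such as Gloo or MPI would need to offer.

```python
import threading
from concurrent.futures import Future


class SketchWork:
    """Hypothetical stand-in for an asynchronous collective's work handle."""

    def __init__(self, fn, *args):
        self._future = Future()
        # Run the "collective" on a background thread (assumption: a CPU
        # backend completes its work off the caller's thread).
        self._thread = threading.Thread(target=self._run, args=(fn, *args))
        self._thread.start()

    def _run(self, fn, *args):
        # Fulfil the future with the result, or with the error if the work fails.
        try:
            self._future.set_result(fn(*args))
        except Exception as exc:
            self._future.set_exception(exc)

    def get_future(self):
        """Return a future that completes when the work completes."""
        return self._future


def fake_allreduce(per_rank_grads):
    # Element-wise sum across "ranks" (hypothetical, in-memory only).
    return [sum(vals) for vals in zip(*per_rank_grads)]


def allreduce_hook(per_rank_grads):
    """Hook-style function: launch the work and return a future of averaged gradients."""
    world_size = len(per_rank_grads)
    work = SketchWork(fake_allreduce, per_rank_grads)
    averaged = Future()

    def _on_done(fut):
        # Chain a post-processing step onto the work's completion.
        try:
            averaged.set_result([v / world_size for v in fut.result()])
        except Exception as exc:
            averaged.set_exception(exc)

    work.get_future().add_done_callback(_on_done)
    return averaged


result = allreduce_hook([[1.0, 2.0], [3.0, 4.0]]).result(timeout=5)
print(result)
```

The caller never blocks on the work itself; it only receives a future and attaches a continuation to it. In PyTorch, torch.futures.Future offers then() for the same kind of chaining.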

cc @ezyang @gchanan @zou3519 @bdhirsh @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @agolynski @SciPioneer @H-Huang @mrzzd @xush6528

Metadata

Labels

better-engineering (Relatively self-contained tasks for better engineering contributors)
high priority
oncall: distributed (Add this issue/PR to distributed oncall triage queue)
triage review
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
