-
Notifications
You must be signed in to change notification settings - Fork 26.3k
Closed
Labels
better-engineeringRelatively self-contained tasks for better engineering contributorsRelatively self-contained tasks for better engineering contributorsmodule: rpcRelated to RPC, distributed autograd, RRef, and distributed optimizerRelated to RPC, distributed autograd, RRef, and distributed optimizertriagedThis issue has been looked at a team member, and triaged and prioritized into an appropriate moduleThis issue has been looked at a team member, and triaged and prioritized into an appropriate module
Description
🚀 Feature
Per @pietern's comment in #29928, we can avoid holding the lock in ProcessGroupGloo when waiting on send/recv operations:
"The lock is there only to synchronize a background thread that executes the work and marks it as completed, not multiple waiters. For the send/recv work, that background thread is the actual Gloo I/O thread that marks a send/recv as completed, so for this case we could get rid of the locks entirely."
See comments in the PR for additional context.
cc @pietern @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @rohan-varma @xush6528
Metadata
Metadata
Assignees
Labels
better-engineeringRelatively self-contained tasks for better engineering contributorsRelatively self-contained tasks for better engineering contributorsmodule: rpcRelated to RPC, distributed autograd, RRef, and distributed optimizerRelated to RPC, distributed autograd, RRef, and distributed optimizertriagedThis issue has been looked at a team member, and triaged and prioritized into an appropriate moduleThis issue has been looked at a team member, and triaged and prioritized into an appropriate module