Commit 03d792c
Update on "[PGNCCL] Ensure comm is ready before all accesses"
Previously, we only waited for the comm to become ready right after its initialization.
That is not enough: other NCCL APIs can also put the comm into the InProgress state.
Therefore, we now ensure the comm is ready every time `getNcclComm` is called, to protect subsequent NCCL calls on the returned comm.
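
The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this commit: `CommState`, `MockComm`, `waitReady`, and `CommHolder` are hypothetical stand-ins that avoid a dependency on NCCL. In real code the state would presumably be queried through `ncclCommGetAsyncError`, and the timeout value here is an arbitrary assumption.

```cpp
#include <cassert>
#include <chrono>
#include <stdexcept>
#include <thread>

// Hypothetical stand-in for the comm's async state (not the real ncclResult_t).
enum class CommState { Success, InProgress, Error };

// Hypothetical mock of a nonblocking communicator. Each poll() models one
// query of the async state; the comm reports InProgress for `pendingPolls`
// polls and Success afterwards.
class MockComm {
 public:
  explicit MockComm(int pendingPolls) : pendingPolls_(pendingPolls) {
    assert(pendingPolls >= 0);
  }

  CommState poll() {
    ++pollCount_;
    if (pendingPolls_ > 0) {
      --pendingPolls_;
      return CommState::InProgress;
    }
    return CommState::Success;
  }

  // Models any later API call that can put the comm back into InProgress.
  void startAsyncOp(int pendingPolls) { pendingPolls_ = pendingPolls; }

  int pollCount() const { return pollCount_; }

 private:
  int pendingPolls_;
  int pollCount_ = 0;
};

// Polls until the comm leaves InProgress; throws on error or timeout.
inline void waitReady(
    MockComm& comm,
    std::chrono::milliseconds timeout = std::chrono::milliseconds(1000)) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (true) {
    const CommState state = comm.poll();
    if (state == CommState::Success) {
      return;
    }
    if (state == CommState::Error) {
      throw std::runtime_error("comm reported an error");
    }
    if (std::chrono::steady_clock::now() >= deadline) {
      throw std::runtime_error("timed out waiting for comm");
    }
    std::this_thread::sleep_for(std::chrono::microseconds(50));
  }
}

// Hypothetical holder: the accessor waits on every call, not only after
// initialization, so callers always receive a ready comm.
class CommHolder {
 public:
  explicit CommHolder(int pendingPolls) : comm_(pendingPolls) {}

  MockComm& getComm() {
    waitReady(comm_);
    return comm_;
  }

 private:
  MockComm comm_;
};
```

Placing the wait inside the accessor means no call site has to remember to check readiness itself, at the cost of one extra state query per access when the comm is already ready.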
cc XilunWu H-Huang awgu wanchaol fegin fduwjj wz337 wconstab d4l3k c-p-i-o
[ghstack-poisoned]
1 file changed, +7 -0 lines changed

| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
505 | 505 | | |
506 | 506 | | |
507 | 507 | | |
| 508 | + | |
508 | 509 | | |
| 510 | + | |
| 511 | + | |
| 512 | + | |
| 513 | + | |
| 514 | + | |
| 515 | + | |
509 | 516 | | |
510 | 517 | | |
511 | 518 | | |