How to call tensor.item() on a single process after a collective op? #2225
Closed
Labels: stale (has not had recent activity)
❓ Questions and Help
Hi, I'm trying to log a tensor value by calling res[0].item() in a single process after an all_reduce on that tensor. Execution seems to hang.
To reproduce:
import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp


def test_tensor_item(index):
    xm.rendezvous('init')
    print(index, "test_tensor_item")
    device = xm.xla_device()
    rank = xm.get_ordinal()
    t = torch.tensor([rank + 0.0, rank + 1.0, rank + 2.0], device=device)
    res = xm.all_reduce("sum", t)
    print(index, res, flush=True)
    xm.rendezvous('sync')
    if index == 0:
        print(index, res[0].item(), flush=True)


xmp.spawn(test_tensor_item, args=(), nprocs=8, start_method='fork')

Any hints, please.
Thanks