Closed
Labels
module: rocm (AMD GPU support for PyTorch)
Description
🐛 Bug
On ROCm builds, converting a float64 tensor on GPU 0 to a float32 tensor on GPU 1 gives wrong results when the device and the dtype are changed in a single .to() call.
To Reproduce
import torch

float64_cuda0 = torch.ones(10, device='cuda:0', dtype=torch.float64)
float32_cuda1 = torch.zeros(10, device='cuda:1', dtype=torch.float32)
print("float64_cuda0", float64_cuda0)
print("float32_cuda1", float32_cuda1)
print("float64_cuda0.to(float32_cuda1)", float64_cuda0.to(float32_cuda1)) # BUG!
print("float64_cuda0.to(float32_cuda1.device)", float64_cuda0.to(float32_cuda1.device))
print("float64_cuda0.to(float32_cuda1.dtype)", float64_cuda0.to(float32_cuda1.dtype))
print("float64_cuda0.to(float32_cuda1.device, float32_cuda1.dtype)", float64_cuda0.to(float32_cuda1.device, float32_cuda1.dtype)) # BUG!
print("float64_cuda0.to(float32_cuda1.device).to(float32_cuda1.dtype)", float64_cuda0.to(float32_cuda1.device).to(float32_cuda1.dtype))
print("float64_cuda0.to(float32_cuda1.dtype).to(float32_cuda1.device)", float64_cuda0.to(float32_cuda1.dtype).to(float32_cuda1.device))
Output:
15:53:37 ('float64_cuda0', tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:0'))
15:53:37 ('float32_cuda1', tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:1',
15:53:37 dtype=torch.float32))
15:53:37 ('float64_cuda0.to(float32_cuda1)', tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.], device='cuda:1',
15:53:37 dtype=torch.float32))
15:53:37 ('float64_cuda0.to(float32_cuda1.device)', tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:1'))
15:53:37 ('float64_cuda0.to(float32_cuda1.dtype)', tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:0',
15:53:37 dtype=torch.float32))
15:53:37 ('float64_cuda0.to(float32_cuda1.device, float32_cuda1.dtype)', tensor([0.0000, 1.8750, 0.0000, 1.8750, 0.0000, 1.8750, 0.0000, 1.8750, 0.0000,
15:53:37 1.8750], device='cuda:1', dtype=torch.float32))
15:53:37 ('float64_cuda0.to(float32_cuda1.device).to(float32_cuda1.dtype)', tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:1',
15:53:37 dtype=torch.float32))
15:53:37 ('float64_cuda0.to(float32_cuda1.dtype).to(float32_cuda1.device)', tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.], device='cuda:1',
15:53:37 dtype=torch.float32))
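The alternating 0.0000, 1.8750 pattern in the combined device-and-dtype call is what you would see if the float64 bytes were read back as float32 without any dtype conversion. This is only an inference from the printed values, not a confirmed root cause. The sketch below uses the standard library alone to show the arithmetic. It assumes little-endian byte order, and the helper name reinterpret_f64_as_f32 is hypothetical.

```python
import struct

def reinterpret_f64_as_f32(values, count):
    """Pack `values` as little-endian float64, then read the first
    `count` float32 values straight from the raw bytes (no conversion)."""
    raw = struct.pack("<%dd" % len(values), *values)
    return list(struct.unpack_from("<%df" % count, raw))

# Ten float64 ones, read back as the first ten float32 slots.
garbled = reinterpret_f64_as_f32([1.0] * 10, 10)
print(garbled)
# [0.0, 1.875, 0.0, 1.875, 0.0, 1.875, 0.0, 1.875, 0.0, 1.875]
```

The float64 value 1.0 is 0x3FF0000000000000. Its low 32 bits decode to 0.0 as float32 and its high 32 bits (0x3FF00000) decode to 1.875, which matches the values printed above.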
Additional context
I used this PR (#16431) to reproduce this bug. An example of the error log is at https://ci.pytorch.org/jenkins/job/pytorch-builds/job/py2-clang7-rocmdeb-ubuntu16.04-test/9093/console
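In the reproduction output, both two-step forms (device then dtype, and dtype then device) return the expected ones. A possible workaround, then, is to split the conversion into two .to() calls. The following is a minimal sketch, not a fix. The helper name to_like_two_step is hypothetical, and the CPU fallback is there only so the snippet runs on machines without two GPUs.

```python
import torch

def to_like_two_step(src, like):
    """Move `src` to `like`'s device, then cast to `like`'s dtype,
    using two separate .to() calls instead of one combined call."""
    # Hypothetical helper: avoids the single device+dtype call that
    # returned wrong values in this report.
    return src.to(like.device).to(like.dtype)

if torch.cuda.is_available() and torch.cuda.device_count() >= 2:
    src = torch.ones(10, device="cuda:0", dtype=torch.float64)
    like = torch.zeros(10, device="cuda:1", dtype=torch.float32)
else:
    # Fallback so the sketch still runs without two GPUs.
    src = torch.ones(10, dtype=torch.float64)
    like = torch.zeros(10, dtype=torch.float32)

result = to_like_two_step(src, like)
print(result)
```

Casting to float32 first and moving second would halve the bytes transferred between devices. The report shows that order also producing correct values.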