Bug in conversion to mps with non_blocking=True #139550

@Xuzzo

🐛 Describe the bug

The following code runs without raising an error but silently produces an incorrect tensor:

import torch

a = torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

a = a.to("mps", non_blocking=True)
print(a)

The resulting tensor contains garbage values in some positions (here the first and last elements), for example:

tensor([               6,                1,                2,                3,
                       4,                5,                6,                7,
                       8, 1688849860263945], device='mps:0')
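
To check whether a given machine shows the same behavior, one option is to round-trip the tensor and compare it with the CPU original, once with a blocking transfer and once with `non_blocking=True`. The following is a minimal sketch, not part of the original report: the helper names `pick_device` and `transfer_matches` are hypothetical, and the script falls back to the CPU when MPS is unavailable, in which case no mismatch is expected.

```python
import torch


def pick_device() -> torch.device:
    """Return the MPS device when available, otherwise fall back to CPU."""
    if torch.backends.mps.is_available():
        return torch.device("mps")
    return torch.device("cpu")


def transfer_matches(src: torch.Tensor, device: torch.device, non_blocking: bool) -> bool:
    """Copy src to device, copy it back, and compare the values with src."""
    moved = src.to(device, non_blocking=non_blocking)
    # .cpu() uses the default blocking copy, so the comparison below sees
    # whatever values actually ended up on the device.
    return torch.equal(moved.cpu(), src)


device = pick_device()
src = torch.arange(10)  # same values and dtype (int64) as the tensor in the report

blocking_ok = transfer_matches(src, device, non_blocking=False)
non_blocking_ok = transfer_matches(src, device, non_blocking=True)
print(f"device={device} blocking_ok={blocking_ok} non_blocking_ok={non_blocking_ok}")
```

If `non_blocking_ok` is `False` while `blocking_ok` is `True`, the machine reproduces the problem, and leaving `non_blocking` at its default of `False` for transfers to `"mps"` is a conservative way to avoid it on affected versions. Because the corruption in the report appears only "here and there", a single passing run does not prove a setup is unaffected.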

Versions

PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.7.1 (arm64)
GCC version: Could not collect
Clang version: 16.0.0 (clang-1600.0.26.4)
CMake version: version 3.30.5
Libc version: N/A

Python version: 3.10.15 (main, Oct 3 2024, 02:24:49) [Clang 14.0.6 ] (64-bit runtime)
Python platform: macOS-14.7.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M3 Max

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] torch==2.5.1
[pip3] torchvision==0.20.1
[conda] numpy 1.26.4 pypi_0 pypi
[conda] torch 2.5.1 pypi_0 pypi
[conda] torchvision 0.20.1 pypi_0 pypi

cc @ezyang @gchanan @zou3519 @kadeng @msaroufim @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

Labels

high priority
module: correctness (silent) (issue that returns an incorrect result silently)
module: mps (related to the Apple Metal Performance Shaders framework)
triaged (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
