[MPS] `F.interpolate` with `mode="nearest-exact"` differs from expected output

### 🐛 Describe the bug

This was discovered while annotating failures in https://github.com/pytorch/pytorch/pull/134184. Original test case is in [`test/test_nn.py`](https://github.com/pytorch/pytorch/blob/7af38eb98bdceb8fc6f8635ed7dd664ef44e4b10/test/test_nn.py#L9323).

The results from MPS and CPU differ: in the repro below, one of the 11 output elements (index 5) is off by 1.

The reference algorithm for computing the expected output is in the test suite. For 1d, see https://github.com/pytorch/pytorch/blob/7af38eb98bdceb8fc6f8635ed7dd664ef44e4b10/test/test_nn.py#L9330-L9336
and for 2d, see https://github.com/pytorch/pytorch/blob/7af38eb98bdceb8fc6f8635ed7dd664ef44e4b10/test/test_nn.py#L9438-L9447

See also https://github.com/pytorch/pytorch/issues/34808.

Minimal repro:

```python
import torch
import torch.nn.functional as F

isize, osize = (20, 11)

in_cpu = torch.arange(isize, dtype=torch.float, device='cpu').unsqueeze(0).unsqueeze(0)
in_mps = torch.arange(isize, dtype=torch.float, device='mps').unsqueeze(0).unsqueeze(0)
out_cpu = F.interpolate(in_cpu, size=(osize, ), recompute_scale_factor=False, mode="nearest-exact")
out_mps = F.interpolate(in_mps, size=(osize, ), recompute_scale_factor=False, mode="nearest-exact")

out_cpu - out_mps.cpu()
# tensor([[[0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]]])
```
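
For context, here is a minimal pure-Python sketch of the index rule used by the linked test (`i = int((o + 0.5) * isize / osize)`), applied to the sizes from the repro. The helper name `reference_positions` is hypothetical and not part of PyTorch. It suggests that the one mismatching element, output index 5, is also the only one whose source position lands exactly on an integer, which may make it sensitive to how the position is rounded. This is an observation only, not a confirmed root cause.

```python
def reference_positions(isize: int, osize: int) -> list[tuple[float, int]]:
    """Return (fractional source position, truncated source index) per output element.

    Mirrors the rule in the linked test: i = int((o + 0.5) * isize / osize).
    Hypothetical helper for illustration; not a PyTorch API.
    """
    scale = 1.0 * isize / osize
    return [((o + 0.5) * scale, int((o + 0.5) * scale)) for o in range(osize)]


positions = reference_positions(20, 11)

# Source index selected for each output element. Because the repro input is
# arange(20), these are also the expected output values.
indices = [i for _, i in positions]
print(indices)      # [0, 2, 4, 6, 8, 10, 11, 13, 15, 17, 19]

# Output elements whose source position is exactly an integer.
on_boundary = [o for o, (pos, _) in enumerate(positions) if pos == int(pos)]
print(on_boundary)  # [5]
```

Assuming the CPU result matches this reference, the difference of 1 at index 5 in the repro would mean MPS returned 9 there instead of 10.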

### Versions

PyTorch version: 2.5.0a0+gitc19005d
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 14.6.1 (arm64)
GCC version: Could not collect
Clang version: 15.0.0 (clang-1500.3.9.4)
CMake version: version 3.30.1
Libc version: N/A

Python version: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:13:44) [Clang 16.0.6 ] (64-bit runtime)
Python platform: macOS-14.6.1-arm64-arm-64bit
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Apple M3 Max

Versions of relevant libraries:
[pip3] flake8==6.1.0
[pip3] flake8-bugbear==23.3.23
[pip3] flake8-comprehensions==3.15.0
[pip3] flake8-executable==2.1.3
[pip3] flake8-logging-format==0.9.0
[pip3] flake8-pyi==23.3.1
[pip3] flake8-simplify==0.19.3
[pip3] mypy==1.10.0
[pip3] mypy-extensions==1.0.0
[pip3] numpy==2.0.1
[pip3] optree==0.12.1
[pip3] torch==2.5.0a0+gitc19005d
[pip3] torch-tb-profiler==0.4.3
[pip3] torchvision==0.20.0a0+0d80848
[pip3] triton==3.0.0
[conda] numpy                     2.0.1                    pypi_0    pypi
[conda] optree                    0.12.1                   pypi_0    pypi
[conda] torch                     2.5.0a0+gitc19005d           dev_0    <develop>
[conda] torch-tb-profiler         0.4.3                    pypi_0    pypi
[conda] torchfix                  0.4.0                    pypi_0    pypi
[conda] torchvision               0.20.0a0+0d80848           dev_0    <develop>
[conda] triton                    3.0.0                    pypi_0    pypi

cc @kulinseth @albanD @malfet @DenisVieriu97 @jhavukainen

    # compute expected output as scikit-image/scipy
    expected_out = torch.zeros(osize, dtype=torch.float).unsqueeze(0).unsqueeze(0)
    scale = 1.0 * isize / osize
    for o in range(osize):
        i_f32 = (o + 0.5) * scale
        i = int(i_f32)
        expected_out[0, 0, o] = in_t[0, 0, i]

    # compute expected output as Scikit-Image/Scipy
    expected_out = torch.zeros(1, 1, osize, osize, dtype=torch.float)
    scale = 1.0 * isize / osize
    for o1 in range(osize):
        i1_f32 = (o1 + 0.5) * scale
        i1 = int(i1_f32)
        for o2 in range(osize):
            i2_f32 = (o2 + 0.5) * scale
            i2 = int(i2_f32)
            expected_out[0, 0, o1, o2] = in_t[0, 0, i1, i2]

[MPS] `F.interpolate` with `mode="nearest-exact"` differs from expected output #134430