# Auto-convert GPU arrays that support the `__cuda_array_interface__` protocol #20584
## Conversation
**ezyang** left a comment
Logically, the code looks great! However, I would like the memory leaks to be fixed before merge. If you insist, manually inserting the necessary decrefs is acceptable; however, I think using an RAII class from pybind11 will be much safer.
Thanks, I will use pybind11 to fix the memory leaks and also address the other issues.
Now we only use `PyArray_DescrConverter()`
`torch/csrc/utils/tensor_numpy.cpp` (Outdated)

```cpp
if (!PyTuple_Check(py_data) || PyTuple_GET_SIZE(py_data) != 2) {
  throw TypeError("`data` must be a 2-tuple of (int, bool)");
}
PyTuple_GET_ITEM(py_data, 0);
```
I don't think you intended to have this line?
**ezyang** left a comment
Will merge when tests pass.
**facebook-github-bot** left a comment
@ezyang is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
FYI @seibert, thought you'd like to know that PyTorch now supports the `__cuda_array_interface__` protocol.
This PR implements auto-conversion of GPU arrays that support the `__cuda_array_interface__` protocol (fixes #15601). If an object exposes the `__cuda_array_interface__` attribute, `torch.as_tensor()` and `torch.tensor()` will use the exposed device memory.

- **Zero-copy** when using `torch.as_tensor(..., device=D)` where `D` is the same device as the one used in `__cuda_array_interface__`.
- **Implicit copy** when using `torch.as_tensor(..., device=D)` where `D` is the CPU or another non-CUDA device.
- **Explicit copy** when using `torch.tensor()`.
- **Exception** when using `torch.as_tensor(..., device=D)` where `D` is a CUDA device different from the one used in `__cuda_array_interface__`.
- **Lifetime**: `torch.as_tensor(obj)` grabs a reference to `obj` so that the lifetime of `obj` exceeds the lifetime of the tensor.
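To make the protocol concrete, here is a minimal sketch of a producer object exposing `__cuda_array_interface__`, showing the dictionary shape a consumer such as `torch.as_tensor()` reads. This is an illustrative assumption, not PyTorch code: the class name `FakeGpuArray` is made up, the "device pointer" actually comes from a host-side `ctypes` buffer so the example runs without a GPU, and interface version 2 is assumed.

```python
import ctypes

class FakeGpuArray:
    """Illustrative producer of the CUDA Array Interface (hypothetical class).

    The 'data' pointer here refers to host memory allocated via ctypes,
    standing in for a real device pointer, so the dict layout can be
    inspected without CUDA hardware.
    """
    def __init__(self, shape, typestr="<f4"):
        n = 1
        for d in shape:
            n *= d
        self._buf = (ctypes.c_float * n)()  # stand-in for device memory
        self.shape = shape
        self.typestr = typestr  # NumPy-style type string, e.g. little-endian float32

    @property
    def __cuda_array_interface__(self):
        return {
            "shape": self.shape,                            # tuple of ints
            "typestr": self.typestr,                        # element type
            "data": (ctypes.addressof(self._buf), False),   # (pointer, read_only)
            "version": 2,                                   # interface version (assumed)
            # "strides" omitted -> array is C-contiguous
        }

arr = FakeGpuArray((2, 3))
iface = arr.__cuda_array_interface__
print(sorted(iface))  # -> ['data', 'shape', 'typestr', 'version']
```

With a real GPU array (e.g. from Numba or CuPy), `torch.as_tensor(arr, device="cuda")` would wrap the memory behind `iface["data"][0]` zero-copy, per the behavior described above.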