Conversation

@beam2d beam2d commented Aug 21, 2019

When converting a contiguous CuPy ndarray to Tensor via `__cuda_array_interface__`, an error occurs due to incorrect handling of default strides. This PR fixes the problem, making `torch.tensor(cupy_ndarray)` work for contiguous inputs.

@pytorchbot pytorchbot added the module: internals (Related to internal abstractions in c10 and ATen) and module: numpy (Related to numpy support, and also numpy compatibility of our operators) labels Aug 21, 2019
@soumith soumith requested a review from gchanan August 23, 2019 03:59
gchanan commented Aug 23, 2019

can you provide a reproduction?

beam2d commented Aug 28, 2019

OK, here is a simple script that reproduces the error.

import cupy
import torch

a = cupy.ones(3)
torch.tensor(a)

This script gives the following error.

Traceback (most recent call last):
  File "foo.py", line 5, in <module>
    torch.tensor(a)
ValueError: given array strides not a multiple of the element byte size. Make a copy of the array to reallocate the memory.
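For context, the failure comes from how default strides are handled for a contiguous producer array. A consumer that receives no byte strides must derive C-contiguous strides from the shape and item size itself; a minimal sketch of that derivation (the helper name is mine, not PyTorch code):

```python
def default_strides(shape, itemsize):
    """Byte strides of a C-contiguous array, as a consumer of
    __cuda_array_interface__ must derive them when 'strides' is
    absent or None. Illustrative helper, not PyTorch's implementation."""
    strides = []
    stride = itemsize
    for dim in reversed(shape):
        strides.append(stride)
        stride *= dim
    return tuple(reversed(strides))
```

For the 1-D float64 array in the script above, this derivation would yield `(8,)`.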

leofang commented Oct 1, 2019

@beam2d @gchanan @madsbk @asford The updated __cuda_array_interface__ protocol (v2) numba/numba#4609 is relevant to this PR.

madsbk commented Oct 2, 2019

> @beam2d @gchanan @madsbk @asford The updated `__cuda_array_interface__` protocol (v2) numba/numba#4609 is relevant to this PR.

I don't see any problems with the new protocol, but we should throw a more descriptive error message if `data` is zero: https://github.com/pytorch/pytorch/blob/b93fa6b970cc2e12cfeb8cd1a51dce2d95d797de/torch/csrc/utils/tensor_numpy.cpp#L273

However, the Numba integration test might fail because of a version mismatch. I suggest that we remove or change this line:

self.assertEqual(ar_dict["version"], 1)
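One hedged way to relax that assertion (the names here are illustrative; the actual test change may differ) is to accept any protocol version the consumer supports rather than pinning version 1:

```python
def assert_supported_version(interface, supported=(1, 2)):
    """Accept any known __cuda_array_interface__ version instead of
    requiring exactly version 1. Illustrative sketch only."""
    version = interface.get("version")
    if version not in supported:
        raise AssertionError(
            f"unsupported __cuda_array_interface__ version: {version}")
```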

Finally, there is a typo here: https://github.com/pytorch/pytorch/blob/b93fa6b970cc2e12cfeb8cd1a51dce2d95d797de/torch/csrc/utils/tensor_numpy.cpp#L266
It should say:

throw TypeError("attribute `data` must exist"); 

leofang commented Oct 2, 2019

> I don't see any problems with the new protocol

@madsbk Note that in the new protocol, omitting the 'strides' keyword or setting it to None indicates that the array is C-contiguous. So, this line should be changed:

strides = tuple(s * itemsize for s in self.stride())

which currently always spells out the strides regardless of contiguity.
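Sketching what a v2-compliant exporter might look like, with a stand-in class (FakeTensor, its fields, and the fixed '<f8' typestr are illustrative assumptions, not PyTorch's actual code):

```python
class FakeTensor:
    """Minimal stand-in for a GPU tensor, used only to illustrate the
    v2 export rule; not PyTorch's implementation."""

    def __init__(self, shape, itemsize, ptr, contiguous=True):
        self.shape = shape
        self.itemsize = itemsize
        self.ptr = ptr
        self.contiguous = contiguous

    def stride(self):
        # Element (not byte) strides assuming C order, illustrative only.
        strides, step = [], 1
        for dim in reversed(self.shape):
            strides.append(step)
            step *= dim
        return tuple(reversed(strides))

    def is_contiguous(self):
        return self.contiguous

    @property
    def __cuda_array_interface__(self):
        # v2 protocol: strides is None for C-contiguous arrays; only
        # non-contiguous arrays spell out byte strides explicitly.
        if self.is_contiguous():
            strides = None
        else:
            strides = tuple(s * self.itemsize for s in self.stride())
        return {
            "shape": tuple(self.shape),
            "typestr": "<f8",
            "data": (self.ptr, False),
            "strides": strides,
            "version": 2,
        }
```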

madsbk commented Oct 2, 2019

@leofang right!
@beam2d can you include the protocol changes in this PR? or do you want me to make a new PR?

beam2d commented Oct 3, 2019

Let me try to make it follow the new protocol.

@beam2d beam2d force-pushed the fix-cuda_array_interface-without-strides branch from b93fa6b to 73d64b1 Compare October 3, 2019 08:39
@beam2d beam2d force-pushed the fix-cuda_array_interface-without-strides branch from 73d64b1 to a4d747e Compare October 3, 2019 08:40
leofang commented Oct 9, 2019

By the way, it might be necessary for tensor_from_cuda_array_interface() to reject any array that comes with a mask (introduced in the v1 protocol). Numba and CuPy currently do this, but I know nothing about PyTorch, so I could be wrong.
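A consumer-side guard of the kind described here might look like the following sketch (the function name and error text are my assumptions, not PyTorch's actual tensor_from_cuda_array_interface()):

```python
def reject_masked(interface):
    """Refuse masked arrays, as Numba and CuPy do when consuming
    __cuda_array_interface__. Illustrative sketch only."""
    if interface.get("mask") is not None:
        raise ValueError(
            "arrays with a 'mask' in __cuda_array_interface__ are not supported")
    return interface
```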

@zou3519 zou3519 added the triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) label Oct 10, 2019
leofang commented Oct 31, 2019

This PR is needed to work with Numba 0.46.0 and CuPy v7, see cupy/cupy#2589 (comment). Can someone take a look?

gchanan commented Oct 31, 2019

is it possible to add a test?

gchanan commented Oct 31, 2019

@leofang I commented over in numba/numba#4175 -- I think the standard is currently ambiguous when it wasn't before.


shape = tuple(self.shape)
strides = tuple(s * itemsize for s in self.stride())
if self.is_contiguous():
it would be nice to comment whether this is an optimization or part of the protocol, but not strictly necessary.

Added a comment based on my current understanding, but it looks like discussions are still ongoing at numba/numba#4175, so I will adapt it once everyone reaches a consensus :).

if self.is_contiguous():
    strides = None
else:
    strides = tuple(s * itemsize for s in self.stride())

according to the protocol, we need to zero out data here if numel is 0.

@leofang leofang Oct 31, 2019

I don't get it. What's wrong here? It looks fine to me.


the line below should be:
data = (self.data_ptr() if self.numel() > 0 else 0, False) # read-only is false
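To illustrate the rule gchanan points out, the 'data' entry can be built with the device pointer zeroed when the tensor is empty; the helper name below is mine, and it simply mirrors the suggested one-liner:

```python
def export_data_field(data_ptr, numel):
    """Build the 'data' entry of __cuda_array_interface__: per the
    protocol, the pointer must be 0 when the array has no elements.
    Illustrative helper mirroring the suggested fix."""
    return (data_ptr if numel > 0 else 0, False)  # read-only is False
```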


Oh, good catch @gchanan! (I knew nothing about numel(); I just looked it up in the docs.) In fact I also caught this myself for CuPy just yesterday...


Thanks, I fixed it.

beam2d commented Nov 1, 2019

> is it possible to add a test?

I fixed the existing CUDA array interface test. Or do you mean adding a test for a case not covered by the existing ones (e.g., the non-contiguous case)?

gchanan commented Nov 5, 2019

I meant a direct test of cupy -- but if that's too difficult (dealing with dependencies and such), don't worry about it.

@leofang leofang left a comment

LGTM, thanks @beam2d!

Hi @gchanan, could you please press the green button to merge this PR? This is necessary for resolving cupy/cupy#2589. It has been a long-standing bug on the PyTorch side even before the protocol was updated to v2, so I suggest we merge this and move on. We can always revisit our joyful discussion later 🙂 Thank you.

leofang commented Dec 5, 2019

@madsbk @gchanan @ezyang Pinging you guys in case this PR is left unattended. It's already approved. Can any of you merge this please?

@ezyang ezyang self-requested a review December 5, 2019 21:33
@ezyang ezyang left a comment

getting this in my queue

@facebook-github-bot facebook-github-bot left a comment

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


leofang commented Dec 6, 2019

Thank you very much, @ezyang for taking care of this, @beam2d for fixing the bug, and everyone for participating in the discussion!

@jakirkham

This is probably premature, but do we have a sense of when the next release will be? I'm guessing not for a while. Even a very rough answer would be useful here. Thanks again for everyone's work on this issue. 🙂

soumith commented Dec 6, 2019

@jakirkham v1.4.0 is aimed for Jan 7th

leofang commented Dec 6, 2019

@soumith Thank you for sharing this info with us. I just requested that this patch go into v1.4.0.


@ezyang merged this pull request in 1d7b40f.

gchanan added a commit to gchanan/pytorch that referenced this pull request Dec 18, 2019
… test.

This is a simpler fix than pytorch#24947, which both fixed the bug and updated the protocol version.
This also adds a test (which the previous PR did not).

So the plan is that master (1.5) will have the new protocol version (and a test), 1.4 will have the old protocol version and the test.
gchanan added a commit that referenced this pull request Dec 19, 2019
The PR that fixed this, #24947, didn't add a test.

Fixes: #31443

[ghstack-poisoned]
gchanan added a commit that referenced this pull request Dec 19, 2019
The PR that fixed this, #24947, didn't add a test.

Fixes: #31443

ghstack-source-id: ad6237c
Pull Request resolved: #31451
mingbowan pushed a commit that referenced this pull request Dec 20, 2019
… test. (#31450)

This is a simpler fix than #24947, which both fixed the bug and updated the protocol version.
This also adds a test (which the previous PR did not).

So the plan is that master (1.5) will have the new protocol version (and a test), 1.4 will have the old protocol version and the test.
gchanan added a commit that referenced this pull request Dec 20, 2019
gchanan added a commit that referenced this pull request Dec 20, 2019
The PR that fixed this, #24947, didn't add a test.

Fixes: #31443

ghstack-source-id: 2d59503
Pull Request resolved: #31451
facebook-github-bot pushed a commit that referenced this pull request Dec 20, 2019
Summary:
Pull Request resolved: #31451

The PR that fixed this, #24947, didn't add a test.

Fixes: #31443

Test Plan: Imported from OSS

Differential Revision: D19170020

Pulled By: gchanan

fbshipit-source-id: bdbf09989ac8a61b1b70bb1ddee103caa8ef435b
wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
Summary:
When converting a contiguous CuPy ndarray to Tensor via `__cuda_array_interface__`, an error occurs due to incorrect handling of default strides. This PR fixes this problem. It makes `torch.tensor(cupy_ndarray)` works for contiguous inputs.
Pull Request resolved: pytorch#24947

Differential Revision: D18838986

Pulled By: ezyang

fbshipit-source-id: 2d827578f54ea22836037fe9ea8735b99f2efb42
wuhuikx pushed a commit to wuhuikx/pytorch that referenced this pull request Jan 30, 2020
)

Summary:
Pull Request resolved: pytorch#31451

The PR that fixed this, pytorch#24947, didn't add a test.

Fixes: pytorch#31443

Test Plan: Imported from OSS

Differential Revision: D19170020

Pulled By: gchanan

fbshipit-source-id: bdbf09989ac8a61b1b70bb1ddee103caa8ef435b

Labels

Merged · module: internals · module: numba · module: numpy · open source · triaged