Conversation
I had started with "could backport this", but I am not sure, as it isn't trivial. Anyway, it still needs release notes, and all that would change is whether we write 2.1 or 2.2 in a few places.
```c
{
    /* an exception may be in-flight, we must save it in case we create another one */
    PyObject *type, *value, *traceback;
    PyErr_Fetch(&type, &value, &traceback);
```
This was always unnecessary; I just missed that it was also here when removing it in the version above...
```c
// ensure alignment
int offset = sizeof(DLManagedTensor) % sizeof(void *);
```
This offset calculation was just wrong. It simply didn't matter, because in practice the struct size is a multiple of `sizeof(void *)` anyway.
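For reference, the usual way to compute an aligned offset is to round the struct size up to the next multiple of the alignment; the bare `%` in the snippet above only gives the overshoot past a boundary. A minimal sketch of that arithmetic (an illustration, not the code the PR ended up with):

```python
def aligned_offset(size, align):
    """Round `size` up to the next multiple of `align`.

    `size % align` is how far the struct overshoots an alignment
    boundary, not the padded offset where the next field may start.
    """
    return size + (-size % align)

# e.g. a 52-byte struct padded out for 8-byte pointer alignment:
print(aligned_offset(52, 8))  # -> 56
print(aligned_offset(48, 8))  # -> 48 (already aligned, no padding needed)
```

When the struct size is already a multiple of the alignment, both formulas happen to give a usable answer, which is why the wrong version went unnoticed in practice.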
numpy/_core/src/multiarray/dlpack.c
```c
 * can then be removed again.
 */
PyObject *res = create_dlpack_capsule(
        self, major_version >= 1, &result_device);
```
Maybe to explain: `result_device` is already checked for validity during argument parsing, so it's OK to just adopt it.
No matter what device we have, if NumPy supports it, we can re-export it as CPU.
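As a concrete illustration of the device handling (observable behavior of released NumPy, not something new in this diff), NumPy arrays always report the CPU device pair from the protocol:

```python
import numpy as np

x = np.arange(3)
# __dlpack_device__ returns (device_type, device_id);
# device_type 1 is kDLCPU in the DLPack device enum.
print(x.__dlpack_device__())  # -> (1, 0)
```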
I think the failing test is fixed in #26503
mattip left a comment:
This could use a release note.
It seems the major change is supporting both DLManagedTensor and DLManagedTensorVersioned, but then I am confused since the DLManagedTensorVersioned struct already existed. Could you explain what is going on to help reviewers?
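To help untangle this: the pre-1.0 capsule carries a bare DLManagedTensor, while DLPack 1.0 wraps the same tensor data in DLManagedTensorVersioned, which prefixes a version (so consumers can check compatibility before reading anything else) and adds a flags word. A rough ctypes sketch of the two layouts (field types abridged; the real dl_tensor member is an inline DLTensor struct, not a pointer):

```python
import ctypes

class DLPackVersion(ctypes.Structure):
    _fields_ = [("major", ctypes.c_uint32), ("minor", ctypes.c_uint32)]

# Pre-1.0 capsule payload, exported under the name "dltensor":
class DLManagedTensor(ctypes.Structure):
    _fields_ = [
        ("dl_tensor", ctypes.c_void_p),   # stand-in; really an inline DLTensor
        ("manager_ctx", ctypes.c_void_p),
        ("deleter", ctypes.c_void_p),
    ]

# 1.0 payload, exported under "dltensor_versioned":
# the version comes first, and a flags word (e.g. read-only) is added.
class DLManagedTensorVersioned(ctypes.Structure):
    _fields_ = [
        ("version", DLPackVersion),
        ("manager_ctx", ctypes.c_void_p),
        ("deleter", ctypes.c_void_p),
        ("flags", ctypes.c_uint64),
        ("dl_tensor", ctypes.c_void_p),   # stand-in; really an inline DLTensor
    ]

# The version prefix sits at offset 0, so it is safe to inspect first:
print(DLManagedTensorVersioned.version.offset)  # -> 0
```

So the struct itself did exist in the vendored header before; what is new in this PR is that NumPy actually produces and consumes it.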
```diff
  *
  * Note that as of Nov 2021, multiply libraries (CuPy, PyTorch, TensorFlow,
- * TVM, perhaps others) do not adhere to this 256 byte aligment requirement
+ * TVM, perhaps others) do not adhere to this 256 byte alignment requirement
```
I changed it, but since it is vendored, I don't think it matters.
I suspect someone copied over a version of the header that already included it just to add …
mattip left a comment:
Seems to work. It is a bit difficult to review because of the refactoring, but I think it all makes sense. I do wonder a bit about the GIL being needed in the various deleter functions, since they can be called by non-Python code.
```diff
 // Taken from:
-// https://github.com/dmlc/dlpack/blob/ca4d00ad3e2e0f410eeab3264d21b8a39397f362/include/dlpack/dlpack.h
+// https://github.com/dmlc/dlpack/blob/bbd2f4d32427e548797929af08cfe2a9cbb3cf12/include/dlpack/dlpack.h
 // but added typedef to DLManagedTensorVersioned
```
Probably, I mentioned it in an issue there.
```c
    (DLManagedTensorVersioned *)PyCapsule_GetPointer(
            self, NPY_DLPACK_VERSIONED_CAPSULE_NAME);
    if (managed == NULL) {
        PyErr_WriteUnraisable(self);
```
No, a capsule is a Python object, so it can't be deleted without the GIL. It is just the DLManagedTensor deleter that needs to grab the GIL.
```diff
         "Give a warning on reload and big warning in sub-interpreters."},
     {"from_dlpack", (PyCFunction)from_dlpack,
-        METH_O, NULL},
+        METH_FASTCALL | METH_KEYWORDS, NULL},
```
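For context, switching from_dlpack from METH_O to METH_FASTCALL | METH_KEYWORDS is what allows it to accept keyword arguments (device=, copy=) at all. A minimal round trip through the protocol (the new keywords require a NumPy version with DLPack 1.0 support; the plain call below works on older releases too):

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)
# Any object exposing __dlpack__/__dlpack_device__ can be consumed:
y = np.from_dlpack(x)
print(np.shares_memory(x, y))  # -> True (zero-copy by default)
```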
```python
with pytest.raises(RuntimeError):
    x.__dlpack__(stream=1)
```
Strange, yeah; it seems like I copied it but then didn't adapt it.
Fixed. The copy= parsing is too forgiving, IMO, but I opened an issue on that. It doesn't matter here.
Forgot to set the …
Thanks @seberg
This implements DLPack version 1 to close gh-26411.
Note that I had first tried to use C++ templating to avoid the duplication, but in the end the additional complexity of it didn't really seem worthwhile compared to duplicating the few functions.
Although a lot is just moved around, this is an unfortunately large diff. The new and old code paths should be (mostly) tested, but the nature of DLPack is that it is hard to test interaction with other libraries (e.g. GPU failure paths).
I decided to always pass device/copy as None to simplify the call sites. It also fixes a few smaller issues.
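A small illustration of the producer side (behavior of NumPy releases with DLPack support; the versioned "dltensor_versioned" capsule additionally requires the consumer to request a max_version): calling __dlpack__ with no arguments yields the legacy capsule object:

```python
import numpy as np

x = np.ones(3)
# With no max_version requested, the legacy "dltensor" capsule is produced.
cap = x.__dlpack__()
print(type(cap).__name__)  # -> PyCapsule
```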