Add Tensor._make_wrapper_subclass #65340
Conversation
I thought about a few possible ways of doing this. The main hazard is that if I create a CPU tensor that doesn't have any real storage, the moment I actually try to access the data on the tensor I will segfault. So I don't want to use _make_subclass on a "CPU meta tensor", because the CPU meta tensor (with no subclass) is radioactive: printing it will immediately cause a segfault. So instead, I have to create the CPU meta tensor AND the subclass all in one go, and that means I need another function for it. One downside to doing it this way is that I need another overload for explicit strides, and in general it is difficult to get the view relationships to all work out properly; tracked at #65339.

Fixes #62972
Fixes #62730

Signed-off-by: Edward Z. Yang <[email protected]>
Differential Revision: [D31057231](https://our.internmc.facebook.com/intern/diff/D31057231)
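For readers skimming the thread, here is a minimal sketch of the intended usage, modeled on the LoggingTensor test snippet quoted later in this conversation. The `WrapperTensor` class and its `elem` attribute are illustrative names, not part of the API:

```python
import torch

class WrapperTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, elem):
        # Create the subclass instance with no real storage, all in
        # one go; it only advertises the metadata (size, dtype,
        # layout, device) of the tensor it wraps.
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(),
            dtype=elem.dtype, layout=elem.layout,
            device=elem.device, requires_grad=elem.requires_grad,
        )
        r.elem = elem  # stash the real tensor on the wrapper
        return r
```

In practice such a subclass also defines `__torch_dispatch__`, so operations on the wrapper are intercepted and forwarded to `elem` rather than ever touching the wrapper's (null) data pointer, which is exactly the hazard the description above is working around.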
albanD
left a comment
What are the properties of the Tensor created by for_blob?
Can we make sure that we just raise an error if we go below the Python key with it?
```cpp
});
ParsedArgs<8> parsed_args{};
auto r = parser.parse(args, kwargs, parsed_args);
// TODO: handle has_torch_function, maybe?
```
There is no Tensor input here, right? That would be for Mode?
Yes. Also, if there's a hypothetical version of this which takes a Tensor and reads attributes directly from the Tensor, maybe this should also be overrideable? I'm not exactly sure what the use case for any of this is; it's mostly an argument from symmetry.
I guess technically torch function isn't getting a mode so we can't override this anyway lol
```cpp
// don't bother releasing GIL here, as we are not allocating any nontrivial
// data
auto data = at::for_blob(nullptr, r.intlist(1))
```
Is there any doc anywhere about what for_blob is? That sounds very interesting!
Yes, go read ATen/templates/Functions.h.
It's non-resizable (this may be a mistake, but we can fix it later), the data pointer is null, and we have no deleter for it. Or is there something else you're thinking of?
Not easily. We could add another dispatch key below Python to catch such cases (if I had spare dispatch keys), but I can't straightforwardly remove the CPU dispatch key as this will cause certain type tests to work incorrectly.
Sounds good!
albanD
left a comment
LGTM!
```python
# The wrapping tensor (LoggingTensor) shouldn't hold any
# memory for the class in question, but it should still
# advertise the same device as before
r = torch.Tensor._make_wrapper_subclass(
    cls, elem.size(),
    # TODO: clone strides and storage aliasing
    dtype=elem.dtype, layout=elem.layout,
    device=elem.device, requires_grad=elem.requires_grad
)
```
What happens if someone calls .storage() on a LoggingTensor now?
You get a storage, but its data pointer is a null pointer and alas, you can probably induce a segfault this way. Good catch.
Indeed it segfaults:

```python
>>> class A(torch.Tensor):
...     def __torch_dispatch__(): pass
...
>>> torch.Tensor._make_wrapper_subclass(A, (2,)).storage()[0]
Segmentation fault (core dumped)
```

I checked whether meta tensors have this problem, and they just error out when returning their storage, so I'll do the same here as a patch-up.
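With that patch-up, the same access raises a Python exception instead of crashing the interpreter. Schematically (the exact error type and message come from the check added in this PR, quoted further down; they are not reproduced here):

```python
>>> t = torch.Tensor._make_wrapper_subclass(A, (2,))
>>> t.storage()  # post-fix: raises a Python exception rather than
>>> # dereferencing the null data pointer and segfaulting
```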
zou3519
left a comment
This is cool! Will follow along with strides and view relationship handling (the lack of view relationship handling has been causing internal asserts with functorch on debug builds)
```cpp
  END_HANDLE_TH_ERRORS
}

static PyObject* THPVariable_make_wrapper_subclass(PyObject* _ignored, PyObject* args, PyObject* kwargs) {
```
Nit: write it as follows to avoid unused-variable warnings.
Suggested change:

```diff
-static PyObject* THPVariable_make_wrapper_subclass(PyObject* _ignored, PyObject* args, PyObject* kwargs) {
+static PyObject* THPVariable_make_wrapper_subclass(PyObject*, PyObject* args, PyObject* kwargs) {
```
```cpp
if (storage.device_type() == at::DeviceType::Meta) {
  TORCH_CHECK_NOT_IMPLEMENTED(false, "python bindings for meta storage objects not supported");
}
if (storage.data() == nullptr && storage.nbytes() != 0) {
```
We could instead use set_storage_access_should_throw (lines 2342 to 2344 in 760aefd):

```cpp
void set_storage_access_should_throw() {
  storage_access_should_throw_ = true;
}
```
The thing is, per #65339, in the long term we do want storage represented adequately.