[Easy] make copy constructed tensor a leaf variable when using torch.tensor(sourceTensor) #11061
Conversation
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@ezyang can I get a review on this easy PR?
torch/csrc/utils/tensor_new.cpp
Outdated
ezyang
left a comment
Read-only ops should not mutate.
I would also say that the semantics are slightly weird. The output variable depends on the data of the input variable, and so there's no reason not to differentiate through this...
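The tension here can be sketched in plain PyTorch (nothing PR-specific; `clone()` keeps the autograd link, while `detach()` severs it):

```python
import torch

x = torch.ones(3, requires_grad=True)

# A copy that stays connected to the autograd graph:
y = x.clone()            # clone() is differentiable
y.sum().backward()
print(x.grad)            # gradients flow back to x: tensor([1., 1., 1.])

# A copy cut off from the graph -- a leaf that does not require grad:
z = x.clone().detach()
print(z.requires_grad)   # False
print(z.is_leaf)         # True
```

The question under debate is which of these two behaviors `torch.tensor(sourceTensor)` should have.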
torch/csrc/utils/tensor_new.cpp
Outdated
@apaszke I would guess the copy constructed tensor should not be made differentiable through …

To figure out a good solution to this problem, we must ask ourselves: … We can see that in the bodies of … Based on this, this suggests two solutions: …

I think both approaches are reasonable. If you want to do (2), you'll need to write a custom … You'll have something similar, except that it calls …
Force-pushed 68e72fd to 04b32b3 (compare)
@ezyang I am picking the (2) approach here: …
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/csrc/utils/tensor_new.cpp
Outdated
I feel there are some semantic problems.
Basically, in my opinion (with concurrences from @gchanan and @apaszke), we think that torch.tensor(x) should be equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) should be equivalent to x.clone().detach().requires_grad_(True). The basic idea is that torch.tensor reads out "the data" from whatever it is passed, and constructs a leaf variable. This gives us the invariant that "for any argument x, torch.tensor(x) does not require grad, and torch.tensor(x, requires_grad=True) gives a leaf variable that requires grad."
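The invariant described above can be checked using only documented tensor methods; a minimal sketch:

```python
import torch

x = torch.randn(3, requires_grad=True)

# torch.tensor(x) is meant to be equivalent to x.clone().detach():
a = x.clone().detach()
assert a.is_leaf and not a.requires_grad

# torch.tensor(x, requires_grad=True) is meant to be equivalent to
# x.clone().detach().requires_grad_(True):
b = x.clone().detach().requires_grad_(True)
assert b.is_leaf and b.requires_grad

# In both cases the copy shares no autograd history with x,
# but holds the same data:
assert torch.equal(a, x)
```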
@ezyang Thanks a lot for the clarifications! Sorry I missed the default case. I will make changes so that: …
Force-pushed 04b32b3 to c040b6b (compare)
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/csrc/utils/tensor_new.cpp
Outdated
torch/csrc/utils/tensor_new.cpp
Outdated
test/test_torch.py
Outdated
ezyang
left a comment
I think it's still not right.
Also, can we update the docs for tensor for this case?
In general, we should discourage people from passing a tensor to …
@ezyang Thanks a lot for the comments! I will update the PR to make a tensor copy-constructed from a tensor always be a leaf variable, and set requires_grad according to the input args. I will also raise a Python warning inside this path, and update the docs accordingly.
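With this change in place, the copy-construction path should raise a Python `UserWarning`; a sketch of how that could be verified (assuming a build that contains this patch):

```python
import warnings
import torch

src = torch.randn(2)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    t = torch.tensor(src)            # copy-constructing from a tensor

# The warning steers users toward sourceTensor.clone().detach(),
# and the result is a leaf that does not require grad:
assert any(issubclass(w.category, UserWarning) for w in caught)
assert t.is_leaf and not t.requires_grad
```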
Force-pushed c040b6b to 02f1b22 (compare)
Force-pushed 02f1b22 to 6889608 (compare)
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/_torch_docs.py
Outdated
…or, and set requires_grad wrt to input args, 2. raise warnings for this path and update docs
Force-pushed 6889608 to 9c413bb (compare)
facebook-github-bot
left a comment
weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
apaszke
left a comment
I'm not sure if what we did here is correct.
```diff
-  return copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
+  auto new_tensor = copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
                           new_with_type_conversion(type_to_use, var, device_index);
+  new_tensor.detach_(); // making copy constructed tensor a leaf node
```
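The effect of the in-place `detach_()` call above can be sketched from the Python side:

```python
import torch

src = torch.randn(2, requires_grad=True)
t = src * 2          # produced by an op, so t is not a leaf
assert not t.is_leaf

t.detach_()          # in-place detach: cut t out of the autograd graph
assert t.is_leaf
assert not t.requires_grad
```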
```diff
+    When data is a tensor `x`, :func:`torch.tensor` reads out 'the data' from whatever it is passed,
+    and constructs a leaf variable. Therefore ``torch.tensor(x)`` is equivalent to ``x.clone().detach()``
+    and ``torch.tensor(x, requires_grad=True)`` is equivalent to ``x.clone().detach().requires_grad_(True)``.
+    The equivalents using ``clone()`` and ``detach()`` are recommended.
```
```diff
+  if (THPVariable_Check(data)) {
+    PyErr_WarnEx(PyExc_UserWarning,
+      "To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() "
```
```diff
-      type_inference)
+      type_inference,
+      args_requires_grad)
       .set_requires_grad(r.toBool(3));
```
Summary:
- fix PR pytorch#11061 by moving `detach_()` and `set_requires_grad()` to `torch.tensor_ctor()` and `tensor.new_tensor`, and also remove warnings and `args_requires_grad` from `internal_new_from_data`
- with this patch, the returned tensor from `tensor_ctor()` and `new_tensor` will be detached from the source tensor, with requires_grad set based on the input args
- `torch.as_tensor` retains its behavior as documented

gchanan apaszke
Pull Request resolved: pytorch#11815
Differential Revision: D9932713
Pulled By: weiyangfb
fbshipit-source-id: 4290cbc57bd449954faadc597c24169a7b2d8259
Summary: The earlier tests had around 80 warnings, and now there are 6 warnings; these are due to the JIT. The changes remove the wrapping of a Tensor by a Tensor constructor, which emits warnings due to the changes in #11061.

Pull Request resolved: #12038
Differential Revision: D10033392
Pulled By: apaszke
fbshipit-source-id: b1faf368e650d062d7983f9932511bee4702a893