[BC-breaking] Fix bugs in torch::tensor constructor #28523
Conversation
Could you walk through how, in the original code, …

This is what I think was going on in the previous code (please let me know if this is right or not): `{1}` becomes an `InitListTensor` of scalar 1
zou3519 left a comment:
Still reading, have some initial questions
New features:
1. Previously, `torch::tensor({true, false, true})` threw `"tensor_cpu" not implemented for 'Bool'`. After this PR, it produces the correct bool tensor, matching the Python API behavior (see the sketch after this list).
2. Tensors with zero-size dimensions are now supported, e.g. `torch::tensor({{}, {}})` produces a tensor with sizes `{2, 0}`, matching the Python API behavior.
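As a quick illustration of the two new behaviors (a minimal sketch; the printed values follow the behaviors claimed above):

```cpp
#include <torch/torch.h>
#include <iostream>

int main() {
  // Bool values no longer throw "tensor_cpu" not implemented for 'Bool':
  auto b = torch::tensor({true, false, true});
  std::cout << b.dtype() << "\n";  // expected: bool
  std::cout << b.sizes() << "\n";  // expected: [3]

  // Zero-size dimensions are now supported:
  auto e = torch::tensor({{}, {}});
  std::cout << e.sizes() << "\n";  // expected: [2, 0]
}
```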
Bug fixes:
1. Previously, `torch::tensor({{1}, {2}})` produced a tensor of sizes `{2}`. After this PR, it produces a tensor of sizes `{2, 1}`, matching the Python API behavior (see the sketch after this list).
2. Fixed semantics of `torch::tensor(1.1)`: it now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
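A minimal sketch of the fixed shapes (assumes the post-PR semantics described above):

```cpp
#include <torch/torch.h>
#include <cassert>

int main() {
  auto t = torch::tensor({{1}, {2}});
  assert(t.dim() == 2);             // sizes are now {2, 1} (previously {2})

  auto s = torch::tensor(1.1);
  assert(s.dim() == 0);             // now a 0-dim tensor (previously 1-dim)
  assert(s.item<double>() == 1.1);  // the scalar value is preserved
}
```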
BC-breaking changes:
1. `torch::tensor(1.1)` now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
Notes:
1. From now on, the behavior of `at::tensor(scalar_value)` (which produces a 1-dim tensor) will differ from `torch::tensor(scalar_value)` (which produces a 0-dim tensor). I will fix the behavior of `at::tensor(scalar_value)` in the next PR.
The motivation comes from fixing the "`torch::tensor({{1}, {2}})` gives tensor of wrong sizes" bug - in order to fix it, I have to create the templated `TensorDataContainer` class and move the handling of `at::ArrayRef` and `std::vector` into it. After such changes, support for bool values comes out of the box without extra effort, and support for tensors with zero-size dimensions only requires adding a default constructor for `TensorDataContainer`, so I added those two in this PR.
For the semantic change of `torch::tensor(1.1)`, it's actually more effort to preserve the original wrong behavior (i.e. we need to explicitly instantiate `TensorDataContainer<1>` and let its scalar constructor create a 1-D tensor). I think preserving the original wrong behavior doesn't give us much value, and since the above changes naturally fix the problem, we should just start using the right behavior instead.
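To illustrate the core idea (this is a hypothetical, simplified analogue of `TensorDataContainer`, not the actual implementation): a recursive type whose `std::initializer_list` constructor is preferred for braced-init-lists can record one leading dimension per nesting level, which is exactly what fixes the `{{1}, {2}}` sizes bug:

```cpp
#include <cstdint>
#include <initializer_list>
#include <iostream>
#include <vector>

// Hypothetical stand-in for TensorDataContainer (tracks sizes only; values dropped).
struct NestedData {
  NestedData(int64_t scalar) { (void)scalar; }  // scalar leaf: 0-dim
  NestedData(std::initializer_list<NestedData> list) {
    // For a braced-init-list, this constructor is always preferred over the
    // scalar one, so {{1}, {2}} recurses here instead of flattening.
    sizes_.push_back(static_cast<int64_t>(list.size()));
    if (list.size() > 0) {
      const auto& inner = list.begin()->sizes_;  // assumes rectangular input
      sizes_.insert(sizes_.end(), inner.begin(), inner.end());
    }
  }
  std::vector<int64_t> sizes_;
};

int main() {
  NestedData d = {{1}, {2}};
  for (auto s : d.sizes_) std::cout << s << " ";  // prints: 2 1
}
```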
I made a mistake in my original assessment, and here is what actually happened instead:
Before:
`auto a = MyFunction::apply(torch::tensor(6, torch::requires_grad()));`
`auto b = Reenter::apply(torch::tensor(9, torch::requires_grad()));`
After:
`auto a = MyFunction::apply(torch::tensor({6}, torch::dtype(torch::kFloat).requires_grad(true)));`
`auto b = Reenter::apply(torch::tensor({9}, torch::dtype(torch::kFloat).requires_grad(true)));`
I changed these tensors from 0-D to 1-D and explicitly passed the dtype, so that the original behavior of the tests is preserved.
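(For context, a minimal sketch of the 0-D vs. 1-D distinction under the post-PR semantics, assuming `#include <torch/torch.h>`:)

```cpp
auto a = torch::tensor(6);    // 0-dim after this PR: a.dim() == 0
auto b = torch::tensor({6});  // 1-dim: b.sizes() == {1}
```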
Before:
`return {torch::tensor(static_cast<int64_t>(index)),`
`        torch::tensor(static_cast<int64_t>(index))};`
After:
`return {torch::tensor({static_cast<int64_t>(index)}),`
`        torch::tensor({static_cast<int64_t>(index)})};`
I changed these tensors from 0-D to 1-D, so that the original behavior of the tests is preserved.
In `TEST_F(FunctionalTest, SoftMarginLossDefaultOptions)`:
Before:
`auto input = torch::tensor({2., 4., 1., 3.}, torch::requires_grad());`
After:
`auto input = torch::tensor({2., 4., 1., 3.}, torch::dtype(torch::kFloat).requires_grad(true));`
I explicitly passed the dtype to all `torch::tensor` calls that use options, so that the resulting tensor's dtype is always what we expect it to be.
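To illustrate why (a sketch assuming the post-PR dtype inference described in this PR; `#include <torch/torch.h>` assumed):

```cpp
// Without an explicit dtype, the braced-init-list's element type now drives
// the tensor's dtype (previously the result was always float):
auto x = torch::tensor({2., 4., 1., 3.}, torch::requires_grad());
// x.dtype() presumably becomes kDouble (double literals) after this PR

auto y = torch::tensor({2., 4., 1., 3.}, torch::dtype(torch::kFloat).requires_grad(true));
// y.dtype() is kFloat, preserving the tests' original expectation
```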
Context:
`void reset() override {}`
`torch::Tensor forward(torch::Tensor input) {`
Before:
`return torch::tensor(input.device().index());`
After:
`return torch::tensor({input.device().index()});`
I changed these tensors from 0-D to 1-D, so that the original behavior of the tests is preserved.
Snippet under discussion (from the dispatch over scalar types in the `TensorDataContainer` pretty printing):
`at::kHalf,`
`scalar_type_,`
`"TensorDataContainer_pretty_print_tensor_item", [&] {`
`  stream << tensor_[i].item<scalar_t>();`
Does `stream << tensor_[i].item()` work without dispatching?
`item()` requires a template type `T`, so I think we will need to pass a `scalar_t`. I also tried `stream << tensor_[i]`, but that prints
1.1
[ CPUFloatType{} ]
instead of `1.1`, which is probably not nice for the pretty print.
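For reference, a small sketch contrasting the two printing approaches discussed here (`#include <torch/torch.h>` and `<iostream>` assumed):

```cpp
torch::Tensor t = torch::tensor(1.1);
std::cout << t.item<double>() << "\n";  // prints just: 1.1
std::cout << t << "\n";                 // prints the value plus a type/size banner, e.g. [ CPUDoubleType{} ]
```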
zou3519 left a comment:
The approach looks correct to me. I had some minor comments on cosmetics and testing
@zou3519 Thanks so much for the reviews! I addressed all of them; please feel free to take another look when you have time :D
CircleCI build failures summary (as of commit ef6898b):
Here are the reasons each build failed. This comment was automatically generated by Dr. CI. Please report bugs/suggestions on the GitHub issue tracker.
Force-pushed from 9bc36f9 to ef6898b.
Stack from ghstack:
New features:
1. Previously, `torch::tensor({true, false, true})` threw `"tensor_cpu" not implemented for 'Bool'`. After this PR, it produces the correct bool tensor, matching the Python API behavior.
2. Tensors with zero-size dimensions are now supported, e.g. `torch::tensor({{}, {}})` produces a tensor with sizes `{2, 0}`, matching the Python API behavior.

BC-breaking bug fixes and changes:
1. Previously, `torch::tensor({{1}, {2}})` produced a tensor of sizes `{2}`. After this PR, it produces a tensor of sizes `{2, 1}`, matching the Python API behavior.
2. Fixed semantics of `torch::tensor(1.1)`: it now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
3. Passing a `std::initializer_list` (NOT a braced-init-list) to `torch::tensor` doesn't work anymore; the user should pass the equivalent braced-init-list to `torch::tensor` instead. For example, `torch::tensor(std::initializer_list<double>({1.1, 1.2}))` doesn't work anymore, and the user should do `torch::tensor({1.1, 1.2})`. (The reason for this change is that `TensorDataContainer` cannot have two constructors that take `std::initializer_list`, otherwise a value such as `{1.1, 1.2}` would be ambiguous: it could take the `std::initializer_list<TensorDataContainer>` constructor, or it could take the `std::initializer_list<double>` constructor.)
4. Previously, when passing a non-dtype `TensorOptions` to the `torch::tensor` constructor, it always produced a tensor of dtype `float`. After this PR, it produces tensors of different dtypes based on the dtype of the braced-init-list, matching the behavior of the no-options case.

Notes:
1. From now on, the behavior of `at::tensor(scalar_value)` (which produces a 1-dim tensor) will differ from `torch::tensor(scalar_value)` (which produces a 0-dim tensor). I will fix the behavior of `at::tensor(scalar_value)` in a follow-up PR.
2. From now on, the behavior of `at::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces a `float` tensor) will differ from `torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/))` (which produces an `int` tensor). I will fix this behavior of the `at::tensor` constructor in a follow-up PR.

Context for the changes in this PR:

The motivation comes from fixing the "`torch::tensor({{1}, {2}})` gives tensor of wrong sizes" bug: in order to fix it, I have to move the handling of `at::ArrayRef` and `std::vector` into `InitListTensor` (see below on why we need to do this) and rename `InitListTensor` to `TensorDataContainer`. After such changes, support for bool values comes out of the box without extra effort, and support for tensors with zero-size dimensions only requires adding a default constructor for `TensorDataContainer`, so I added those two in this PR.

For the semantic change of `torch::tensor(1.1)`, it's actually more effort to preserve the original wrong behavior (i.e. we would need to check the sizes of the tensor converted from `TensorDataContainer` and reshape any scalar tensor to a 1-D tensor). I think preserving the original wrong behavior doesn't give us much value, and since the above changes naturally fix the problem, we should just start using the right behavior instead.

For the "constructor with non-dtype options behavior" fix, the code looks simpler and easier to reason about with the fix, so I included it in this PR.

Why we need to move the handling of `at::ArrayRef` and `std::vector` into `TensorDataContainer`: `torch::tensor({{1}, {2}})` can match the function overload `torch::tensor(at::ArrayRef<int> values)`, because `{1}` and `{2}` can be treated as a list-initialization of an `int` value. However, this would produce a Tensor with sizes `{2}`, but we actually want a Tensor with sizes `{2, 1}`. In order to avoid matching this function overload, we removed the overload and moved the ability to convert `at::ArrayRef<T>` (and similarly `std::vector<T>`) into `TensorDataContainer`; since for a braced-init-list the `TensorDataContainer(std::initializer_list<TensorDataContainer>)` constructor is always preferred over all other constructors, it will take the `std::initializer_list` path, and all is good.

Differential Revision: D18234625
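To make the overload-matching pitfall above concrete, here is a minimal self-contained sketch using `std::vector<int>` as a stand-in for `at::ArrayRef<int>` (the function name is illustrative):

```cpp
#include <iostream>
#include <vector>

// Stand-in for the removed overload torch::tensor(at::ArrayRef<int> values):
void fake_tensor(std::vector<int> values) {
  std::cout << "flat list of size " << values.size() << "\n";
}

int main() {
  // {1} and {2} each list-initialize a plain int, so the nested structure is
  // silently flattened and we lose the intended {2, 1} shape:
  fake_tensor({{1}, {2}});  // prints: flat list of size 2
}
```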