
Conversation

yf225 (Contributor) commented Oct 23, 2019

Stack from ghstack:

New features:

  1. Previously, torch::tensor({true, false, true}) threw "tensor_cpu" not implemented for 'Bool'. After this PR, it produces the correct bool tensor, matching the Python API behavior.
  2. Tensors with zero-size dimensions are now supported, e.g. torch::tensor({{}, {}}) produces a tensor with sizes {2, 0}, matching the Python API behavior.
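
A minimal usage sketch of the two new behaviors (assuming libtorch is available; the asserts simply restate the sizes/dtypes described above):

#include <torch/torch.h>
#include <cassert>

int main() {
  // (1) bool braced-init-lists now produce a Bool tensor.
  auto b = torch::tensor({true, false, true});
  assert(b.scalar_type() == torch::kBool);

  // (2) zero-size dimensions are now supported.
  auto e = torch::tensor({{}, {}});
  assert(e.dim() == 2 && e.size(0) == 2 && e.size(1) == 0);
}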

BC-breaking bug fixes and changes:

  1. Previously, torch::tensor({{1}, {2}}) produced a tensor of sizes {2}. After this PR, it produces a tensor of sizes {2, 1}, matching the Python API behavior.
  2. Fixed semantics of torch::tensor(1.1): it now returns a 0-dim tensor instead of a 1-dim tensor, matching the Python API behavior.
  3. After this PR, passing a std::initializer_list (NOT a braced-init-list) to torch::tensor no longer works; the user should pass the equivalent braced-init-list instead. For example, torch::tensor(std::initializer_list<double>({1.1, 1.2})) no longer works, and the user should write torch::tensor({1.1, 1.2}). (The reason for this change is that TensorDataContainer cannot have two constructors that take std::initializer_list; otherwise a value such as {1.1, 1.2} would be ambiguous - it could take either the std::initializer_list<TensorDataContainer> constructor or the std::initializer_list<double> constructor.) A sketch of changes 1-3 appears after the dtype examples below.
  4. Previously, passing a non-dtype TensorOptions to the torch::tensor constructor always produced a tensor of dtype float. After this PR, the dtype is inferred from the braced-init-list, matching the behavior of the no-options case.
// Previously:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> float

// Now:
torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({{1, 2, 3}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> int
torch::tensor({1., 2., 3.}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double
torch::tensor({{1., 2., 3.}}, torch::TensorOptions(/*non-dtype-options*/)).dtype() -> double

// For comparison, the no-options behavior:
torch::tensor({1, 2, 3}).dtype() -> int
torch::tensor({{1, 2, 3}}).dtype() -> int
torch::tensor({1., 2., 3.}).dtype() -> double
torch::tensor({{1., 2., 3.}}).dtype() -> double
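
As referenced above, a minimal sketch of BC-breaking changes 1-3 (assuming libtorch; the asserts restate the sizes/semantics described in the list):

#include <torch/torch.h>
#include <cassert>

int main() {
  // Change 1: nested braced-init-lists keep their nesting.
  auto m = torch::tensor({{1}, {2}});
  assert(m.dim() == 2 && m.size(0) == 2 && m.size(1) == 1);  // previously sizes {2}

  // Change 2: a scalar value now produces a 0-dim tensor.
  auto s = torch::tensor(1.1);
  assert(s.dim() == 0);  // previously a 1-dim tensor

  // Change 3: pass braced-init-lists directly.
  auto t = torch::tensor({1.1, 1.2});                                    // OK
  // auto u = torch::tensor(std::initializer_list<double>({1.1, 1.2})); // no longer compiles
}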

Notes:

  1. From now on, the behavior of at::tensor(scalar_value) (which produces a 1-dim tensor) differs from torch::tensor(scalar_value) (which produces a 0-dim tensor). I will fix the behavior of at::tensor(scalar_value) in a follow-up PR.
  2. From now on, the behavior of at::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)) (which produces a float tensor) differs from torch::tensor({1, 2, 3}, torch::TensorOptions(/*non-dtype-options*/)) (which produces an int tensor). I will fix this behavior of the at::tensor constructor in a follow-up PR.

Context for the changes in this PR:

The motivation comes from fixing the "torch::tensor({{1}, {2}}) gives tensor of wrong sizes" bug - to fix it, I had to move the handling of at::ArrayRef and std::vector into InitListTensor (see below for why this is needed) and rename InitListTensor to TensorDataContainer. After these changes, support for bool values comes out of the box without extra effort, and support for tensors with zero-size dimensions only requires adding a default constructor to TensorDataContainer, so I added those two features in this PR.

For the semantic change of torch::tensor(1.1), it's actually more effort to preserve the original wrong behavior (i.e. we would need to check the sizes of the tensor converted from TensorDataContainer and reshape any scalar tensor to a 1-dim tensor). Preserving the original wrong behavior doesn't give us much value, and since the above changes naturally fix the problem, we should just adopt the correct behavior instead.

For the "constructor with non-dtype options behavior" fix, the code looks simpler and easier to reason about with the fix, so I included it in this PR.


Why we need to move the handling of at::ArrayRef and std::vector into TensorDataContainer:

torch::tensor({{1}, {2}}) can match the function overload torch::tensor(at::ArrayRef<int> values), because {1} and {2} can each be treated as list-initialization of an int value. However, this produces a Tensor with sizes {2}, while we actually want a Tensor with sizes {2, 1}. To avoid this, we removed that overload and moved the ability to convert at::ArrayRef<T> (and similarly std::vector<T>) into TensorDataContainer. Since for a braced-init-list the TensorDataContainer(std::initializer_list<TensorDataContainer>) constructor is always preferred over all other constructors, the call takes the std::initializer_list path, and all is good.
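
A standalone toy reproduction of the old overload match (plain C++, not the actual PyTorch code; the stand-in parameter models at::ArrayRef<int>, which is implicitly constructible from std::initializer_list<int>):

#include <initializer_list>
#include <iostream>

// Stand-in for the removed torch::tensor(at::ArrayRef<int>) overload.
void tensor(std::initializer_list<int> values) {
  std::cout << "flat list of " << values.size() << " ints\n";
}

int main() {
  // {1} and {2} each list-initialize a single int, so this call is
  // treated like tensor({1, 2}) and prints "flat list of 2 ints".
  tensor({{1}, {2}});
}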

Differential Revision: D18234625

yf225 added the module: bc-breaking (Related to a BC-breaking change) label Oct 23, 2019
yf225 changed the title from "Fix bugs in torch::tensor constructor" to "[BC-breaking] Fix bugs in torch::tensor constructor" Oct 23, 2019
Fix bugs in torch::tensor constructor (gh-metadata: pytorch pytorch 28523 gh/yf225/16/head)
yf225 requested a review from zou3519 October 23, 2019 20:48
zou3519 (Contributor) commented Oct 25, 2019

Could you walk through how in the original code, torch::tensor({{1}, {2}}) produced a tensor of size {2}?

zou3519 (Contributor) commented Oct 25, 2019

> Could you walk through how in the original code, torch::tensor({{1}, {2}}) produced a tensor of size {2}?

This is what I think was going on in the previous code (please let me know if this is right or not):

{1} becomes an InitListTensor of scalar 1
{2} becomes an InitListTensor of scalar 2
{ InitListTensor(1), InitListTensor(2) } becomes an InitListTensor with sizes [2]

zou3519 (Contributor) left a comment:

Still reading, have some initial questions.

yf225 added 7 commits October 26, 2019 01:15
Fix bugs in torch::tensor constructor (gh-metadata: pytorch pytorch 28523 gh/yf225/16/head)
yf225 added the module: cpp (Related to C++ API) label Oct 28, 2019
yf225 (Contributor, Author) commented Oct 28, 2019

> Could you walk through how in the original code, torch::tensor({{1}, {2}}) produced a tensor of size {2}?
>
> This is what I think was going on in the previous code (please let me know if this is right or not):
>
> {1} becomes an InitListTensor of scalar 1
> {2} becomes an InitListTensor of scalar 2
> { InitListTensor(1), InitListTensor(2) } becomes an InitListTensor with sizes [2]

I made a mistake in my original assessment, and here is what actually happened instead:

torch::tensor({{1}, {2}}) matches a previously existing function overload, torch::tensor(at::ArrayRef<int> values): the compiler looks at {1} and {2} as list-initializations of int values, and treats torch::tensor({{1}, {2}}) the same as torch::tensor({1, 2}), which produces a Tensor with sizes {2}.

yf225 requested a review from zou3519 October 28, 2019 22:01
- auto a = MyFunction::apply(torch::tensor(6, torch::requires_grad()));
- auto b = Reenter::apply(torch::tensor(9, torch::requires_grad()));
+ auto a = MyFunction::apply(torch::tensor({6}, torch::dtype(torch::kFloat).requires_grad(true)));
+ auto b = Reenter::apply(torch::tensor({9}, torch::dtype(torch::kFloat).requires_grad(true)));
yf225 (Contributor, Author) commented:

I changed these tensors from 0-D to 1-D and explicitly passed the dtype, so that the original behavior of the tests is preserved.

- return {torch::tensor(static_cast<int64_t>(index)),
-         torch::tensor(static_cast<int64_t>(index))};
+ return {torch::tensor({static_cast<int64_t>(index)}),
+         torch::tensor({static_cast<int64_t>(index)})};
yf225 (Contributor, Author) commented:

I changed these tensors from 0-D to 1-D, so that the original behavior of the tests is preserved.


  TEST_F(FunctionalTest, SoftMarginLossDefaultOptions) {
-   auto input = torch::tensor({2., 4., 1., 3.}, torch::requires_grad());
+   auto input = torch::tensor({2., 4., 1., 3.}, torch::dtype(torch::kFloat).requires_grad(true));
yf225 (Contributor, Author) commented:

I explicitly passed the dtype to all torch::tensor calls that use options, so that the resulting tensor's dtype is always what we expect it to be.

  void reset() override {}
  torch::Tensor forward(torch::Tensor input) {
-   return torch::tensor(input.device().index());
+   return torch::tensor({input.device().index()});
yf225 (Contributor, Author) commented:

I changed these tensors from 0-D to 1-D, so that the original behavior of the tests is preserved.

at::kHalf,
scalar_type_,
"TensorDataContainer_pretty_print_tensor_item", [&] {
stream << tensor_[i].item<scalar_t>();
zou3519 (Contributor) commented:

Does stream << tensor_[i].item() work without dispatching?

yf225 (Contributor, Author) commented:

item() requires a template type T, so I think we will need to pass a scalar_t. I also tried stream << tensor_[i], but that prints

1.1
[ CPUFloatType{} ]

instead of 1.1, which is probably not nice for the pretty print.
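
For illustration, a minimal standalone sketch of the difference being discussed (not the PR's code; the exact type banner printed depends on the tensor's dtype and device):

#include <torch/torch.h>
#include <iostream>

int main() {
  torch::Tensor t = torch::tensor(1.1);   // 0-dim double tensor after this PR
  std::cout << t << "\n";                 // full tensor repr: the value plus a banner like [ CPUDoubleType{} ]
  std::cout << t.item<double>() << "\n";  // just the bare value, 1.1 - what pretty-printing wants
}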

zou3519 (Contributor) left a comment:

The approach looks correct to me. I had some minor comments on cosmetics and testing.

Will Feng added 4 commits October 30, 2019 16:07
yf225 (Contributor, Author) commented Oct 30, 2019

@zou3519 Thanks so much for the reviews! I addressed all of them; please feel free to take another look when you have time :D

Will Feng added 2 commits October 30, 2019 17:54
Will Feng added 6 commits October 30, 2019 18:43
kostmo (Member) commented Oct 31, 2019

CircleCI build failures summary

As of commit ef6898b:

  • 0/2 flaky

This comment was automatically generated by Dr. CI.

facebook-github-bot (Contributor) commented:

@yf225 merged this pull request in 595209b.

facebook-github-bot deleted the gh/yf225/16/head branch November 4, 2019 15:15