
Conversation

@weiyangfb (Contributor) commented Aug 30, 2018

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=True)
>>> print(copy)
tensor([[-1.2001,  1.9869],
        [-1.0134,  1.3096]], grad_fn=<CopyBackwards>)

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=False)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source)
>>> print(copy)
tensor([[-0.7402,  0.0467],
        [ 0.4344, -0.0420]])

@weiyangfb changed the title from "mute grad when copy construct from a var with requires_grad=Fales" to "mute grad when copy construct from a var with requires_grad=False" Aug 30, 2018
@facebook-github-bot (Contributor) left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@weiyangfb changed the title from "mute grad when copy construct from a var with requires_grad=False" to "[Easy]mute grad when copy construct from a var with requires_grad=False" Sep 11, 2018
@weiyangfb added the "ready for review (this tag is deprecated)" label Sep 11, 2018
@weiyangfb (Contributor, Author) commented

@ezyang can I get a review on this easy PR?

@ezyang (Contributor) left a comment

Read-only ops should not mutate.

@apaszke (Contributor) commented Sep 12, 2018

I would also say that the semantics are slightly weird. The output variable depends on the data of the input variable, and so there's no reason not to differentiate through this...
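
For concreteness, under the behavior shown in the first example of the PR description (where the copy gets grad_fn=<CopyBackwards>), gradients do flow back to the source; a minimal sketch:

>>> source = torch.randn(2, 2, requires_grad=True)
>>> copy = torch.tensor(source, requires_grad=True)
>>> copy.sum().backward()
>>> source.grad    # populated through CopyBackwards
tensor([[1., 1.],
        [1., 1.]])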


@weiyangfb (Contributor, Author) commented

@apaszke I would guess the copy-constructed tensor should not be made differentiable through the copy op here. Please correct me if I am wrong, but I think the expected behavior should be the same as:

source = torch.randn(10, requires_grad=True)
copy = torch.tensor(source.data)
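
A quick check of that expectation (a sketch; source.data carries no autograd history, so the copy comes out detached):

>>> source = torch.randn(10, requires_grad=True)
>>> copy = torch.tensor(source.data)
>>> copy.requires_grad, copy.grad_fn
(False, None)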

@ezyang (Contributor) commented Sep 14, 2018

To figure out a good solution to this problem, we must ask ourselves:

  1. How is tensor() with a Tensor argument implemented today, and
  2. Why does this implementation "do the wrong thing" with respect to requires_grad?

We can see that tensor() dispatches to tensor_ctor, which dispatches to internal_new_from_data. Here are the important lines in the function:

  if (THPVariable_Check(data)) {
      auto var = reinterpret_cast<THPVariable*>(data)->cdata;
      auto type_inference_device_type = device_opt.has_value() ? device_opt->type()
                                                               : torch::getDeviceType(var.type());
      // infer the scalar type and device type; it's not expected to infer the layout since these constructors
      // are defined per-layout-type (e.g. tensor vs sparse_coo_tensor).
      const auto& type_inference_type = torch::getVariableType(var.type().scalarType(),
                                                       *torch::getLayout(type.backend()),
                                                       type_inference_device_type);
      const auto& type_to_use = type_inference ? type_inference_type : type;
      return copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
                              new_with_type_conversion(type_to_use, var, device_index);
  }

In the bodies of new_with_tensor_copy and new_with_type_conversion you can see that this calls copy or toType on the Variable, which means you get the normal gradient behavior, as if you had called copy() or toType() yourself; this is why you incorrectly end up with requires_grad set at the end.

This suggests two solutions:

  1. Keep calling copy/toType, but then detach the tensor afterwards, so that you set requires_grad = false and delete the grad_fn. Effectively, you are saying that tensor(t) is equivalent to t.clone().detach() (see the sketch after this list). This is the "do the wrong thing first, and then fixup after yourself" strategy.
  2. Create a new operator on VariableType, new_with_tensor, which does the right thing from the get-go and doesn't ever create a grad_fn or set requires_grad on the output. This is the "do the right thing from the beginning" strategy, but it is more complicated to implement.
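
At the Python level, the equivalence targeted in option (1) can be sketched as follows (intended semantics only, not the C++ implementation):

>>> t = torch.randn(2, 2, requires_grad=True)
>>> copy = t.clone().detach()    # what tensor(t) should be equivalent to
>>> copy.requires_grad, copy.grad_fn, copy.is_leaf
(False, None, True)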

I think both approaches are reasonable. If you want to do (2), you'll need to write a custom VariableType method implementation, modeled on the hand-written (hardcoded) implementation of detach, like this:

Tensor VariableType::detach(const Tensor & self) const {
  profiler::RecordFunction profiler("detach");
  torch::jit::Node* node = nullptr;
  if (jit::tracer::isTracing()) {
    auto& graph = jit::tracer::getTracingState()->graph;
    node = graph->create(jit::aten::detach, /*outputs=*/0);
    jit::tracer::recordSourceLocation(node);
    jit::tracer::addInputs(node, "self", self);
    graph->appendNode(node);

  }
  // <NON_GENERATED_CODE>
  auto result = as_variable_ref(const_cast<Tensor&>(self)).detach();
  // </NON_GENERATED_CODE>
  if (jit::tracer::isTracing()) {
    jit::tracer::addOutput(node, result);
  }
  return result;
}

You'll have something similar, except that it calls copy/toType as necessary. But I think detach after the fact is also a very reasonable strategy, and much simpler.

@weiyangfb force-pushed the create_tensor_requires_grad_false branch from 68e72fd to 04b32b3 on September 16, 2018 06:55
@weiyangfb changed the title from "[Easy]mute grad when copy construct from a var with requires_grad=False" to "[Easy]detach new_tensor when copy construct from a tensor (requires_grad=True) while setting requires_grad=False" Sep 16, 2018
@weiyangfb (Contributor, Author) commented

@ezyang I am picking the (2) approach here: clone() and detach() when copy constructing a new tensor.

@facebook-github-bot (Contributor) left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@ezyang (Contributor) left a comment

I feel there are some semantic problems.

Basically, in my opinion (with concurrences from @gchanan and @apaszke), we think that torch.tensor(x) should be equivalent to x.clone().detach() and torch.tensor(x, requires_grad=True) should be equivalent to x.clone().detach().requires_grad_(True). The basic idea is that torch.tensor reads out "the data" from whatever it is passed, and constructs a leaf variable. This gives us the invariant that "for any argument x, torch.tensor(x) does not require grad, and torch.tensor(x, requires_grad=True) gives a leaf variable that requires grad."
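
Spelled out as a quick check (a sketch, assuming the proposed semantics are in place):

>>> x = torch.randn(3, requires_grad=True)
>>> torch.tensor(x).requires_grad
False
>>> y = torch.tensor(x, requires_grad=True)
>>> y.requires_grad, y.is_leaf
(True, True)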

@weiyangfb (Contributor, Author) commented

@ezyang Thanks a lot for the clarifications! Sorry I missed the default case. I will make changes so that:

source = torch.randn(3, 3, requires_grad=True)
copy = torch.tensor(source) # requires_grad=False

@weiyangfb force-pushed the create_tensor_requires_grad_false branch from 04b32b3 to c040b6b on September 17, 2018 19:22
@facebook-github-bot (Contributor) left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


@ezyang (Contributor) left a comment

I think it's still not right.

Also, can we update the docs for tensor for this case?

@ezyang (Contributor) commented Sep 17, 2018

In general, we should discourage people from passing a tensor to torch.tensor, in favor of explicitly saying x.clone().detach() or x.clone().detach().requires_grad_(True). @apaszke suggested a warning might be appropriate in this case.

@weiyangfb (Contributor, Author) commented

@ezyang Thanks a lot for the comments! I will update the PR so that a tensor copy-constructed from another tensor is always a leaf variable, with requires_grad set according to the input args. I will also raise a Python warning on this path and update the docs.

@weiyangfb force-pushed the create_tensor_requires_grad_false branch from c040b6b to 02f1b22 on September 17, 2018 22:29
@weiyangfb changed the title from "[Easy]detach new_tensor when copy construct from a tensor (requires_grad=True) while setting requires_grad=False" to "[Easy] make copy constructed tensor a leaf variable in using torch.tensor(sourceTensor)" Sep 17, 2018
@weiyangfb changed the title from "[Easy] make copy constructed tensor a leaf variable in using torch.tensor(sourceTensor)" to "[Easy] make copy constructed tensor a leaf variable when using torch.tensor(sourceTensor)" Sep 17, 2018
@weiyangfb force-pushed the create_tensor_requires_grad_false branch from 02f1b22 to 6889608 on September 17, 2018 22:46
@facebook-github-bot (Contributor) left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.


…or, and set requires_grad wrt to input args,

2. raise warnings for this path and update docs
@weiyangfb force-pushed the create_tensor_requires_grad_false branch from 6889608 to 9c413bb on September 18, 2018 04:47
@facebook-github-bot (Contributor) left a comment

weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@apaszke (Contributor) left a comment

I'm not sure if what we did here is correct.

-  return copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
+  auto new_tensor = copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
                            new_with_type_conversion(type_to_use, var, device_index);
+  new_tensor.detach_(); // making copy constructed tensor a leaf node


When data is a tensor `x`, :func:`torch.tensor` reads out 'the data' from whatever it is passed,
and constructs a leaf variable. Therefore ``torch.tensor(x)`` is equivalent to ``x.clone().detach()``
and ``torch.tensor(x, requires_grad=True)`` is equivalent to ``x.clone().detach().requires_grad_(True)``.
The equivalents using ``clone()`` and ``detach()`` are recommended.
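
For instance, assuming the warning added in this PR is in place, the recommended spellings look like:

>>> x = torch.ones(2, requires_grad=True)
>>> y = x.clone().detach()                        # recommended
>>> z = x.clone().detach().requires_grad_(True)   # recommended when grad is needed
>>> w = torch.tensor(x)                           # same result as y, but emits a UserWarning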



  if (THPVariable_Check(data)) {
    PyErr_WarnEx(PyExc_UserWarning,
      "To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() "


type_inference)
type_inference,
args_requires_grad)
.set_requires_grad(r.toBool(3));


-  return copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
+  auto new_tensor = copy_variables ? new_with_tensor_copy(type_to_use, var, device_index) :
                            new_with_type_conversion(type_to_use, var, device_index);
+  new_tensor.detach_(); // making copy constructed tensor a leaf node


cpuhrsch pushed a commit to cpuhrsch/pytorch that referenced this pull request Sep 21, 2018
Summary:
- fix PR pytorch#11061 by moving `detach_()` and `set_requires_grad()` to `torch.tensor_ctor()` and `tensor.new_tensor`, and also removed warnings and `args_requires_grad` from `internal_new_from_data`
- with this patch, the returned tensor from `tensor_ctor()` and `new_tensor` will be detached from source tensor, and set requires_grad based on the input args
- `torch.as_tensor` retains its behavior as documented

gchanan apaszke
Pull Request resolved: pytorch#11815

Differential Revision: D9932713

Pulled By: weiyangfb

fbshipit-source-id: 4290cbc57bd449954faadc597c24169a7b2d8259
iotamudelta pushed a commit to ROCm/pytorch that referenced this pull request Sep 21, 2018
facebook-github-bot pushed a commit that referenced this pull request Sep 26, 2018
Summary:
The earlier tests had around 80 warnings, and now there are 6 warnings; these are due to the JIT.

The changes remove the wrapping of a Tensor by a Tensor constructor, which emits warnings due to the changes in #11061.
Pull Request resolved: #12038

Differential Revision: D10033392

Pulled By: apaszke

fbshipit-source-id: b1faf368e650d062d7983f9932511bee4702a893
@ezyang ezyang added the merged label Jun 26, 2019

Labels

ready for review (this tag is deprecated)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[pytorch] requires_grad=False from torch.tensor ignored when the input tensor has requires_grad=True

6 participants