Remove tensor_data() call sites in torch/csrc/autograd and torch/csrc/cuda #22172
Conversation
  // they still share the same storage. This works only because we never call
  // in-place functions on unpacked variables.
- Variable var;
+ Variable var = as_variable_ref(data_).variable_data();
didn't you already set this to variable_data above? So you can just skip this?
It seems unclear whether we have an invariant that the SavedVariable will only be unpacked once. (I tried removing detach/variable_data here and all tests pass.) If we do have that invariant, we should always use detach/variable_data here, so that the unpacked Variable is always a shallow copy of the SavedVariable.
- auto data = as_variable_ref(r.tensor(1)).tensor_data();
+ auto data = as_variable_ref(r.tensor(1)).variable_data();
+ data.unsafeGetTensorImpl()->set_allow_tensor_metadata_change(true);
  auto var = make_variable(data, r.toBool(2));
Two points:
- Is there any reason this can't re-use the version counter, i.e. `detach` semantics rather than `data`?
- If we are going with `detach` semantics, passing a boolean to allow the metadata change seems reasonable -- have we not done a release yet where the "allow_metadata_change" is in?
  namespace torch { namespace autograd {

- variable_list wrap_outputs(const variable_list& inputs, tensor_list&& outputs,
+ variable_list wrap_outputs(const variable_list& inputs, variable_list&& outputs,
I don't understand what's going on here, can you explain?
This is only called for legacy_apply functions? These should always be used with variable_data for some reason?
This is called for legacy_apply functions and also for DelayedError::apply. The main reason we need to change this from tensor_list to variable_list is that the outputs are now Variables instead of Tensors (because we now use .detach() instead of .tensor_data() to build up the outputs list in legacy_apply and DelayedError::apply).
I get that we had tensors and now we have variables; I'm trying to understand what the constraints of this code actually are. So this is used with legacy apply, which we want to get rid of, right? (#16947) How does legacy apply actually work? Is the forward wrapped in a no-grad block?
For legacy autograd functions, the forward is wrapped in a no-grad block (pytorch/torch/csrc/autograd/python_function.cpp, lines 646 to 652 at commit 0c09138):
{
  AutoGradMode grad_mode(false);
  THPObjectPtr forward_fn(PyObject_GetAttrString((PyObject*)self, "forward"));
  if (!forward_fn) return nullptr;
  raw_output = PyObject_CallObject(forward_fn, unpacked_input.input_tuple);
  if (!raw_output) return nullptr;
}
legacy_apply is only used in the legacy autograd function's backward pass (pytorch/torch/csrc/autograd/python_function.cpp, lines 107 to 118 at commit 0c09138):
// NOTE: this function is written in a way that assumes it's only called for backward;
// it's used by engine.cpp. This is responsible for forwarding a call from
// C++'s Function::apply to a Python method "apply".
auto PyFunction::apply(variable_list&& inputs) -> variable_list {
  AutoGIL gil;
  at::OptionalDeviceGuard _device_guard;
  THPFunction* py_fn = (THPFunction*)obj;
  THPObjectPtr _legacy(PyObject_GetAttrString(obj, "_is_legacy"));
  if (_legacy == Py_True) {
    return legacy_apply(inputs);
  }
In legacy_apply, we have to make a shallow copy of the output from the backward function (via detach or variable_data) before calling wrap_outputs, because for a backward function that directly passes grad_x through:
class PassthroughFunction(Function):
    ...
    def backward(self, grad_x):
        return grad_x

If grad_x requires grad, we will attach a new gradient edge to grad_x (via autograd::create_gradient_edge(output, grad_fn)), which will replace any previous gradient edge that grad_x has (which is incorrect behavior). Hence we should make a shallow copy of the output from the backward function in legacy_apply before calling wrap_outputs.
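
To make that concrete, here is a small illustration of my own (not part of this PR): with create_graph=True, the gradient flowing into a backward function itself requires grad and carries a grad_fn, which is exactly the grad_fn that would be clobbered without the shallow copy.

```python
import torch

x = torch.ones(2, requires_grad=True)
y = (x * x).sum()

# With create_graph=True the incoming gradient of the backward pass is
# itself part of the autograd graph: it requires grad and has a grad_fn.
g, = torch.autograd.grad(y, x, create_graph=True)
print(g.requires_grad, g.grad_fn)  # True  <MulBackward0 object ...>

# Attaching a fresh gradient edge directly to g (instead of to a shallow
# copy of it) would replace this grad_fn and break double backward.
# With the shallow copy, second-order backward works as expected:
g.sum().backward()
print(x.grad)  # tensor([2., 2.])
```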
  // they still share the same storage. This works only because we never call
  // in-place functions on unpacked variables.
- Variable var;
+ Variable var = as_variable_ref(data_).detach();
It seems unclear whether we have an invariant that the SavedVariable will only be unpacked once. (I tried removing detach/variable_data here and all tests pass.) If we do have that invariant, we should always use detach/variable_data here.
how would you figure out if we have that invariant or not?
In the following use case, a SavedVariable can be unpacked more than once:
>>> a = torch.zeros(2, 2).fill_(3).requires_grad_()
>>> b = torch.zeros(2, 2).fill_(5).requires_grad_()
>>> c = a*b
# SavedVariable being saved: 3 3
# 3 3
# [ CPUFloatType{2,2} ]
# SavedVariable being saved: 5 5
# 5 5
# [ CPUFloatType{2,2} ]
>>> d = c.sum()
>>> d.backward(retain_graph=True)
# SavedVariable being unpacked: 3 3
# 3 3
# [ CPUFloatType{2,2} ]
# SavedVariable being unpacked: 5 5
# 5 5
# [ CPUFloatType{2,2} ]
>>> d.backward()
# SavedVariable being unpacked: 3 3
# 3 3
# [ CPUFloatType{2,2} ]
# SavedVariable being unpacked: 5 5
# 5 5
# [ CPUFloatType{2,2} ]

In the above example, a and b are unpacked more than once, hence we should always use detach/variable_data to make a shallow-copy of the SavedVariable here, so that each unpacking does not affect the original SavedVariable.
  AT_ASSERT(t.is_variable());
- Variable var = t;
- device_outputs.push_back(make_variable(var.tensor_data(), false));
+ Variable var = as_variable_ref(t).variable_data();
We specifically want to use variable_data instead of detach here, because the comment above this function in NOTE [ Version Counter in comm.*_coalesced ] mentions:
// We thus re-wrap these Variables after broadcasting (i.e., effetively doing
// what is equivalent to .data in Python), and give them individual version
// counters. ...

variable_data is the strict equivalent of .data in Python, which creates a new version counter for the returned tensor, and is exactly what we want.
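
As a rough sketch of my own (not from this PR) of why the distinction matters: a tensor obtained via .detach() shares the source's version counter, so an in-place change is caught at backward time, while a tensor obtained via .data (the semantics that variable_data mirrors) gets its own version counter and the change goes unnoticed.

```python
import torch

x = torch.ones(3, requires_grad=True)
y = x * x               # mul saves x for the backward pass

# detach() shares x's version counter: the in-place add bumps it, so
# backward notices that a saved tensor was modified and raises.
x.detach().add_(1)
try:
    y.sum().backward()
except RuntimeError as err:
    print("caught:", err)

x = torch.ones(3, requires_grad=True)
y = x * x

# .data gets a fresh version counter: the same in-place add is invisible
# to autograd, and backward silently runs on the modified values because
# the storage is still shared.
x.data.add_(1)
y.sum().backward()
print(x.grad)
```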
  AT_ASSERT(t.is_variable());
- Variable var = t;
- device_outputs.push_back(make_variable(var.tensor_data(), false));
+ Variable var = as_variable_ref(t).variable_data();
Please see comment above.
gchanan left a comment:
The python_legacy_variable and python_variable changes look fine (and you can break them out separately if you want), but they should have clear BC commit messages.
I still don't really understand what is going on with the legacy apply or comm primitive stuff.
I moved the python_legacy_variable and python_variable changes to another PR: #22821.
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as
As part of the Variable/Tensor merge, `variable.tensor_data()` should be removed in favor of `variable.variable_data()` (which has the same semantics as Python `tensor.data`). This PR removes `tensor_data()` call sites in `torch/csrc/autograd` and `torch/csrc/cuda`.
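
For reference, a small sketch of my own of the Python tensor.data semantics that variable_data() is meant to match: the result shares storage with the source, is detached from the graph, and in-place changes made through it are not tracked by autograd.

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
d = x.data

print(d.requires_grad)               # False: detached from the graph
print(d.data_ptr() == x.data_ptr())  # True: same underlying storage

d.add_(1)     # visible through x, but not recorded by autograd
print(x)      # all 2s, still requires_grad=True
```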