Conversation

@ifedan (Contributor) commented Jun 7, 2019

This is a fix for #21469.
Currently there is no way to tell whether a backward function has released its saved variables when those variables are stored in a vector (i.e. a saved TensorList). This change sets a flag once a function that has saved variables releases them, so that calling the function again with already-released variables raises an error instead of proceeding.
Functions that do not have saved variables can still be called multiple times, for backward compatibility.
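
For illustration, here is a minimal sketch of the scenario being guarded against (not code from this PR; it assumes current PyTorch behavior). The advanced-indexing assignment is the same setup used in the new test and saves a TensorList for backward, so a second backward through the graph should hit the usual retain_graph error rather than touch released variables.

import torch

# Minimal repro sketch (assumption: mirrors the regression scenario).
b = torch.randn(3, requires_grad=True, dtype=torch.double)
c = torch.zeros(3, dtype=torch.double)
c[[1, 2]] = b[[1, 1]]     # the indexing op saves a TensorList of indices for backward

c.sum().backward()        # first backward releases the saved variables
try:
    c.sum().backward()    # second backward must now fail loudly
except RuntimeError as err:
    print(err)            # "Trying to backward through the graph a second time, ..."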

@pytorchbot added the module: autograd (Related to torch.autograd and the autograd engine in general) and module: internals (Related to internal abstractions in c10 and ATen) labels on Jun 7, 2019
@ifedan requested a review from gchanan on June 11, 2019 17:46
@ifedan requested a review from zou3519 on June 13, 2019 16:09
@zou3519 (Contributor) left a comment

lgtm

def test_backward_twice_with_saved_values(self):
    b = torch.randn(3, requires_grad=True, dtype=torch.double)
    c = torch.zeros(3, dtype=torch.double)
    c[[1, 2]] = b[[1, 1]]
Contributor:

It's not completely obvious that this indexing expression causes a TensorList to be saved for backward, and that this is what we're fixing in this PR (that backward twice with a saved TensorList works as expected). I don't see any other functions in derivatives.yaml that save a TensorList, though, so I don't have ideas on how to make this test better.
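
For completeness, the quoted hunk only shows the setup; the assertion that presumably follows (a sketch, not necessarily the merged test body) would check that a second backward raises the standard error:

# Hypothetical continuation of the test above; the exact assertion in the
# merged test may differ.
c.sum().backward()
self.assertRaisesRegex(RuntimeError, 'Specify retain_graph=True',
                       lambda: c.sum().backward())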

@zou3519 (Contributor) commented Jun 13, 2019

Oh, one more thing: you should delete

const char* ERR_BACKWARD_TWICE =
    "Trying to backward through the graph a second time, but the buffers have "
    "already been freed. Specify retain_graph=True when calling backward "
    "the first time.";
since it's now unused.

Edit: Never mind, you already deduplicated it.

${will_release_variables}
${saved_variables}
${saved_list_sizes}
bool released = false;
Contributor:

Naming this released doesn't seem like the best idea, because it will cause a problem with any function that tries to capture an argument named released.

Contributor:

Also, can we conditionally generate this variable, like we do with every other variable in autograd Functions?

Contributor:

I also think there are two more subtle issues with this:

  1. Why is this global on the function? This would prevent us from conditionally unpacking only the variables we need when computing just a subset of the backwards, for no good reason. We could track this per TensorList, like we do with the size.
  2. Speaking of the size, doesn't that solve your disambiguation problem? You know the size at the start and can check it later to figure out whether something was reset. I guess there is a judgment call on whether you should error if the captured TensorList was empty, but it seems we can just do the same thing we do with Tensors (what happens if we capture an undefined tensor?).

Contributor Author:

  1. You can still unpack only some of the variables; the flag is set only after release_variables() is called.
  2. We do not hold a TensorList; instead we hold a vector of SavedVariables.

Contributor Author:

The name has been changed to variables_were_released_.

Contributor:

Are your responses "1." and "2." responses to my questions "1." and "2."? If so, I don't understand how they address the questions. For example, with 2., I understand that you don't literally hold the TensorList; the question is about distinguishing whether the thing you saved was empty to start with, or was non-empty but has since been released.

elif arg['type'] == 'TensorList':
    saved_variables.append('std::vector<SavedVariable> {}_;'.format(name))
    release_variables.append('for (auto& sv : {}_)'.format(name))
    release_variables.append('{')
Contributor:

I don't really understand the strategy here. Why are we both:
(1) resetting the data / resetting the grad function and
(2) tracking released on TensorLists and clearing the vector?

Why do both need to happen?

Contributor Author:

Here is the explanation from @zou3519: #21533 (comment)

Contributor:

Sorry, I was talking about something else in that comment (brainstorming a different solution that would not use a released flag).

@ifedan Just clear() (2) is sufficient; I don't think you need to reset the data and grad function (1). clear() removes the SavedVariables in the list, and because a SavedVariable owns a tensor and a grad_fn, removing it makes them go away as well.
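
In codegen terms, that simplification would look roughly like the sketch below (an illustration of the shape being discussed here, not the exact merged lines; 'indices' is just an example argument name):

# Rough sketch mirroring the gen_autograd_functions.py-style string emission.
name = 'indices'
saved_variables, release_variables = [], []

saved_variables.append('std::vector<SavedVariable> {}_;'.format(name))
saved_variables.append('bool {}_released_ = false;'.format(name))
# Releasing is just clear(): dropping each SavedVariable also drops the tensor
# and grad_fn it owns, so no per-element reset is needed.
release_variables.append('{}_.clear();'.format(name))
release_variables.append('{}_released_ = true;'.format(name))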

Contributor Author:

Fixed

@zou3519 (Contributor) left a comment

Requesting changes based on @gchanan's review

@ifedan requested review from gchanan and zou3519 on June 13, 2019 18:40
@facebook-github-bot (Contributor) left a comment

@ifedan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@gchanan (Contributor) left a comment

A comment about why this was implemented this way, as opposed to, say, looping through and releasing each one, would be nice.

@facebook-github-bot (Contributor) left a comment

@ifedan has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor): @ifedan merged this pull request in 0998a32.

@apaszke (Contributor) left a comment

Why are we doing this per-arg instead of having a single assert at the beginning of the function?? That way it would have been much faster and simpler.

Never mind, this only applies to tensor list arguments so it shouldn't matter.

@wanchaol (Collaborator) commented:

> Why are we doing this per-arg instead of having a single assert at the beginning of the function?? That way it would have been much faster and simpler.
>
> Never mind, this only applies to tensor list arguments so it shouldn't matter.

I have a similar question here: given that we have very few autograd Nodes/Functions that do not hold tensors, why don't we just put the flag in the autograd Node/Function itself (SavedVariable also has a flag called was_default_constructed that is used for this)? That way it could be simpler, and we wouldn't need to worry about the difference between TensorList and SavedVariable when releasing.

@gchanan (Contributor) commented Feb 10, 2020

@wanchaol: I think that would work, but you'd need to handle the case where you don't hold tensors (it's important you don't give an error in those cases). I'd lean toward code that always works over adding special cases (unless there is a good reason, like perf).
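
As a quick illustration of the case being called out (a sketch assuming standard autograd behavior): a node that saves nothing for backward must remain callable any number of times.

import torch

# An op that saves no variables for backward (tensor + scalar) can be
# backward-ed through repeatedly without retain_graph.
a = torch.randn(3, requires_grad=True)
out = a + 1
out.sum().backward()
out.sum().backward()   # still fine: there was nothing to release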
