[autograd][docs] Add more details on why save_for_backward is important in extending autograd note #153005
…nt in extending autograd note [ghstack-poisoned]
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/153005
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 1 Unrelated Failure
As of commit 73394bc with merge base e2c7ae5:
NEW FAILURE - The following job has failed:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
… is important in extending autograd note" [ghstack-poisoned]
docs/source/notes/extending.rst
Outdated
Saving tensors via ``save_for_backward`` allows the autograd engine to clear
them as soon as the backward computation of the ``autograd.Function`` completes.
(If a tensor is stored directly on ``ctx``
it will unnecessarily remain alive for the lifetime of the autograd graph --
typically until the end of the iteration.) Using ``save_for_backward``
also helps avoid certain reference cycles, (e.g., since the tensor
output of the ``autograd.Function`` itself keeps a reference to the ctx).
Saving via ``save_for_backward`` is also important for compatibility with
features like activation checkpointing and offloading that rely on
:class:`torch.autograd.graph.saved_tensors_hooks`.
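To make the lifetime claim in this paragraph concrete, here is a minimal sketch (not part of the PR). The class names `SquareSaved` and `SquareOnCtx` and the helper `second_backward_fails` are hypothetical, chosen only for illustration. It relies on one observable side effect: once backward has run without `retain_graph=True`, tensors saved via `save_for_backward` are released, so a second backward through the same node raises a `RuntimeError`, whereas a tensor stashed directly on `ctx` is still reachable.

```python
import torch


class SquareSaved(torch.autograd.Function):
    """Computes x * x and saves the input via save_for_backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return 2 * x * grad_out


class SquareOnCtx(torch.autograd.Function):
    """Same math, but stores the tensor directly on ctx (discouraged)."""

    @staticmethod
    def forward(ctx, x):
        ctx.x = x  # stays alive for as long as the graph node does
        return x * x

    @staticmethod
    def backward(ctx, grad_out):
        return 2 * ctx.x * grad_out


def second_backward_fails(fn):
    """Hypothetical helper: True if a second backward() raises RuntimeError."""
    x = torch.tensor(3.0, requires_grad=True)
    y = fn.apply(x)
    y.backward()  # no retain_graph, so saved tensors may be released here
    try:
        y.backward()
    except RuntimeError:
        return True
    return False


print("save_for_backward released:", second_backward_fails(SquareSaved))
print("ctx attribute released:", second_backward_fails(SquareOnCtx))
```

The second backward succeeding for `SquareOnCtx` is not a feature: it only shows that the tensor was never released, which is the extra memory the paragraph warns about.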
A possible readability improvement. I'm not sure I formatted it correctly; it's been a while since I touched rst.
Saving tensors via ``save_for_backward``:

1. Allows the autograd engine to clear them as soon as the backward
   computation of the ``autograd.Function`` completes. (If a tensor is
   stored directly on ``ctx``, it will unnecessarily remain alive for the
   lifetime of the autograd graph, typically until the end of the iteration.)
2. Helps avoid certain reference cycles (e.g., since the tensor
   output of the ``autograd.Function`` itself keeps a reference to the ctx).
3. Is important for compatibility with features like activation
   checkpointing and offloading that rely on
   :class:`torch.autograd.graph.saved_tensors_hooks`.
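The compatibility point about `saved_tensors_hooks` can be checked with a small experiment. This is a sketch under stated assumptions, not code from the PR. `SinSaved`, `SinOnCtx`, and `count_packed` are hypothetical names. The pack hook simply records each tensor it is handed, which shows that only tensors routed through `save_for_backward` are visible to the hooks that checkpointing and offloading build on.

```python
import torch
from torch.autograd.graph import saved_tensors_hooks


class SinSaved(torch.autograd.Function):
    """sin(x), saving the input via save_for_backward."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x.sin()

    @staticmethod
    def backward(ctx, grad_out):
        (x,) = ctx.saved_tensors
        return grad_out * x.cos()


class SinOnCtx(torch.autograd.Function):
    """sin(x), storing the input directly on ctx (invisible to hooks)."""

    @staticmethod
    def forward(ctx, x):
        ctx.x = x
        return x.sin()

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out * ctx.x.cos()


def count_packed(fn):
    """Hypothetical helper: number of tensors the pack hook saw, plus a grad check."""
    packed = []

    def pack(t):
        packed.append(t.shape)  # a real hook might offload t to CPU here
        return t

    def unpack(t):
        return t

    x = torch.linspace(0.0, 1.0, 4, requires_grad=True)
    with saved_tensors_hooks(pack, unpack):
        y = fn.apply(x)
    y.sum().backward()
    grad_ok = torch.allclose(x.grad, x.detach().cos())
    return len(packed), grad_ok


print("save_for_backward:", count_packed(SinSaved))
print("ctx attribute:", count_packed(SinOnCtx))
```

Both variants compute the same gradient, but only the `save_for_backward` version gives the hooks a chance to intercept the saved tensor.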
Thanks, updated. Will double-check the rendered results.
… is important in extending autograd note" cc stas00 [ghstack-poisoned]
albanD left a comment
Thanks!
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
Merge failed. Reason: 1 jobs have failed, first few of them are: trunk / win-vs2022-cpu-py3 / test (default, 1, 3, ephemeral.windows.4xlarge.nonephemeral)
@pytorchbot merge -i
Merge started: Your change will be merged while ignoring the following 2 checks: trunk / macos-py3-arm64 / test (default, 3, 3, macos-m1-stable), trunk / win-vs2022-cpu-py3 / test (default, 1, 3, ephemeral.windows.4xlarge.nonephemeral)
Starting merge as part of PR stack under #153094
Fixes #152773
Pull Request resolved: #153094
Approved by: https://github.com/albanD
ghstack dependencies: #153005
Stack from ghstack (oldest at bottom):
cc @stas00