Skip to content

Conversation

@colesbury
Copy link
Member

load_state_dict includes a recursive inner function load that captures
Tensors through the close-over variable state_dict. Because it's
recursive, it also captures itself leading to a reference cycle.

This breaks the reference cycle so that any Tensors in state_dict can be
collected immediately instead of waiting until the next GC cycle.

Alternatively, we could have passed state_dict and metadata as
arguments to load to prevent capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage of Tensors).

See:
#20199 (comment)

load_state_dict includes a recursive inner function `load` that captures
Tensors through the close-over variable `state_dict`. Because it's
recursive, it also captures itself leading to a reference cycle.

This breaks the reference cycle so that any Tensors in state_dict can be
collected immediately instead of waiting until the next GC cycle.

Alternatively, we could have passed `state_dict` and `metadata` as
arguments to load to prevent capture of Tensors. (That would still
result in cyclic garbage, but not any cyclic garbage of Tensors).

See:
pytorch#20199 (comment)
@pytorchbot pytorchbot added the module: nn Related to torch.nn label May 11, 2019
@Stonesjtu
Copy link
Contributor

Can you plz explain would there be cyclic garbage even when passing all the variables as arguments?

@colesbury
Copy link
Member Author

Can you plz explain would there be cyclic garbage even when passing all the variables as arguments?

@Stonesjtu because the function load still captures a reference to itself. You'd have to also pass load as an argument to load which is doable but looks weird.

You can see this here: foo.py vs foo2.py.

Anyways, cyclic-garbage isn't a problem in general unless it involves a Tensor.

@Stonesjtu
Copy link
Contributor

@colesbury , thanks! Should we add documentation to advise users against using Tensors as closure in recursive function?

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colesbury has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@colesbury merged this pull request in c1fa449.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Merged module: nn Related to torch.nn

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants