
Conversation

@dzhulgakov (Collaborator)

Motivation

We allow overriding JIT module serialization with `__getstate__`/`__setstate__` to cover cases where parameters are not serializable. Use cases include MKLDNN integration (https://github.com/pytorch/pytorch/blob/a388c783505987363717bd4da0b166e8d1d7ccb9/torch/utils/mkldnn.py#L18-L26):

```python
@torch.jit.script_method
def __getstate__(self):
    return (self.weight.to_dense(), self.bias.to_dense())

@torch.jit.script_method
def __setstate__(self, state):
    # type: (Tuple[Tensor, Tensor]) -> None
    self.weight = state[0].to_mkldnn()
    self.bias = state[1].to_mkldnn()
```

and also the fbgemm prepacked format integration for quantized tensors.

However, many eager-mode scripts use the `torch.save(module.state_dict())` form of serialization. There are several ways to make it work:

  • make packed_weight itself pickleable (e.g. by binding `__getstate__`/`__setstate__` at the C++ UDT level)
    • change: we’d need to allow module buffers to be of arbitrary, non-Tensor types
    • pro: no change to state_dict behavior
    • cons: might not be directly inspectable by a user calling .state_dict(), especially if packed weights represent several tensors fused together
  • make packed_weight a proper Tensor layout
    • pro: no change to state_dict or buffers behavior
    • cons: adding new tensor layouts is pretty costly today
    • cons: doesn’t work if multiple tensors are packed in one interleaved representation
  • [this approach] allow Modules to override state_dict and return regular tensors (sketched below)
    • pro: most flexible and hackable
    • pro: maintains the semantic meaning of state_dict as all the data necessary to represent the module’s state
    • cons: complicates state_dict logic
    • cons: potential code duplication between `__getstate__`/`__setstate__`

Based on discussions with @zdevito and @gchanan, we decided to pick the latter approach. Rationale: this behavior is fully opt-in and will impact only the modules that need it. For those modules the requirement listed above won't hold, but we do preserve the requirement that all elements of state_dict are tensors. (See https://fburl.com/qgybrug4 for the internal discussion.)

In the future we might also implement one of the other approaches above, but those are more involved.
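To make the chosen approach concrete, here is a minimal sketch (not code from this PR; the `MkldnnLinear` class and its packing logic are illustrative) of a module that keeps MKLDNN-packed buffers in memory but overrides `_save_to_state_dict` so that `state_dict()` only ever yields plain dense tensors:

```python
import torch
import torch.nn as nn

class MkldnnLinear(nn.Module):
    # Illustrative module: buffers live in MKLDNN (packed) layout, but
    # state_dict() exposes ordinary dense tensors. Requires a PyTorch
    # build with MKL-DNN support.
    def __init__(self, dense_linear):
        super(MkldnnLinear, self).__init__()
        self.register_buffer('weight', dense_linear.weight.detach().to_mkldnn())
        self.register_buffer('bias', dense_linear.bias.detach().to_mkldnn())

    def _save_to_state_dict(self, destination, prefix, keep_vars):
        # Unpack to dense on the way out, preserving the invariant that
        # every state_dict entry is a regular Tensor.
        destination[prefix + 'weight'] = self.weight.to_dense()
        destination[prefix + 'bias'] = self.bias.to_dense()
```

With an override like this, `torch.save(m.state_dict())` works unchanged, and a containing module's `state_dict()` picks the override up through recursion (see the discussion below).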

@pytorchbot pytorchbot added the module: nn Related to torch.nn label Jun 19, 2019
@soumith (Contributor) left a comment:


From what I can tell, this doesn't do what you want it to do.

If you nest m inside a Sequential, for example, it won't call m's custom state_dict function.

@ezyang (Contributor) commented Jun 19, 2019:

> From what I can tell, this doesn't do what you want it to do. If you nest m inside a Sequential, for example, it won't call m's custom state_dict function.

Err, I don't think I agree? We make recursive calls to the nested modules' state_dict, which will then handle the dispatch.
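For reference, a condensed sketch of that recursion (simplified from `nn.Module.state_dict`; the real method also threads `keep_vars`, version metadata, and state-dict hooks):

```python
from collections import OrderedDict

def state_dict(self, destination=None, prefix=''):
    if destination is None:
        destination = OrderedDict()
    # Virtual dispatch: a subclass's overridden _save_to_state_dict runs here,
    # even when this module sits inside a Sequential or other container.
    self._save_to_state_dict(destination, prefix, keep_vars=False)
    for name, child in self._modules.items():
        if child is not None:
            # Recursing via the child's own state_dict preserves any
            # customized behavior on nested modules.
            child.state_dict(destination, prefix + name + '.')
    return destination
```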

Contributor:

It's probably better to give a specific example of when using this method is appropriate.

test/test_nn.py Outdated
Contributor:

Though I suppose you can appease soumith by testing the nested case explicitly here ;)

test/test_nn.py Outdated
Contributor:

Do we really want to be overriding this method directly?

Collaborator (Author):

It seems to be symmetric to `_save_to_state_dict` (mostly, except hooks). Better suggestions?

@ezyang (Contributor) left a comment:

Seems reasonable? I do wonder about `_load_from_state_dict`, though.
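For the loading side, a hedged sketch of what a `_load_from_state_dict` override might look like as a method on the illustrative `MkldnnLinear` above (the signature matches the real hook; the re-packing logic is an assumption, not code from this PR):

```python
def _load_from_state_dict(self, state_dict, prefix, local_metadata, strict,
                          missing_keys, unexpected_keys, error_msgs):
    # Mirror of _save_to_state_dict: re-pack the dense tensors saved above
    # instead of letting the default copy path run against MKLDNN buffers.
    for name in ('weight', 'bias'):
        key = prefix + name
        if key in state_dict:
            # Assigning a Tensor to a registered buffer name updates _buffers.
            setattr(self, name, state_dict[key].to_mkldnn())
        elif strict:
            missing_keys.append(key)
```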

@pytorchbot pytorchbot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Jun 20, 2019
@facebook-github-bot (Contributor) left a comment:

@dzhulgakov is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot (Contributor):

@dzhulgakov merged this pull request in 82dd693.

iotamudelta pushed a commit to ROCm/pytorch that referenced this pull request Jun 21, 2019
…1933)

Pull Request resolved: pytorch#21933

Differential Revision: D15937678

Pulled By: dzhulgakov

fbshipit-source-id: 3cb5d1a8304d04def7aabc0969d0a2e7be182367
