Description
🚀 The feature, motivation and pitch
This would be useful for debugging complex tree-structured models: a .name on each module would show where that module is found within the whole model tree.
This idea is currently used only for parameters within state_dict() serialization. I suggest that a .name attribute/property would be useful for debugging and for various formatting/debug-printing in general. Such a .name could be used in __str__ implementations of modules and tensors, and for creating ONNX graphs that are easier to visualize. Access to the module name might also be useful for finer-grained logic hacks in module hooks (maybe not the best coding practice in all cases, but okay for hacks).
I propose to:
- introduce a .name property (backed by a ._name attribute, which may not exist, e.g. when we torch.load an old model file), or just a .name attribute if deserializing old module objects that lack some attributes is not a problem. It should be an empty string by default.
- introduce an instance method .set_names_recursively() (modulo bikeshedding) on torch.nn.Module, which would walk the module tree and produce names similar to the keys now found in state_dict()
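A minimal sketch of what the second bullet could look like as a free function, built on the existing nn.Module.named_modules() iterator. The function form and the _debug_name attribute are hypothetical stand-ins chosen for illustration, not an existing PyTorch API.

```python
import torch.nn as nn


def set_names_recursively(root: nn.Module, attr: str = "_debug_name") -> None:
    """Store each submodule's dotted path on the module itself.

    `attr` is a hypothetical attribute name used only for this sketch.
    """
    # named_modules() yields (path, module) pairs using the same dotted
    # prefixes that appear in state_dict() keys; the root's path is "".
    for path, module in root.named_modules():
        setattr(module, attr, path)


model = nn.Sequential(
    nn.Linear(4, 8),
    nn.Sequential(nn.ReLU(), nn.Linear(8, 2)),
)
set_names_recursively(model)

# Collect the assigned names in traversal order.
names = [m._debug_name for m in model.modules()]
```

Since named_modules() visits a shared module only once by default, a module reused in several places would keep the first path under which it was found.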
I also propose to support setting/getting such an attribute for any tensor. It's understood that most tensors won't have it assigned and that the user would need to set it manually for it to be useful, but it's still good to have (e.g. it can be set on tensors that are about to be returned from a function). Propagation of such tensor names is a more complex task, and I propose that it's out of scope, as the feature is already useful for debugging if only manually set tensor names are supported. Alternatively, tensor.name() / tensor.name(value) could be used instead of the attribute, so that the setter returns self for fluency and convenience and one can write return (x + 1).name("mytensorplus1"). The downside is that the name(...) method would return either a string or a Tensor (or something else), depending on its argument.
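The fluent variant can be approximated today with a small helper, sketched below. Both with_name and the debug_name attribute are hypothetical names for illustration; the sketch relies only on the fact that a Python-side attribute can be attached to a tensor object.

```python
import torch


def with_name(tensor: torch.Tensor, name: str) -> torch.Tensor:
    """Attach a manual debug name and return the same tensor for chaining."""
    # `debug_name` is a hypothetical attribute, stored on the Python object only.
    tensor.debug_name = name
    return tensor


def plus_one(x: torch.Tensor) -> torch.Tensor:
    # Name the result just before returning it, as in the example above.
    return with_name(x + 1, "mytensorplus1")


y = plus_one(torch.zeros(3))
label = getattr(y, "debug_name", "")

# No propagation: a tensor derived from y is a new object without the name.
derived_label = getattr(y * 2, "debug_name", "")
```

Using a separate helper keeps the getter and setter apart, which sidesteps the mixed return type of a single name(...) method.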
Another consideration is that the ONNX exporter already produces some automatic module/op names, and some code might have taken dependencies on them. So the ONNX naming behavior should stay unchanged unless .set_names_recursively() was called explicitly (so empty names would be overridden by the existing automatic names).
Also, if .name is set, state_dict() should probably use it. Another consideration is that some people are probably monkey-patching .name attributes onto tensors/modules themselves, so it might be surprising if state_dict() starts using them. Some bikeshedding on the attribute name may therefore be required.
E.g. https://github.com/facebookresearch/fvcore/blob/main/fvcore/nn/jit_analysis.py is doing something similar with respect to module names.