
Optimizer should track parameter names and not id #1489


Description

@nitishgupta

In the optimizer's param_groups['params'], the order of the parameters (the order in which they were passed to the optimizer's init) matters.
This snippet from load_state_dict shows why:

from itertools import chain

# Each saved id is paired with the parameter occupying the same position
# in the current param_groups -- a purely positional match.
id_map = {old_id: p for old_id, p in
          zip(chain(*(g['params'] for g in saved_groups)),
              chain(*(g['params'] for g in groups)))}

state = {id_map.get(k, k): v for k, v in state_dict['state'].items()}

If we pass the parameters to the optimizer in a different order when loading, the code breaks, because the state dict now maps parameters to the wrong states.
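To make the positional matching concrete, here is a toy illustration (the placeholder strings standing in for saved ids and live tensors are mine, not PyTorch's):

from itertools import chain

# Order at save time vs. a different order at load time.
saved_groups = [{'params': ['id_of_p1', 'id_of_p2']}]
groups       = [{'params': ['p2', 'p1']}]

id_map = {old_id: p for old_id, p in
          zip(chain(*(g['params'] for g in saved_groups)),
              chain(*(g['params'] for g in groups)))}

print(id_map)  # {'id_of_p1': 'p2', 'id_of_p2': 'p1'} -- p1's saved state now points at p2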

Consider this model (optimized with, say, Adam):

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.p1 = nn.Linear(2, 3, False)
        self.p2 = nn.Linear(3, 4, False)

After saving, if the order in which the parameters are defined in the model changes, i.e. if I change the class to have

self.p2 = nn.Linear(3,4, False)
self.p1 = nn.Linear(2,3, False)

the loaded optimizer's state for p1 will be mapped to p2 and vice versa. I tried this and it does indeed happen, which is wrong, and training cannot proceed (step() will, rightly so, raise an error).
The nn.Module class is robust to this because it uses parameter names instead of id ordering.
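Here is a minimal repro sketch of the failure (the class names, dummy forward pass, and random input are illustrative):

import torch
import torch.nn as nn

class ModelA(nn.Module):
    def __init__(self):
        super(ModelA, self).__init__()
        self.p1 = nn.Linear(2, 3, False)
        self.p2 = nn.Linear(3, 4, False)

class ModelB(nn.Module):
    # same layers, registered in the opposite order
    def __init__(self):
        super(ModelB, self).__init__()
        self.p2 = nn.Linear(3, 4, False)
        self.p1 = nn.Linear(2, 3, False)

model_a = ModelA()
opt_a = torch.optim.Adam(model_a.parameters())
model_a.p2(model_a.p1(torch.randn(5, 2))).sum().backward()
opt_a.step()  # Adam now holds exp_avg buffers shaped like each parameter

model_b = ModelB()
opt_b = torch.optim.Adam(model_b.parameters())
opt_b.load_state_dict(opt_a.state_dict())  # matched by position, not by name

# The exp_avg saved for p1 (shape 3x2) is now attached to model_b.p2 (shape 4x3),
# so the next opt_b.step() after a backward pass fails with a shape mismatch.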

IMO the optimizer should also use parameter names instead of ids, rather than relying on the order in which parameters are supplied to the optimizer at initialization.
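Until the optimizer tracks names itself, a possible workaround (a sketch, not part of the torch.optim API) is to save the parameter names alongside the optimizer state and rekey the per-parameter state by name before loading. This assumes a single param group, that the optimizer was built directly from model.parameters() so its ordering matches model.named_parameters(), and that the saved state is keyed by each parameter's position in the saved param_groups; remap_optimizer_state and saved_param_names are hypothetical names:

def remap_optimizer_state(saved_opt_state, saved_param_names, new_model, new_optimizer):
    # Rekey the saved per-parameter state by name instead of by position.
    name_to_state = {name: saved_opt_state['state'][idx]
                     for idx, name in enumerate(saved_param_names)
                     if idx in saved_opt_state['state']}
    # Rebuild the state so its positions follow the new optimizer's ordering.
    new_names = [name for name, _ in new_model.named_parameters()]
    remapped = dict(saved_opt_state)
    remapped['state'] = {idx: name_to_state[name]
                         for idx, name in enumerate(new_names)
                         if name in name_to_state}
    new_optimizer.load_state_dict(remapped)

# At save time, also store: param_names = [name for name, _ in model.named_parameters()]
# At load time: remap_optimizer_state(saved_state, param_names, new_model, new_optimizer)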

Corresponding PyTorch-Discuss post

@soumith

cc @vincentqb

Labels

module: optimizer (Related to torch.optim), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
