🐛 Bug
The use of Optimizer.add_param_group is not properly supported everywhere: in AdaGrad the internal state is not updated when additional parameter groups are added; schedulers also seem to suffer from this.
To Reproduce
Steps to reproduce the behavior:
- Use the AdaGrad optimizer.
- Use add_param_group to add additional parameter groups.
- Run the optimizer. It fails with a KeyError('sum') because the state was initialized in the constructor, before the parameter group was added; the error surfaces either at line 51 of pytorch/torch/optim/adagrad.py (commit 57bffc3), state['sum'].share_memory_(), or at line 77, state_sums.append(state['sum']). See the minimal script below.
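A minimal script along the lines of the steps above (module shapes and hyper-parameters are arbitrary, chosen only to trigger the error):

```python
import torch

# Two small modules; sizes are arbitrary, only meant to trigger the error.
a = torch.nn.Linear(4, 4)
b = torch.nn.Linear(4, 4)

opt = torch.optim.Adagrad(a.parameters(), lr=0.01)
# Adagrad initialized its 'sum'/'step' state in the constructor, so the
# parameters of `b` added here never get a 'sum' entry.
opt.add_param_group({'params': b.parameters()})

loss = (a(torch.randn(2, 4)) + b(torch.randn(2, 4))).sum()
loss.backward()
opt.step()  # KeyError: 'sum' for the dynamically added group (PyTorch 1.7.1)
```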
Alternatively:
Use the ReduceLROnPlateau scheduler, which maintains a min_lrs list that is only as long as the number of parameter groups at initialization. This results in an IndexError: list index out of range when the scheduler attempts to reduce the learning rate of a dynamically added parameter group (see the script below).
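A similar minimal script for the scheduler case (the model and patience=0 are arbitrary choices, picked so that a reduction is triggered quickly):

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

a = torch.nn.Linear(4, 4)
b = torch.nn.Linear(4, 4)

opt = torch.optim.SGD(a.parameters(), lr=0.1)
sched = ReduceLROnPlateau(opt, patience=0)  # min_lrs is sized for one group here
opt.add_param_group({'params': b.parameters(), 'lr': 0.1})

# Feed a non-improving metric until the scheduler tries to reduce every
# group's learning rate; the second group has no min_lrs entry.
for _ in range(3):
    sched.step(1.0)  # IndexError: list index out of range (PyTorch 1.7.1)
```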
Expected behavior
Presumably the AdaGrad optimizer should override add_param_group and update self.state.
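For illustration only, a rough sketch of what such an override could look like, written here as a user-side subclass (PatchedAdagrad is a hypothetical name; the 'step'/'sum' state layout mirrors the 1.7.1 constructor):

```python
import torch
from torch.optim import Adagrad

class PatchedAdagrad(Adagrad):
    """Hypothetical subclass: initialize state for groups added later as well."""

    def add_param_group(self, param_group):
        super().add_param_group(param_group)
        group = self.param_groups[-1]  # the group that was just added
        for p in group['params']:
            state = self.state[p]
            if len(state) == 0:
                # Mirror what the 1.7.1 Adagrad constructor does for the
                # initial parameter groups.
                state['step'] = 0
                state['sum'] = torch.full_like(
                    p, group['initial_accumulator_value'],
                    memory_format=torch.preserve_format)
```

With a subclass like this, the reproduction script above runs without the KeyError; the same initialization could of course live in Adagrad.add_param_group itself.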
ReduceLROnPlateau should probably use a dict with a default, or some kind of observer pattern should be implemented so that schedulers are made aware of additional parameter groups.
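Until something like that lands, one possible user-side workaround (not a fix) is to keep the scheduler's internal min_lrs list in sync by hand; this relies on a private attribute of the 1.7.1 implementation and may break on other versions:

```python
import torch
from torch.optim.lr_scheduler import ReduceLROnPlateau

a = torch.nn.Linear(4, 4)
b = torch.nn.Linear(4, 4)

opt = torch.optim.SGD(a.parameters(), lr=0.1)
sched = ReduceLROnPlateau(opt, patience=0)

opt.add_param_group({'params': b.parameters(), 'lr': 0.1})
# Keep the scheduler's per-group list the same length as param_groups,
# using the same default min_lr (0) the constructor would have used.
sched.min_lrs.append(0.0)
```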
Environment
PyTorch version: 1.7.1
Is debug build: False
CUDA used to build PyTorch: 10.2
ROCM used to build PyTorch: N/A
OS: Debian GNU/Linux 10 (buster) (x86_64)
GCC version: (GCC) 9.3.0
Clang version: Could not collect
CMake version: Could not collect
Python version: 3.8 (64-bit runtime)
Is CUDA available: True
CUDA runtime version: Could not collect
GPU models and configuration:
GPU 0: GeForce GTX 1080 Ti
GPU 1: GeForce GTX 1080 Ti
GPU 2: GeForce GTX 1080 Ti
GPU 3: GeForce GTX 1080 Ti
Nvidia driver version: 460.32.03
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Versions of relevant libraries:
[pip3] botorch==0.3.3
[pip3] gpytorch==1.3.0
[pip3] numpy==1.19.4
[pip3] torch==1.7.1
[pip3] torchviz==0.0.1
[conda] Could not collect
cc @vincentqb