Since PyTorch 1.7, `Optimizer.zero_grad()` accepts a `set_to_none` parameter. When it is set to `True`, gradients are reset to `None` instead of being filled with zeros, which can improve performance by skipping the per-parameter zero-fill. Some references: https://pytorch.org/docs/stable/generated/torch.optim.Optimizer.zero_grad.html https://pytorch.org/tutorials/recipes/recipes/tuning_guide.html
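
For illustration, here is a minimal sketch of the difference between the two modes. The tiny `Linear` model, the SGD optimizer, and the random input are placeholders chosen for the example, not anything from the code under discussion.

```python
import torch

# Hypothetical toy model and optimizer, only to demonstrate zero_grad().
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Populate the gradients with one backward pass.
model(torch.randn(8, 4)).sum().backward()

# set_to_none=True drops the gradient tensors entirely.
optimizer.zero_grad(set_to_none=True)
grads_after_none = [p.grad for p in model.parameters()]

# Populate the gradients again, then reset them the other way.
model(torch.randn(8, 4)).sum().backward()

# set_to_none=False keeps the tensors and fills them with zeros in place.
optimizer.zero_grad(set_to_none=False)
grads_after_zero = [p.grad for p in model.parameters()]
```

One thing to watch for: with `set_to_none=True`, any code that reads `p.grad` between `zero_grad()` and the next `backward()` (manual gradient clipping or logging, for example) has to handle `None` rather than a zero tensor.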