EMA optimizer: class-form and function-form (using new foreach_lerp) - can be used for explicit robust updates of BatchNorm stats #71683

@vadimkantorov

🚀 The feature, motivation and pitch

I'd like to simplify the exponential moving average (EMA) update for a teacher-student model:

for param_teacher, param_student in zip(self.teacher_encoder_proj.parameters(), self.student_encoder_proj.parameters()):
    param_teacher.data.lerp_(param_student.data, 1 - param_momentum)
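For comparison, here is a minimal sketch of the same update written with foreach ops that already exist, so the whole parameter list is handled in two multi-tensor calls instead of one `lerp_` per parameter. The helper name `ema_update_foreach` and the toy `Linear` modules are assumptions made for illustration, and the `torch._foreach_*` functions are private and subject to change.

```python
import torch

def ema_update_foreach(teacher_params, student_params, momentum):
    """Hypothetical helper: teacher <- momentum * teacher + (1 - momentum) * student."""
    # Detached views share storage with the parameters, so in-place ops update them.
    teacher = [p.detach() for p in teacher_params]
    student = [p.detach() for p in student_params]
    torch._foreach_mul_(teacher, momentum)                      # teacher *= momentum
    torch._foreach_add_(teacher, student, alpha=1 - momentum)   # teacher += (1 - momentum) * student

torch.manual_seed(0)
teacher_net = torch.nn.Linear(4, 3)   # toy stand-ins for the teacher/student encoders
student_net = torch.nn.Linear(4, 3)
param_momentum = 0.99

# Reference result from the per-parameter lerp_ loop, computed on copies.
expected = [
    t.detach().clone().lerp_(s.detach(), 1 - param_momentum)
    for t, s in zip(teacher_net.parameters(), student_net.parameters())
]

ema_update_foreach(list(teacher_net.parameters()), list(student_net.parameters()), param_momentum)

same = all(
    torch.allclose(t, e, atol=1e-6)
    for t, e in zip(teacher_net.parameters(), expected)
)
print(same)
```

This still takes two calls per update, which is the overhead a dedicated `_foreach_lerp_` would remove.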

I was looking for a foreach version of lerp in https://github.com/pytorch/pytorch/blob/acb340de75367985164a67234be7ee7ddf4b546e/torch/optim/_multi_tensor/_functional.py and found only:

torch._foreach_mul_(exp_avg_sq, beta2)
torch._foreach_addcmul_(exp_avg_sq, grads, grads, 1 - beta2)

I'm not sure whether this is indeed a lerp, but if it is, it would be beneficial to have a _foreach_lerp_ that offers a simpler API and reduces the number of kernel calls.
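A quick numerical check suggests the two calls above do amount to a lerp, with the element-wise square of the gradients as the end point and `1 - beta2` as the weight: `beta2 * v + (1 - beta2) * g * g` equals `v + (1 - beta2) * (g * g - v)`. The sketch below uses random tensors as assumed stand-ins for the optimizer state.

```python
import torch

torch.manual_seed(0)
beta2 = 0.999

# Assumed stand-ins for the optimizer state and gradients.
exp_avg_sq = [torch.rand(3), torch.rand(2, 2)]
grads = [torch.randn(3), torch.randn(2, 2)]

# Per-tensor lerp toward grad * grad with weight (1 - beta2).
lerp_result = [torch.lerp(v, g * g, 1 - beta2) for v, g in zip(exp_avg_sq, grads)]

# The two-call foreach form from the optimizer code, applied to copies.
two_step = [v.clone() for v in exp_avg_sq]
torch._foreach_mul_(two_step, beta2)
torch._foreach_addcmul_(two_step, grads, grads, 1 - beta2)

matches = all(torch.allclose(a, b, atol=1e-6) for a, b in zip(two_step, lerp_result))
print(matches)
```

The end point here is `grads * grads`, not `grads`, so a foreach lerp would still need a separate multiply to form the squared gradients in this particular case.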

It would also be nice to have the list of available foreach methods documented somewhere, even with a notice that they are experimental and subject to change: https://pytorch.org/docs/stable/search.html?q=foreach&check_keywords=yes&area=default#

Alternatives

No response

Additional context

No response

cc @vincentqb @jbschlosser @albanD @janeyx99 @crcrpar @mcarilli @ngimel

Labels

function request
    A request for a new function or the addition of new arguments/modes to an existing function.
module: mta
    Issues related to multi-tensor apply kernels and foreach functions.
module: optimizer
    Related to torch.optim.
triaged
    This issue has been looked at by a team member, and triaged and prioritized into an appropriate module.
