New Weight Scheduler Concept for Weight Decay #22343

@vincentqb

Description

🚀 Feature

Weight decay is used very often. A common strategy is to apply it implicitly through the learning rate scheduler, or to simply shrink the weights at the end of each iteration by a constant multiplicative factor. However, one could want decay strategies different from these. For that, we could have weight schedulers that modify the weights using a syntax and grammar similar to the learning rate schedulers already available.
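As a concrete illustration, here is a minimal sketch of what such a scheduler could look like, assuming a `step()`-based interface modeled on `torch.optim.lr_scheduler`; the `MultiplicativeWeightScheduler` class and its `factor_fn` argument are hypothetical names, not an existing API.

```python
import torch
from torch import nn

# Hypothetical sketch (not an existing PyTorch API): a weight scheduler that
# mirrors the step()-based interface of the lr schedulers, but rescales the
# parameters themselves instead of the learning rate.
class MultiplicativeWeightScheduler:
    """Shrink every parameter by factor_fn(epoch) each time step() is called."""

    def __init__(self, optimizer, factor_fn):
        self.optimizer = optimizer
        self.factor_fn = factor_fn  # maps epoch -> multiplicative factor
        self.epoch = 0

    @torch.no_grad()
    def step(self):
        factor = self.factor_fn(self.epoch)
        for group in self.optimizer.param_groups:
            for p in group["params"]:
                p.mul_(factor)  # in-place weight shrinkage
        self.epoch += 1


model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
scheduler = MultiplicativeWeightScheduler(optimizer, lambda epoch: 0.99)

for epoch in range(3):
    optimizer.zero_grad()
    loss = model(torch.randn(4, 10)).pow(2).mean()
    loss.backward()
    optimizer.step()
    scheduler.step()  # decay the weights, analogous to lr_scheduler.step()
```

A constant `factor_fn` recovers plain weight decay, while a step or cosine schedule, for example, would decay the weights at a rate independent of the learning rate schedule.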

Motivation

The AdamW code from #21250 does not include a weight scheduler, as mentioned here.
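For context, AdamW's decoupled decay is applied roughly as p ← p · (1 − lr · weight_decay), so the decay is still tied to the learning rate and can today only be scheduled implicitly through an LR scheduler. A small sketch of that coupling, using the existing `lr_scheduler` API:

```python
import torch
from torch import nn

model = nn.Linear(10, 1)
# torch.optim.AdamW decays weights as p <- p * (1 - lr * weight_decay),
# so any learning rate schedule rescales the decay as a side effect.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1e-2)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(20):
    optimizer.zero_grad()
    model(torch.randn(4, 10)).pow(2).mean().backward()
    optimizer.step()   # after epoch 10 the lr is halved, and so is the decay
    scheduler.step()
```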

There are also currently a few issues and pull requests about weight schedulers; see below.

Alternatives

Labels

feature (A request for a proper, new feature), module: optimizer (Related to torch.optim), triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
