🚀 Feature
Weight decay is used very often. A common strategy is to rely implicitly on the learning rate scheduler to do so, or to simply shrink the weights at the end of each iteration by a constant multiplicative factor. However, one could want to use strategies different from this. In that case, we could have weight decay schedulers that modify the weights using a syntax and grammar similar to the learning rate schedulers already available.
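As a minimal sketch of the proposed usage (the class name MultiplicativeWDScheduler is an assumption made here for illustration, not an existing torch.optim API), such a scheduler could be stepped alongside the optimizer, exactly like the existing lr schedulers:

```python
import torch
from torch import nn

# Hypothetical weight decay scheduler mirroring the lr_scheduler API.
# It rescales the weight_decay of each param group on every step() call.
class MultiplicativeWDScheduler:
    def __init__(self, optimizer, factor):
        self.optimizer = optimizer
        self.factor = factor

    def step(self):
        for group in self.optimizer.param_groups:
            group["weight_decay"] *= self.factor

model = nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, weight_decay=1e-2)
wd_scheduler = MultiplicativeWDScheduler(optimizer, factor=0.95)

for epoch in range(3):
    # ... per-batch forward/backward and optimizer.step() would go here ...
    wd_scheduler.step()  # shrink the weight decay coefficient once per epoch
```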
Motivation
The AdamW code from #21250 does not include a weight decay scheduler, as mentioned here.
There are also currently a few issues and pull requests about weight decay schedulers; see below.
Alternatives
- Decoupled Weight Decay Regularization in optimizers (added adamw and sgdw among others) #4429 suggests modifying the optimizer logic to accept a new parameter weight_decay specifying the constant multiplicative factor to use.
- Decoupled Weight Decay Regularization #3740, Implement AdamW optimizer #21250, and adding AdamW optimizer #22163 introduce variations on Adam and other optimizers with a corresponding built-in weight decay. Add SGDR, SGDW, AdamW and AdamWR #3790 requests that some of these be supported.
- We could instead have a new "weight_decay_type" option on those optimizers to switch between common strategies, as sketched below.
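To make the last alternative concrete, here is a rough sketch of the two strategies such an option could switch between (the option name weight_decay_type and the helper apply_weight_decay are assumptions from this proposal, not existing PyTorch APIs):

```python
import torch

@torch.no_grad()
def apply_weight_decay(param, lr, wd, weight_decay_type="decoupled"):
    # "l2": fold wd * param into the gradient before the update (coupled, classic L2 penalty).
    # "decoupled": shrink the weights by a constant factor, independent of the gradient (AdamW-style).
    if weight_decay_type == "l2":
        param.grad.add_(param, alpha=wd)
    elif weight_decay_type == "decoupled":
        param.mul_(1 - lr * wd)

p = torch.randn(3, requires_grad=True)
p.grad = torch.zeros_like(p)
apply_weight_decay(p, lr=0.1, wd=1e-2, weight_decay_type="decoupled")
```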