
Set Batchnorm weight scalar initialization to unit (not random uniform) #12259

@cstein06

Description


🐛 Feature suggestion

PyTorch batch normalization currently initializes its scale (weight) parameter with random uniform values (init.uniform_(self.weight)). This initialization has no theoretical basis, whereas a fixed value of 1 does make sense.

Part of the usefulness of BatchNorm is that it initializes the network easily, with inputs to hidden units normalized to zero mean and unit variance (original paper: https://arxiv.org/abs/1502.03167), and this is also how recent studies use it (e.g. https://arxiv.org/abs/1706.02677).
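
As a quick illustration of why initializing the scale to 1 matters: with weight = 1 and bias = 0, a BatchNorm layer's output at initialization is just the normalized activation (roughly zero mean and unit variance per channel), while a random-uniform weight rescales each channel by an arbitrary factor. A minimal sketch (layer size and input statistics are arbitrary, chosen only for the example):

import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(256, 8) * 3.0 + 5.0        # input with arbitrary mean/variance

bn = nn.BatchNorm1d(8)
nn.init.ones_(bn.weight)                   # the proposed initialization
nn.init.zeros_(bn.bias)

y = bn(x)                                  # training mode: uses batch statistics
print(y.mean(dim=0))                       # approximately 0 per channel
print(y.std(dim=0, unbiased=False))        # approximately 1 per channel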

Proposed change

From
init.uniform_(self.weight)

to
init.ones_(self.weight)
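
Until the default changes, a possible workaround (a hypothetical helper, not an existing PyTorch API) is to re-initialize the affine parameters after building the model:

import torch.nn as nn

def init_batchnorm_to_ones(module):
    # Hypothetical helper: set BatchNorm scale to 1 and shift to 0,
    # mirroring the proposed default initialization.
    if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)) and module.affine:
        nn.init.ones_(module.weight)
        nn.init.zeros_(module.bias)

model = nn.Sequential(nn.Linear(16, 32), nn.BatchNorm1d(32), nn.ReLU())
model.apply(init_batchnorm_to_ones)        # applies the helper to every submodule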
