🐛 Feature suggestion
PyTorch batch normalization currently initializes its scale (`weight`) parameter with random uniform values (pytorch/torch/nn/modules/batchnorm.py, line 46 in 1d3f650):

```python
init.uniform_(self.weight)
```
Part of the usefulness of BatchNorm is that it makes networks easy to initialize, with inputs to hidden units normalized to zero mean and unit variance (original paper: https://arxiv.org/abs/1502.03167), and this is also how recent studies use it (e.g. https://arxiv.org/abs/1706.02677).
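A quick sketch of that claim (assuming an affine `BatchNorm1d` with `weight` set to ones and `bias` to zeros): in training mode, each channel of the output is approximately zero mean and unit variance, so the layer starts out as a pure normalizer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
bn = nn.BatchNorm1d(4)     # affine by default
nn.init.ones_(bn.weight)   # gamma -> 1 (the proposed default)
nn.init.zeros_(bn.bias)    # beta -> 0 (already the default)

# Inputs with a nonzero mean and a large variance per channel.
x = torch.randn(256, 4) * 3.0 + 5.0
y = bn(x)                  # training mode: normalize with batch statistics

print(y.mean(dim=0))       # approximately 0 per channel
print(y.std(dim=0))        # approximately 1 per channel
```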
To change from

```python
init.uniform_(self.weight)
```

to

```python
init.ones_(self.weight)
```
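Until the default changes, a minimal workaround sketch (assuming a model containing affine batch-norm layers; the model below is just a hypothetical example) is to reset the parameters after construction:

```python
import torch.nn as nn

# Hypothetical model; any module tree with batch-norm layers works the same way.
model = nn.Sequential(
    nn.Linear(32, 64),
    nn.BatchNorm1d(64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

# Reset every batch-norm layer so it starts as a pure normalizer.
for m in model.modules():
    if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
        nn.init.ones_(m.weight)   # scale (gamma) -> 1
        nn.init.zeros_(m.bias)    # shift (beta) -> 0
```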