Specify default initialization schemes for modules in docs #9038
Conversation
This is great, thanks!
facebook-github-bot left a comment:
@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@fmassa Once this PR is merged, I will modify the initialization schemes in the modules to use nn.init.
@pytorchbot retest this please
Would be great to have similar things for BatchNorm and InstanceNorm too :)
  the mini-batches and :math:`\gamma` and :math:`\beta` are learnable parameter vectors
- of size `C` (where `C` is the input size).
+ of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are sampled
+ from :math:`\mathcal{U}(0, 1)` and the elements of :math:`\beta` are set to 0.
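For readers skimming the thread, here is a minimal sketch of what the documented default amounts to in terms of nn.init calls, assuming the docstring above accurately describes the behaviour at the time of this PR; the explicit calls are purely illustrative, since the constructor already performs this initialization.

```python
import torch.nn as nn
from torch.nn import init

# Hypothetical illustration: replicate the documented affine defaults by hand.
bn = nn.BatchNorm2d(num_features=16)
init.uniform_(bn.weight)  # gamma ~ U(0, 1); uniform_ defaults to the (0, 1) range
init.zeros_(bn.bias)      # beta = 0
```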
  momentum: the value used for the running_mean and running_var computation. Default: 0.1
  affine: a boolean value that when set to ``True``, this module has
- learnable affine parameters. Default: ``False``
+ learnable affine parameters, initialized the same way as done for batch normalization.
@fmassa I have added the initialization from nn.init.
facebook-github-bot left a comment:
@ssnl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
torch/nn/modules/conv.py (outdated)
  n *= k
  stdv = 1. / math.sqrt(n)
- self.weight.data.uniform_(-stdv, stdv)
+ init.uniform_(self.weight, -stdv, stdv)
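For context, a rough sketch of how the legacy bound in the hunk above is computed for a convolution. The surrounding reset_parameters lines are paraphrased under the assumption that n is the fan-in (input channels times the product of the kernel dimensions), not copied from the PR.

```python
import math
import torch.nn as nn
from torch.nn import init

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)

# Legacy fan-in: input channels times the product of the kernel dimensions.
n = conv.in_channels
for k in conv.kernel_size:
    n *= k
stdv = 1. / math.sqrt(n)

# The hunk above only swaps the in-place tensor call for its nn.init equivalent;
# both draw the parameters from U(-stdv, stdv).
init.uniform_(conv.weight, -stdv, stdv)
if conv.bias is not None:
    init.uniform_(conv.bias, -stdv, stdv)
```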
torch/nn/modules/linear.py (outdated)
  def reset_parameters(self):
      stdv = 1. / math.sqrt(self.weight.size(1))
-     self.weight.data.uniform_(-stdv, stdv)
+     init.uniform_(self.weight, -stdv, stdv)
torch/nn/modules/linear.py (outdated)
  def reset_parameters(self):
      stdv = 1. / math.sqrt(self.weight.size(1))
-     self.weight.data.uniform_(-stdv, stdv)
+     init.uniform_(self.weight, -stdv, stdv)
Is this good to go?
Hold on a bit; I'll get the initialization schemes to you today so that we can simplify things (I forgot to do it yesterday).
fmassa left a comment:
I've added the equivalent initialization methods (which rely on kaiming_uniform_ using fan_in).
Please double check and then make the changes so that we can finally remove the dependency on the hand-tuned (and potentially buggy) initializations.
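A quick sanity check of that equivalence, for anyone reviewing: kaiming_uniform_ draws from U(-b, b) with b = sqrt(3) * gain / sqrt(fan_in) and gain = sqrt(2 / (1 + a^2)), so choosing a = sqrt(5) gives b = 1 / sqrt(fan_in), the same bound as the legacy stdv. A minimal sketch, not part of the PR diff:

```python
import math
import torch.nn as nn
from torch.nn import init

linear = nn.Linear(in_features=128, out_features=64)

fan_in, _ = init._calculate_fan_in_fan_out(linear.weight)
legacy_bound = 1. / math.sqrt(fan_in)

# With a = sqrt(5): bound = sqrt(3) * sqrt(2 / (1 + 5)) / sqrt(fan_in) = 1 / sqrt(fan_in)
init.kaiming_uniform_(linear.weight, a=math.sqrt(5))
assert linear.weight.abs().max().item() <= legacy_bound + 1e-6
```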
torch/nn/modules/conv.py (outdated)
  n *= k
  stdv = 1. / math.sqrt(n)
- self.weight.data.uniform_(-stdv, stdv)
+ init.uniform_(self.weight, -stdv, stdv)
torch/nn/modules/linear.py (outdated)
  def reset_parameters(self):
      stdv = 1. / math.sqrt(self.weight.size(1))
-     self.weight.data.uniform_(-stdv, stdv)
+     init.uniform_(self.weight, -stdv, stdv)
torch/nn/modules/linear.py (outdated)
  def reset_parameters(self):
      stdv = 1. / math.sqrt(self.weight.size(1))
-     self.weight.data.uniform_(-stdv, stdv)
+     init.uniform_(self.weight, -stdv, stdv)
-     self.bias.data.uniform_(-stdv, stdv)
+     fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+     bound = 1 / math.sqrt(fan_in)
+     init.uniform_(self.bias, -bound, bound)
-     self.bias.data.uniform_(-stdv, stdv)
+     fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+     bound = 1 / math.sqrt(fan_in)
+     init.uniform_(self.bias, -bound, bound)
-     self.weight.data.uniform_(-stdv, stdv)
+     fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+     bound = 1 / math.sqrt(fan_in)
+     init.uniform_(self.weight, -bound, bound)
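Taken together, the hunks above converge on a reset_parameters pattern along these lines. This is a sketch of that pattern under the assumptions discussed in the thread (kaiming_uniform_ with a = sqrt(5) for the weight, a fan-in based uniform bound for the bias), not a verbatim copy of the merged code.

```python
import math
import torch.nn as nn
from torch.nn import init

def reset_parameters(module: nn.Linear) -> None:
    # Weight: kaiming_uniform_ with a = sqrt(5) reproduces U(-1/sqrt(fan_in), 1/sqrt(fan_in)).
    init.kaiming_uniform_(module.weight, a=math.sqrt(5))
    if module.bias is not None:
        # Bias: same fan-in based bound, derived from the weight's shape.
        fan_in, _ = init._calculate_fan_in_fan_out(module.weight)
        bound = 1 / math.sqrt(fan_in)
        init.uniform_(module.bias, -bound, bound)

reset_parameters(nn.Linear(128, 64))
```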
fmassa left a comment:
There are a few things that still look a bit weird, but fixing them might require backward-incompatible changes, so I'm OK with how it looks now. Thanks!
@pytorchbot test this please
I have one minor concern: would this make the initialization slower in the case of …
Computing …
Is this good to go?
Sadly, it seems to be failing tests now.
facebook-github-bot left a comment:
@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Oh, that was my bad. I have fixed them now.
Is this good to go?
facebook-github-bot left a comment:
@weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@vishwakftw should be good to go; it looks like @weiyangfb is working on merging it.
@weiyangfb gentle reminder. Sorry.
Is this good to go?
Summary: This closes #6906.
Reviewed By: ezyang
Differential Revision: D8698632
Pulled By: weiyangfb
fbshipit-source-id: 259c1dbdc264a8e9f83e196fa72d135babd97d48
Why Kaiming over Xavier?
This closes #6906.
cc: @ssnl @zou3519