
Conversation

@vishwakftw (Contributor)

This closes #6906.

cc: @ssnl @zou3519

@zou3519 (Contributor) commented Jun 29, 2018

@fmassa (Member) commented Jun 29, 2018

This is great, thanks!
Maybe for a future commit, it would be nice to use the nn.init methods for initialization, including in the documentation. I think it might be simpler for users.
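
For illustration only (not the exact change in this PR), the difference in style is the following; `w` is a hypothetical tensor and the bounds are arbitrary:

    import torch
    from torch.nn import init

    # Hypothetical weight tensor and arbitrary bounds, purely for illustration.
    w = torch.empty(3, 5)

    w.uniform_(-0.1, 0.1)          # hand-rolled, in-place tensor method
    init.uniform_(w, -0.1, 0.1)    # the same draw expressed through nn.init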

@facebook-github-bot (Contributor) left a comment

@soumith is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@vishwakftw (Contributor, Author)

@fmassa Once this PR is merged, I will modify the initialization schemes in the modules to use nn.init. Hope that's fine.

@soumith (Contributor) commented Jul 1, 2018

@pytorchbot retest this please

@ssnl (Collaborator) commented Jul 1, 2018

Would be great to have similar things for BatchNorm and InstanceNorm too :)

 the mini-batches and :math:`\gamma` and :math:`\beta` are learnable parameter vectors
-of size `C` (where `C` is the input size).
+of size `C` (where `C` is the input size). By default, the elements of :math:`\gamma` are sampled
+from :math:`\mathcal{U}(0, 1)` and the elements of :math:`\beta` are set to 0.
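
For concreteness, that documented default corresponds roughly to the following (a sketch, assuming an affine norm layer with a weight/bias pair; `num_features` is an arbitrary example value):

    import torch
    from torch import nn
    from torch.nn import init

    num_features = 64                                   # arbitrary example value
    weight = nn.Parameter(torch.empty(num_features))    # gamma
    bias = nn.Parameter(torch.empty(num_features))      # beta

    init.uniform_(weight)      # U(0, 1): nn.init.uniform_ defaults to a=0, b=1
    init.constant_(bias, 0)    # beta starts at 0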


 momentum: the value used for the running_mean and running_var computation. Default: 0.1
 affine: a boolean value that when set to ``True``, this module has
-    learnable affine parameters. Default: ``False``
+    learnable affine parameters, initialized the same way as done for batch normalization.


@vishwakftw (Contributor, Author)

@fmassa I have added the initialization from nn.init.

@facebook-github-bot (Contributor) left a comment

@ssnl has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

     n *= k
     stdv = 1. / math.sqrt(n)
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)
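
For context, the hunk above is truncated; the hand-tuned scheme it modifies in the conv modules looked roughly like this (a reconstructed sketch of the method body, assuming `math` is imported and the usual `_ConvNd` attributes, not a verbatim excerpt):

    def reset_parameters(self):
        # Old hand-tuned scheme: bound derived from fan-in = in_channels * prod(kernel_size).
        n = self.in_channels
        for k in self.kernel_size:
            n *= k
        stdv = 1. / math.sqrt(n)
        self.weight.data.uniform_(-stdv, stdv)
        if self.bias is not None:
            self.bias.data.uniform_(-stdv, stdv)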


 def reset_parameters(self):
     stdv = 1. / math.sqrt(self.weight.size(1))
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)


 def reset_parameters(self):
     stdv = 1. / math.sqrt(self.weight.size(1))
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)


@vishwakftw (Contributor, Author)

Is this good to go?

@fmassa (Member) commented Jul 4, 2018

Hold on a bit, I'll get the initialization schemes to you today so that we can simplify things (forgot to do it yesterday).

@fmassa (Member) left a comment

I've added the equivalent initialization methods (which rely on kaiming_uniform_ using fan_in).
Please double check and then make the changes so that we can finally remove the dependency on the hand-tuned (and potentially buggy) initializations.
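
Before the individual hunks below: the equivalence being relied on is that kaiming_uniform_ in fan_in mode with a = sqrt(5) reproduces the old U(-1/sqrt(fan_in), 1/sqrt(fan_in)) bound. A quick sketch of that claim (the tensor size is arbitrary):

    import math
    import torch
    from torch.nn import init

    # kaiming_uniform_ (fan_in mode, leaky_relu) draws from U(-b, b) with
    #   b = gain * sqrt(3 / fan_in),  gain = sqrt(2 / (1 + a**2)).
    # For a = sqrt(5): gain = sqrt(1/3), so b = 1 / sqrt(fan_in) -- the old bound.
    w = torch.empty(128, 64)                  # e.g. a Linear weight (out_features, in_features)
    init.kaiming_uniform_(w, a=math.sqrt(5))

    fan_in, _ = init._calculate_fan_in_fan_out(w)
    assert w.abs().max().item() <= 1 / math.sqrt(fan_in)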

     n *= k
     stdv = 1. / math.sqrt(n)
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)


 def reset_parameters(self):
     stdv = 1. / math.sqrt(self.weight.size(1))
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)


 def reset_parameters(self):
     stdv = 1. / math.sqrt(self.weight.size(1))
-    self.weight.data.uniform_(-stdv, stdv)
+    init.uniform_(self.weight, -stdv, stdv)


-    self.bias.data.uniform_(-stdv, stdv)
+    fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+    bound = 1 / math.sqrt(fan_in)
+    init.uniform_(self.bias, -bound, bound)


-    self.bias.data.uniform_(-stdv, stdv)
+    fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+    bound = 1 / math.sqrt(fan_in)
+    init.uniform_(self.bias, -bound, bound)


-    self.weight.data.uniform_(-stdv, stdv)
+    fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
+    bound = 1 / math.sqrt(fan_in)
+    init.uniform_(self.weight, -bound, bound)
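
Taken together, the suggestions in this review amount to something like the following for a Linear-style module (an illustrative sketch only, not necessarily the exact code that was merged):

    import math
    import torch
    from torch import nn
    from torch.nn import init

    class SketchLinear(nn.Module):
        def __init__(self, in_features, out_features):
            super(SketchLinear, self).__init__()
            self.weight = nn.Parameter(torch.empty(out_features, in_features))
            self.bias = nn.Parameter(torch.empty(out_features))
            self.reset_parameters()

        def reset_parameters(self):
            # Weight: kaiming_uniform_ with fan_in, equivalent to U(-1/sqrt(fan_in), 1/sqrt(fan_in)).
            init.kaiming_uniform_(self.weight, a=math.sqrt(5))
            # Bias: uniform with the same fan_in-derived bound.
            fan_in, _ = init._calculate_fan_in_fan_out(self.weight)
            bound = 1 / math.sqrt(fan_in)
            init.uniform_(self.bias, -bound, bound)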


@fmassa (Member) left a comment

There are a few things that still look a bit weird, but changing them might require backward-incompatible changes, so I'm OK with how it looks now. Thanks!

@soumith (Contributor) commented Jul 4, 2018

@pytorchbot test this please

@vishwakftw (Contributor, Author)

I have one minor concern: would this make initialization slower in the case of Conv and Linear, by having to compute fan_in and fan_out twice?

@fmassa (Member) commented Jul 4, 2018

Computing fan_in and fan_out is practically a free operation, and only involves a few multiplications, so I wouldn't worry about it.
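
For reference, the shape arithmetic involved is just a handful of integer multiplications over the weight's dimensions, along these lines (a sketch mirroring the idea behind init._calculate_fan_in_fan_out, not the library code itself):

    import torch

    def fan_in_fan_out(weight):
        if weight.dim() == 2:                       # Linear weight: (out_features, in_features)
            return weight.size(1), weight.size(0)
        receptive_field = 1                         # Conv weight: (out_ch, in_ch, *kernel_size)
        for s in weight.shape[2:]:
            receptive_field *= s
        fan_in = weight.size(1) * receptive_field
        fan_out = weight.size(0) * receptive_field
        return fan_in, fan_out

    print(fan_in_fan_out(torch.empty(64, 3, 3, 3)))  # (27, 576) for a 3x3 conv mapping 3 -> 64 channels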

@vishwakftw (Contributor, Author)

Is this good to go?

@ezyang (Contributor) commented Jul 8, 2018

Sadly, it seems to be failing tests now.

@facebook-github-bot (Contributor) left a comment

@ezyang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@vishwakftw (Contributor, Author)

Oh, that was my bad. I have fixed them now.

@vishwakftw (Contributor, Author)

Is this good to go?

@facebook-github-bot (Contributor) left a comment

@weiyangfb has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@zou3519 (Contributor) commented Jul 11, 2018

@vishwakftw It should be good to go; it looks like @weiyangfb is working on merging it.

@vishwakftw (Contributor, Author)

@weiyangfb gentle reminder. Sorry.

@vishwakftw (Contributor, Author)

Is this good to go?

facebook-github-bot pushed a commit that referenced this pull request Jul 24, 2018
Summary: This closes #6906.

Reviewed By: ezyang

Differential Revision: D8698632

Pulled By: weiyangfb

fbshipit-source-id: 259c1dbdc264a8e9f83e196fa72d135babd97d48
@vishwakftw closed this Jul 24, 2018
@vishwakftw deleted the default-init-scheme-docs branch July 24, 2018 19:05
jramseyer pushed a commit to jramseyer/pytorch that referenced this pull request Jul 30, 2018
goodlux pushed a commit to goodlux/pytorch that referenced this pull request Aug 15, 2018
@Atcold (Contributor) commented Nov 15, 2018

Why Kaiming over Xavier?

Successfully merging this pull request may close these issues:

Can we add the default initialization of different modules in the docs?