C++/Python API Parity: add AlphaDropout and FeatureAlphaDropout #28424
Conversation
yf225 left a comment
@suyash458 Thanks so much for the awesome work! I left some comments.
@suyash458 Thanks so much for the awesome work! I think we can change the arguments in DropoutOptions to include p and inplace instead of rate to match the Python version better. And we can have using AlphaDropoutOptions = DropoutOptions; and then use AlphaDropoutOptions everywhere in the AlphaDropout code :D
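For reference, a minimal sketch (not the PR's exact header) of what the updated options could look like with `p`, `inplace`, and the `AlphaDropoutOptions` alias; the constructor and file layout here are assumptions:

#include <torch/arg.h>

namespace torch {
namespace nn {

/// Options for `Dropout` and, via the alias below, for `AlphaDropout`.
struct DropoutOptions {
  /* implicit */ DropoutOptions(double p = 0.5) : p_(p) {}

  /// Probability of an element being dropped, matching the Python `p` argument.
  TORCH_ARG(double, p);

  /// Whether to perform the operation in place, matching the Python `inplace` flag.
  TORCH_ARG(bool, inplace) = false;
};

/// AlphaDropout takes the same options, so a simple alias suffices.
using AlphaDropoutOptions = DropoutOptions;

} // namespace nn
} // namespace torch

Keeping the implicit double constructor would preserve existing `Dropout(0.5)`-style call sites.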
@yf225 Thanks for having a look! Adding p and inplace makes sense, I'll add the extra params to DropoutOptions. This'll affect DropoutImpl and FeatureDropoutImpl as well
Yes and I think we should rename rate to p in DropoutOptions, and it is okay to break backward compatibility in favor of code simplicity and parity with Python API
I believe we can change

inline Tensor alpha_dropout(const Tensor& input, const DropoutOptions& options, bool is_training) {

into

inline Tensor alpha_dropout(const Tensor& input, const AlphaDropoutOptions& options = {}, bool training = false) {

so that it matches the Python names better and is more consistent with the naming of other functional options :D
It would be awesome to express the same logic as in the Python version:
if p < 0. or p > 1.:
    raise ValueError("dropout probability has to be between 0 and 1, "
                     "but got {}".format(p))
return (_VF.alpha_dropout_(input, p, training)
        if inplace
        else _VF.alpha_dropout(input, p, training))
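For reference, a hedged sketch of how `F::alpha_dropout` could mirror that Python logic (range check plus in-place dispatch), assuming the `p`/`inplace` options discussed above; the namespace name is a placeholder and the merged signature may differ:

#include <torch/torch.h>

namespace functional_sketch {

inline torch::Tensor alpha_dropout(
    torch::Tensor input,
    const torch::nn::DropoutOptions& options = {},
    bool training = false) {
  // Same range check as the Python ValueError.
  TORCH_CHECK(
      options.p() >= 0 && options.p() <= 1,
      "dropout probability has to be between 0 and 1, but got ", options.p());
  // Dispatch to the in-place or out-of-place ATen kernel, as in Python.
  return options.inplace()
      ? torch::alpha_dropout_(input, options.p(), training)
      : torch::alpha_dropout(input, options.p(), training);
}

} // namespace functional_sketch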
Just wondering, wouldn't this bit of code in dropout.cpp take care of the check for the p value?
DropoutImplBase<Derived>::DropoutImplBase(const DropoutOptions& options_)
    : options(options_) {
  TORCH_CHECK(options.rate() >= 0, "Dropout rate must not be less than zero");
  TORCH_CHECK(options.rate() <= 1, "Dropout rate must not be greater than one");
}
Yes I think it would take care of the check when the user is calling F::alpha_dropout from the AlphaDropout module, but the user might also call F::alpha_dropout directly, and they would be out of luck :D
Right, I've added the same checks in the F::alpha_dropout function
I believe

return torch::alpha_dropout(input, options.rate(), this->is_training());

can be

return F::alpha_dropout(input, options.p(), this->is_training());
to match the Python version even better :D
We might want to print both p and inplace, to match the Python version even better :D
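For reference, a rough sketch of how forward() and pretty_print() could look, assuming the `F::alpha_dropout` signature suggested above; the struct name carries a Sketch suffix to mark it as illustrative rather than the PR's actual class, and the printed format is an assumption:

#include <torch/torch.h>

namespace F = torch::nn::functional;

struct AlphaDropoutSketchImpl : torch::nn::Module {
  explicit AlphaDropoutSketchImpl(torch::nn::DropoutOptions options_ = {})
      : options(std::move(options_)) {}

  torch::Tensor forward(torch::Tensor input) {
    // Delegate to the functional so module and functional share one code path.
    return F::alpha_dropout(input, options, this->is_training());
  }

  // Print both fields, mirroring the Python __repr__.
  void pretty_print(std::ostream& stream) const override {
    stream << std::boolalpha
           << "torch::nn::AlphaDropout(p=" << options.p()
           << ", inplace=" << options.inplace() << ")";
  }

  torch::nn::DropoutOptions options;
};

TORCH_MODULE(AlphaDropoutSketch);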
test/cpp/api/functional.cpp (outdated)
Thanks so much for the awesome tests! It would be great if we could also add a test for the F::alpha_dropout(input) (i.e. no value passed for either options or training) use case :D
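For reference, a hedged sketch of that default-arguments test; the FunctionalTest fixture name follows the existing test/cpp/api/functional.cpp conventions, and the no-op-when-not-training assertion is an assumption about the default behaviour:

TEST_F(FunctionalTest, AlphaDropoutDefaultOptions) {
  auto input = torch::ones(100);
  // Neither options nor the training flag are passed.
  auto output = F::alpha_dropout(input);
  // With training defaulting to false, alpha dropout should leave the input untouched.
  ASSERT_TRUE(torch::allclose(input, output));
}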
Ah right, I missed that. Will add a test case for it
Thanks a lot for your help!
test/cpp/api/modules.cpp (outdated)
It would be awesome to add a module test similar to pytorch/test/cpp/api/modules.cpp, lines 897 to 911 in e31adeb:
TEST_F(ModulesTest, Dropout) {
  Dropout dropout(0.5);
  torch::Tensor x = torch::ones(100, torch::requires_grad());
  torch::Tensor y = dropout(x);
  y.backward(torch::ones_like(y));
  ASSERT_EQ(y.ndimension(), 1);
  ASSERT_EQ(y.size(0), 100);
  ASSERT_LT(y.sum().item<float>(), 130); // Probably
  ASSERT_GT(y.sum().item<float>(), 70); // Probably
  dropout->eval();
  y = dropout(x);
  ASSERT_EQ(y.sum().item<float>(), 100);
}
I have added a test similar to the one for Dropout; a sketch of it follows the quoted paragraph below. From what I read in this paper, the sum of the output should lie between 40 and 130:
New Dropout Technique. Standard dropout randomly sets an activation x to zero with probability 1 − q for 0 < q ≤ 1. In order to preserve the mean, the activations are scaled by 1/q during training. If x has mean E(x) = μ and variance Var(x) = ν, and the dropout variable d follows a binomial distribution B(1, q), then the mean E((1/q)dx) = μ is kept. Dropout fits well to rectified linear units, since zero is in the low variance region and corresponds to the default value. For scaled exponential linear units, the default and low variance value is lim_{x→−∞} selu(x) = −λα = α′. Therefore, we propose “alpha dropout”, that randomly sets inputs to α′. The new mean and new variance is E(xd + α′(1 − d)) = qμ + (1 − q)α′, and Var(xd + α′(1 − d)) = q((1 − q)(α′ − μ)² + ν). We aim at keeping mean and variance to their original values after “alpha dropout”, in order to ensure the self-normalizing property even for “alpha dropout”. The affine transformation a(xd + α′(1 − d)) + b allows to determine parameters a and b such that mean and variance are kept to their values: E(a(xd + α′(1 − d)) + b) = μ and Var(a(xd + α′(1 − d)) + b) = ν. In contrast to dropout, a and b will depend on μ and ν, however our SNNs converge to activations with zero mean and unit variance. With μ = 0 and ν = 1, we obtain a = (q + α′²q(1 − q))^(−1/2) and b = −(q + α′²q(1 − q))^(−1/2)((1 − q)α′). The parameters a and b only depend on the dropout rate 1 − q and the most negative activation α′.
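For reference, a hedged sketch of such a module test, mirroring the Dropout test quoted earlier and using the 40–130 bounds mentioned above; the fixture and module names are assumed to follow the existing test/cpp/api/modules.cpp conventions:

TEST_F(ModulesTest, AlphaDropout) {
  AlphaDropout alpha_dropout(0.5);
  torch::Tensor x = torch::ones(100, torch::requires_grad());
  torch::Tensor y = alpha_dropout(x);
  y.backward(torch::ones_like(y));
  ASSERT_EQ(y.ndimension(), 1);
  ASSERT_EQ(y.size(0), 100);
  ASSERT_LT(y.sum().item<float>(), 130); // Probably
  ASSERT_GT(y.sum().item<float>(), 40);  // Probably
  alpha_dropout->eval();
  y = alpha_dropout(x);
  ASSERT_EQ(y.sum().item<float>(), 100);
}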
Thanks so much for adding the test! :D
nit: would be awesome to put this line above #include <torch/nn/functional/embedding.h>, following alphabetical order
@yf225 I have added another commit with the required changes, lemme know if they look good!
yf225 left a comment
@suyash458 Thanks so much for the update! I left some minor comments.
test/cpp/api/functional.cpp (outdated)
Thanks a lot for your help!
To strictly match the Python version, maybe we could do the following instead:
TORCH_CHECK(
    options.p() >= 0 && options.p() <= 1,
    "dropout probability has to be between 0 and 1, but got ", options.p());
In the Python version, I think we execute different code paths based on the value of options.inplace():
return (_VF.alpha_dropout_(input, p, training)
        if inplace
        else _VF.alpha_dropout(input, p, training))

It would be awesome to do the same in the C++ version as well (we might need to change the type of input from const Tensor& to Tensor in order to achieve this) :D
In Functions.h I see two declarations for alpha_dropout
static inline Tensor & alpha_dropout_(Tensor & self, double p, bool train) {...}
static inline Tensor alpha_dropout(const Tensor & self, double p, bool train) {...}

Guessing we have to use the first one for the in-place operation?
nit: we can remove the new line here
I think we can put the default value for inplace as part of the TORCH_ARG declaration instead, something like
TORCH_ARG(bool, inplace) = false;

instead of specifying it as a default value for the constructor argument. (Ideally we should do the same for p as well, but there are probably too many people using Dropout(0.5) directly, and we have to preserve it for better backward compatibility.)
Makes sense, I'll change the TORCH_ARG declaration
To strictly match the Python version, maybe we could do the following instead:
TORCH_CHECK(
    options.p() >= 0 && options.p() <= 1,
    "dropout probability has to be between 0 and 1, but got ", options.p());
test/cpp/api/modules.cpp (outdated)
Thanks so much for adding the test! :D
We will probably need to rebase the PR because of the conflicts :(
- add `AlphaDropoutImpl` to `modules/dropout.h` and `modules/dropout.cpp`
- add `functional/dropout.h` containing the `alpha_dropout` function
- include `functional/dropout.h` in `nn/functional.h`
- update `DropoutOptions` to have params `p` and `inplace` instead of `rate`
- update `DropoutImpl` and `FeatureDropoutImpl` to use the new `DropoutOptions`
- add `AlphaDropoutOptions` for use in `AlphaDropoutImpl`
- update `torch::alpha_dropout` call to `F::alpha_dropout` in `modules/dropout.cpp`
- add checks in `functional/dropout.cpp` to check the dropout rate
- add functional/module/prettyprint tests and update existing `Dropout` tests to work with the new `DropoutOptions`
Rebasing to resolve conflicts in test files
@yf225 is the current build failing on Linux? I get an error saying … Here's the complete stack trace
…unctional/dropout.h`
- update input params for `forward()` from `const Tensor&` to `Tensor` in `DropoutImpl`
- update `functional/dropout.h` to call different implementations based on the `inplace` param
- move the `inplace` default value to the `TORCH_ARG` declaration
@yf225 lemme know if this looks fine, I'll rebase to fix the conflicts then
facebook-github-bot left a comment
@yf225 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
yf225 left a comment
@suyash458 Thanks a lot for the awesome work!
Glad I could help!