Add high order gradient support for activation function #1496
Conversation
@pytorchbot test this please
@pytorchbot test this please
Are there tests for double backprop that could be easily added?
Per-op double backprop tests are unnecessary. We only use first-order Jacobian-vector-product functions to compute grads of any order, so as long as the first-order gradients are correct, everything should be fine (assuming the autograd code is correct, but we have separate tests for that).
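For illustration, a minimal sketch of this idea using the modern `torch.autograd.grad` API (not part of the original thread, and not the Variable-era code this PR modifies): a higher-order gradient is obtained by differentiating the first-order backward pass again with `create_graph=True`.

```python
import torch

# Sketch only: higher-order grads come from differentiating the
# first-order backward pass again, not from per-order formulas.
x = torch.randn(5, dtype=torch.double, requires_grad=True)
y = torch.sigmoid(x).sum()

# First-order gradient; create_graph=True records the backward pass itself.
(grad_x,) = torch.autograd.grad(y, x, create_graph=True)

# Second-order gradient: differentiate the first-order gradient once more.
(grad2_x,) = torch.autograd.grad(grad_x.sum(), x)

# Analytic check: sigmoid' = s(1 - s), sigmoid'' = s(1 - s)(1 - 2s).
s = torch.sigmoid(x)
assert torch.allclose(grad_x, s * (1 - s))
assert torch.allclose(grad2_x, s * (1 - s) * (1 - 2 * s))
```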
That's not true for code that behaves differently when the grad is volatile, as in this PR.
I think we should add a gradgradcheck, just like gradcheck. We don't know out of the box whether new-style functions have been written correctly for grad of grad (for example, a user may have rewrapped a Variable somewhere and thought it was okay).
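A hedged sketch of what such a check could look like, assuming the `gradgradcheck` utility as it exists in current `torch.autograd` (not the codebase as it was at the time of this PR):

```python
import torch
from torch.autograd import gradcheck, gradgradcheck

# Double-precision inputs keep the finite-difference comparison stable.
x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)

# gradcheck compares analytic first-order grads against finite differences;
# gradgradcheck does the same for the gradient of the backward pass, so a
# backward that silently detaches from the graph would fail here.
assert gradcheck(torch.sigmoid, (x,))
assert gradgradcheck(torch.sigmoid, (x,))
```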
I think that instead of computing a full Hessian of each op (these tests would be soooo slooow) we could just add some simple clauses to gradcheck that make sure that there exists a path from …
* master:
  * Add F.normalize (pytorch#1467)
  * Expose custom attributes from C++ functions (pytorch#1430)
  * Add high order gradient support for Sigmoid (pytorch#1496)
* Minor fix for trivial reductions. Co-authored-by: Naoya Maruyama <[email protected]>
[WIP] Add high order gradient support for the sigmoid function, resolving issue #1483
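To illustrate what "high order gradient support" for an activation means, here is a small, hypothetical sketch using today's `torch.autograd.Function` API (the actual PR changes the Variable-era function definitions, not this code): the backward must be written as differentiable tensor ops on tensors still connected to the graph, so autograd can differentiate the backward pass itself.

```python
import torch

class MySigmoid(torch.autograd.Function):
    """Hypothetical sigmoid whose backward is itself differentiable."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)  # save the input, which keeps its graph connectivity
        return torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        s = torch.sigmoid(x)  # recomputed here, so it is tracked when create_graph=True
        # Ordinary differentiable ops, so this backward can be backpropagated through.
        return grad_output * s * (1 - s)

x = torch.randn(3, dtype=torch.double, requires_grad=True)
(g,) = torch.autograd.grad(MySigmoid.apply(x).sum(), x, create_graph=True)
(gg,) = torch.autograd.grad(g.sum(), x)

# Second derivative of sigmoid: s(1 - s)(1 - 2s).
s = torch.sigmoid(x)
assert torch.allclose(gg, s * (1 - s) * (1 - 2 * s))
```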