Add SELU activation function #1769
apaszke
left a comment
All contbuilds are failing
    def __init__(self, inplace=False):
        super(SELU, self).__init__(inplace)
        self.alpha = 1.6732632423543772848170429916717
        self.scale = 1.0507009873554804934193349852946
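For reference, a minimal scalar sketch of the function these two constants parameterize, assuming the SELU definition from the Self-Normalizing Neural Networks paper. The names `ALPHA`, `SCALE`, and `selu` are illustrative only and are not taken from the PR's code, which operates on tensors.

```python
import math

# Constants copied from the diff above; a Python float keeps about 17 significant digits.
ALPHA = 1.6732632423543772848170429916717
SCALE = 1.0507009873554804934193349852946


def selu(x):
    """Scalar SELU: scale * x for x > 0, scale * alpha * (exp(x) - 1) otherwise."""
    if x > 0:
        return SCALE * x
    # expm1 computes exp(x) - 1 without losing precision near zero.
    return SCALE * ALPHA * math.expm1(x)
```

For large negative inputs the output saturates at `-SCALE * ALPHA`, while positive inputs are simply scaled by `SCALE`.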
torch/nn/modules/activation.py
Outdated
    More details can be found in the paper `Self-Normalizing Neural Networks`_ .
    Args:
        inplace: can optionally do the operation in-place
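To illustrate what the `inplace` argument means, here is a small sketch on plain Python lists rather than tensors. The helper `selu_list` is hypothetical and is not part of the PR; it only shows the difference between overwriting the input buffer and allocating a new one.

```python
import math

ALPHA = 1.6732632423543772848170429916717
SCALE = 1.0507009873554804934193349852946


def selu_list(values, inplace=False):
    """Apply SELU elementwise; reuse the input list when inplace=True."""
    # With inplace=True the result is written back into `values` itself;
    # otherwise a fresh list is allocated and the input is left untouched.
    out = values if inplace else [0.0] * len(values)
    for i, v in enumerate(values):
        out[i] = SCALE * v if v > 0 else SCALE * ALPHA * math.expm1(v)
    return out
```

Calling `selu_list(xs)` returns a new list, while `selu_list(xs, inplace=True)` returns `xs` itself with its contents overwritten.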
Yeah, I just saw that the tests are failing because there is no |
No, you only have to move the test spec from |
It also supports double backprop, verified with gradgradcheck.
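The idea behind such a check can be sketched without PyTorch: compare hand-written first and second derivatives of SELU against central finite differences. This is a scalar stand-in for illustration, not how `gradgradcheck` is implemented, and every function name below is hypothetical.

```python
import math

ALPHA = 1.6732632423543772848170429916717
SCALE = 1.0507009873554804934193349852946


def selu(x):
    return SCALE * x if x > 0 else SCALE * ALPHA * math.expm1(x)


def selu_grad(x):
    # First derivative: scale for x > 0, scale * alpha * exp(x) otherwise.
    return SCALE if x > 0 else SCALE * ALPHA * math.exp(x)


def selu_gradgrad(x):
    # Second derivative: 0 for x > 0, scale * alpha * exp(x) otherwise.
    return 0.0 if x > 0 else SCALE * ALPHA * math.exp(x)


def central_diff(f, x, eps=1e-6):
    """Numerical derivative of f at x via a symmetric difference."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)


def check_derivatives(points, tol=1e-5):
    """Return True if both analytic derivatives match finite differences."""
    for x in points:
        if abs(central_diff(selu, x) - selu_grad(x)) > tol:
            return False
        if abs(central_diff(selu_grad, x) - selu_gradgrad(x)) > tol:
            return False
    return True
```

The test points should stay away from `x = 0`, where the derivative of SELU jumps and a finite difference straddling the kink would not match either branch.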
Added new-style Function to |
    class SELU(InplaceFunction):
        alpha = 1.6732632423543772848170429916717
        scale = 1.0507009873554804934193349852946
…08e7e3

Summary:
Previous import was dc75285d4a1cff9618400164dfdb26c5a1bab70a

Included changes:
- **[15c33c9](onnx/onnx@15c33c9)**: Add ppc64le build (pytorch#1768) <Chin Huang>
- **[198f840](onnx/onnx@198f840)**: Update Broadcasting.md (pytorch#1769) <Verma-Rajat>
- **[60ac95f](onnx/onnx@60ac95f)**: Merge back from release 1.4.1 (pytorch#1767) <Raymond Yang>
- **[a683372](onnx/onnx@a683372)**: Bump up version number for v1.4.0 (pytorch#1761) (pytorch#1763) <Raymond Yang>
- **[dbf3581](onnx/onnx@dbf3581)**: Add TfIdfVectorizer operator to ONNX (pytorch#1721) <Dmitri Smirnov>

Differential Revision: D13858840
fbshipit-source-id: 90b2e21c80de4936507a27fc93d0879128ab4fb7
…08e7e3 (#16493)

Summary:
Pull Request resolved: #16493

Previous import was dc75285d4a1cff9618400164dfdb26c5a1bab70a

Included changes:
- **[15c33c9](onnx/onnx@15c33c9)**: Add ppc64le build (#1768) <Chin Huang>
- **[198f840](onnx/onnx@198f840)**: Update Broadcasting.md (#1769) <Verma-Rajat>
- **[60ac95f](onnx/onnx@60ac95f)**: Merge back from release 1.4.1 (#1767) <Raymond Yang>
- **[a683372](onnx/onnx@a683372)**: Bump up version number for v1.4.0 (#1761) (#1763) <Raymond Yang>
- **[dbf3581](onnx/onnx@dbf3581)**: Add TfIdfVectorizer operator to ONNX (#1721) <Dmitri Smirnov>

Reviewed By: zrphercule
Differential Revision: D13858840
fbshipit-source-id: 1d00f63f265cc6deed965b92ed00c44f547ff03e
This refactors `TransformPropagator` into a three-level abstraction: `TransformPropagator` is a subclass of `MaxRootDomainInfoPropagator`, which in turn is a subclass of `MaxInfoPropagator`. `MaxInfoPropagator` implements the Dijkstra algorithm for propagating on the DAG, but it has no knowledge of what "information" we are trying to preserve or what we are propagating. `MaxRootDomainInfoPropagator` inherits from `MaxInfoPropagator` to preserve the most root/rfactor domain information, but it has no knowledge of what we are propagating. `TransformPropagator` further inherits from `MaxRootDomainInfoPropagator` and propagates transformations of leaf IDs.
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:
- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration on all possible paths to reduce compilation time on transform propagation;
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690)
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  3. no-op binary removal within fused graph
  4. expand supported in fusion

Squashed commits to WAR github API

Commits that are actually in this PR from the devel branch:
```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: #80355
Approved by: https://github.com/davidberard98
Summary:
Syncing nvfuser devel branch to upstream master. https://github.com/csarofeen/pytorch/

Code changes include:
- TransformPropagator refactor: switched to Dijkstra instead of exhaustive enumeration on all possible paths to reduce compilation time on transform propagation;
- Indexing refactor: remove reference tensor creation in all tensor indexing logic (#1690)
- (more) generic grouped grid reduction kernel;
- Minor parser/fuser patches:
  1. zero-dim tensor reduction support
  3. no-op binary removal within fused graph
  4. expand supported in fusion

Squashed commits to WAR github API

Commits that are actually in this PR from the devel branch:
```
a054b3e Refactor TransormPropagator to allow specifying a position and propagating to part of the DAG (#1775)
d67e1cd Indexing refactor stage 1: remove reference tensor creation in all tensor indexing logic (#1690)
1b65299 Issue 1770 (#1774)
35b0427 Avoid compilation errors like below: (#1773)
452c773 Ignore reductions of zero-dim tensors per PyTorch conventions (#1771)
31d6c56 TransformPropagator refactor (#1769)
570c5a8 Merge pull request #1767 from csarofeen/upstream_merge_0621
9d6c3d8 merging upstream 61305cd
0ed815f New TransformPropagator algorithm (#1763)
6c19520 no-op binary removal (#1764)
ec7fa41 Proper propagation of IterType (#1762)
b263562 Fix dimensionality check (#1759)
2d6343f More generic grouped grid reduction kernel (#1740)
64e2b56 [nvfuser] prevent spamming warning message (#77777) (#1758)
0c43162 [nvFuser] Improving bitwise ops support (#77158) (#1757)
b93a147 Parser expand (#1754)
```

RUN_TORCHBENCH: nvfuser
Pull Request resolved: #80355
Reviewed By: qihqi
Differential Revision: D37573400
Pulled By: davidberard98
fbshipit-source-id: 52ab68d89ec01ef61f69f5abeb18c9d3a312aa64
Fixes #1768
Also fixes a problem with RReLU in inplace mode.