[MPS] Fixes GELU, LeakyRELU and MISH on non-contiguous tensors #123049
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/123049
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (2 unrelated failures) As of commit df59052 with merge base 14162ee:
FLAKY - The following jobs failed but were likely due to flakiness present on trunk: …
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Force-pushed from 0ff6d76 to 3933af5.
@malfet @kulinseth Could you have a look? Thanks!
A test failed, but it seems unrelated to this PR. Should I rebase on viable/strict?
@pytorchbot --help
PyTorchBot Help: Merge | Revert | Rebase | Label | Dr CI | cherry-pick | Close
usage: @pytorchbot close - Close a PR [Can be used on issues]
@pytorchbot rebase
The test is indeed unrelated; I have kicked off a rebase.
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
…RELU and Mish activations
Successfully rebased.
Force-pushed from f0493f3 to df59052.
I will keep an eye out; if there is still a problem, I will merge the PR. Otherwise, if it looks green, please go ahead and merge, @jtang98.
@pytorchbot merge -f “all tests are green”
❌ 🤖 pytorchbot command failed: Try …
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Similar to #123049; however, `SiLU` also produces random values, `0.0`, or `NaN` as results if the input tensor is not contiguous, on macOS versions prior to 15.0. Originally the problem was found at jy0205/Pyramid-Flow#113. Pull Request resolved: #139006. Approved by: https://github.com/malfet
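For illustration only (not code from either PR), a minimal reproducer for the non-contiguous `SiLU` case might look like the sketch below; it assumes a machine with an MPS device, and the mismatch is only expected on macOS versions prior to 15.0 without the fix.

```python
# Hedged reproducer sketch: compare CPU and MPS SiLU on a non-contiguous
# (transposed) input. Before the fix, the MPS result could contain random
# values, 0.0, or NaN on macOS < 15.0.
import torch
import torch.nn.functional as F

if torch.backends.mps.is_available():
    x = torch.randn(4, 8)
    ref = F.silu(x.t())                    # CPU, non-contiguous view
    out = F.silu(x.to("mps").t()).cpu()    # MPS, same non-contiguous view
    print(torch.allclose(ref, out, atol=1e-5))  # True once the fix is in place
```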
Fixes the GELU, LeakyReLU and Mish activation functions on non-contiguous tensors (for instance, when a transpose operation was applied to the tensors prior to the MPS operator), for both the forward and backward passes.
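As an illustration of the symptom (a sketch, not code from the PR), the three activations can be compared against the CPU on a transposed, hence non-contiguous, input; this assumes an MPS-capable machine.

```python
# Illustrative sketch (not from the PR): check MPS vs. CPU for the three
# fixed activations on a non-contiguous (transposed) input.
import torch
import torch.nn.functional as F

if torch.backends.mps.is_available():
    x = torch.randn(16, 32)
    for fn in (F.gelu, F.leaky_relu, F.mish):
        ref = fn(x.t())                      # CPU reference on the transposed view
        out = fn(x.to("mps").t()).cpu()      # MPS result on the same view
        print(fn.__name__, torch.allclose(ref, out, atol=1e-4))
```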
I also extended the tests for the three activation functions to check full precision and half precision, contiguous and non-contiguous inputs, and several tensor shapes: scalars, 1D, empty, 2D, and > 3D.
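A rough outline of that kind of coverage is sketched below; the real tests live in `test/test_mps.py` and are structured differently, so treat this as a sketch under the assumption that an MPS device is present.

```python
# Hedged outline of the coverage described above (not the actual test code).
import torch
import torch.nn.functional as F

def check_activation(fn, atol=1e-2):
    # Loose tolerance so the half-precision comparisons pass comfortably.
    if not torch.backends.mps.is_available():
        return
    shapes = [(), (7,), (0,), (3, 5), (2, 3, 4, 5)]      # scalar, 1D, empty, 2D, >3D
    for dtype in (torch.float32, torch.float16):
        for shape in shapes:
            cpu_x = torch.randn(shape, dtype=dtype)
            mps_x = cpu_x.to("mps")
            for transpose in (False, True):
                a, b = cpu_x, mps_x
                if transpose and cpu_x.dim() >= 2:       # make both views non-contiguous
                    a, b = cpu_x.transpose(0, 1), mps_x.transpose(0, 1)
                ref = fn(a.float())                      # CPU reference in float32
                out = fn(b).cpu().float()
                assert torch.allclose(ref, out, atol=atol)

for fn in (F.gelu, F.leaky_relu, F.mish):
    check_activation(fn)
```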
I had issues with the Mish and GELU activations when asserting the gradients against the CPU with sum() in some cases, so I reverted to the previous setup of passing an explicit gradient to .backward().
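For reference, the two gradient-check styles look roughly like the sketch below (assuming an MPS device); the explicit-gradient form is the one described above.

```python
# Sketch of the two gradient-check styles on a non-contiguous input.
import torch
import torch.nn.functional as F

if torch.backends.mps.is_available():
    x_cpu = torch.randn(4, 6, requires_grad=True)
    x_mps = x_cpu.detach().to("mps").requires_grad_(True)

    # Style 1: reduce to a scalar so backward() needs no argument.
    F.mish(x_cpu.t()).sum().backward()
    F.mish(x_mps.t()).sum().backward()

    # Style 2 (used here): pass an explicit gradient matching the output shape.
    x_cpu.grad, x_mps.grad = None, None
    grad_out = torch.ones(6, 4)                       # shape of x.t()
    F.mish(x_cpu.t()).backward(gradient=grad_out)
    F.mish(x_mps.t()).backward(gradient=grad_out.to("mps"))

    print(torch.allclose(x_cpu.grad, x_mps.grad.cpu(), atol=1e-4))
```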
This PR also fixes an issue with LeakyReLU on empty tensors.
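The empty-tensor case is the degenerate one below (a minimal sketch assuming an MPS device); with the fix it should simply return an empty result.

```python
# Minimal sketch of the empty-tensor case for LeakyReLU on MPS.
import torch
import torch.nn.functional as F

if torch.backends.mps.is_available():
    empty = torch.empty(0, device="mps")
    out = F.leaky_relu(empty, negative_slope=0.2)
    print(out.shape)  # torch.Size([0])
```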
Fixes #98212, huggingface/transformers#22468, huggingface/transformers#19353