Relax tolerance for test_quick_baddbmm_cpu_complex64 #152424
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/152424.
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit b489969 with merge base 6737e2c.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed: trunk / linux-focal-rocm-py3.10 / test (default, 1, 2, linux.rocm.gpu.2). Details for Dev Infra team: raised by workflow job.
This is the failure: it is hanging in the ROCm setup step. Can we ignore that for the merge?
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
Merge failed. Reason: 1 job has failed: trunk / linux-focal-rocm-py3.10 / test (default, 1, 2, linux.rocm.gpu.2). Details for Dev Infra team: raised by workflow job.
@Skylion007 The failure seems to be unrelated, and there is no message about what exactly failed. What should we do with that?
Looks like this PR hasn't been updated in a while, so we're going to go ahead and mark this as stale.
Any updates here?
@pytorchbot rebase
@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here.
On Zen 2 (AMD EPYC) and Intel Sapphire Rapids this test fails with small differences when compiled with native-targeted optimizations, i.e. it fails with `-march=znver2` but succeeds with `-march=znver1`.
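For context, here is a minimal sketch of how such a deviation can be quantified, comparing the complex64 result against a complex128 reference. The shapes, seed, and alpha/beta scalars are assumptions for illustration, not the configuration used by `test_quick_baddbmm_cpu_complex64`:

```python
# Hypothetical sketch: measure the deviation of complex64 baddbmm against a
# complex128 reference. Shapes, seed, and alpha/beta are illustrative
# assumptions, not the values used by the actual test.
import torch

torch.manual_seed(0)
b, n, m, p = 3, 4, 5, 6
inp = torch.randn(b, n, p, dtype=torch.complex64)
batch1 = torch.randn(b, n, m, dtype=torch.complex64)
batch2 = torch.randn(b, m, p, dtype=torch.complex64)

# baddbmm computes out = beta * inp + alpha * (batch1 @ batch2), batched.
out32 = torch.baddbmm(inp, batch1, batch2, beta=0.5, alpha=2.0)
out64 = torch.baddbmm(
    inp.to(torch.complex128),
    batch1.to(torch.complex128),
    batch2.to(torch.complex128),
    beta=0.5,
    alpha=2.0,
)

diff = (out32.to(torch.complex128) - out64).abs()
print("greatest absolute difference:", diff.max().item())
print("greatest relative difference:", (diff / out64.abs()).max().item())
```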
Successfully rebased: 19c6b89 to b489969.
@pytorchbot merge
Merge started: Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
On Zen 2 (AMD EPYC) and Intel Sapphire Rapids this test fails with small differences when compiled with native-targeted optimizations, i.e. it fails with `-march=znver2` but succeeds with `-march=znver1`. I assume some operator fusion is being applied by GCC; small differences such as the use of `vmovdqa` can be seen in the minimized code of the baddbmm kernel: https://godbolt.org/z/jsxMa91Wb

The greatest differences are consistent and identical on both CPU architectures:

```
Greatest absolute difference: 3.43852152582258e-05 at index (1, 2, 1) (up to 1e-05 allowed)
Greatest relative difference: 3.6034286949870875e-06 at index (1, 2, 1) (up to 1.3e-06 allowed)
```

Hence I assume this is within the expected tolerances, especially as `complex128` and all other types pass.

Pull Request resolved: pytorch#152424
Approved by: https://github.com/malfet
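For illustration, a relaxed comparison along these lines could look as follows with the public `torch.testing.assert_close` API. The rtol/atol values and the higher-precision reference are assumptions chosen to cover the reported differences, not necessarily what the PR itself changes:

```python
# Hedged sketch: a baddbmm check with tolerances widened beyond the complex64
# defaults (rtol=1.3e-6, atol=1e-5) that flagged the ~3.4e-05 / ~3.6e-06
# deviations quoted above. The values below are illustrative assumptions.
import torch

torch.manual_seed(0)
inp = torch.randn(2, 3, 3, dtype=torch.complex64)
b1 = torch.randn(2, 3, 4, dtype=torch.complex64)
b2 = torch.randn(2, 4, 3, dtype=torch.complex64)

actual = torch.baddbmm(inp, b1, b2)
# Reference computed in complex128, then cast back for the comparison.
expected = torch.baddbmm(
    inp.to(torch.complex128), b1.to(torch.complex128), b2.to(torch.complex128)
).to(torch.complex64)

# Widened tolerances keep the check meaningful while tolerating the
# compiler-dependent rounding described above.
torch.testing.assert_close(actual, expected, rtol=1e-5, atol=1e-4)
```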