[CI] Disable some tests that are failing in periodic #150059
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/150059
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure, 76 Pending
As of commit b069100 with merge base 85e4e51:
NEW FAILURE - The following job has failed:
UNSTABLE - The following jobs are marked as unstable, possibly due to flakiness on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
4095f65 to aa32624
ZainRizvi left a comment
Probably best to share the full list of failures on treehuggers as well (in addition to merging this PR). Someone might decide they actually care about fixing some of these tests.
    if os.getenv("ATEN_CPU_CAPABILITY") in ("default", "avx2"):
        # This test is not supported on ARM
        print("Skipping due to failing when cuda build runs on non cuda machine, see #150059 for example")
        sys.exit()
Have you tried using pytest.skip(allow_module_level=True) (docs)? That might let you properly skip all tests (and have them be marked as skipped) instead of this silent skip.
I just tried this on a dummy test file. Unfortunately, it just says 1 skipped for the entire file, and I can't seem to get it to print the skip reason either.
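One standard-library alternative to the silent sys.exit() is sketched below: decorating the test class with unittest.skipIf records every test as skipped, each with its reason. This is only an illustration, not what the PR does. The names cpu_capability_requires_skip, make_suite, run, ExampleTests and the reason text are hypothetical, and the environment is passed in as a dict so the sketch can be run without touching ATEN_CPU_CAPABILITY.

```python
import os
import unittest

# Hypothetical reason string; pointing at the PR keeps the skip traceable.
SKIP_REASON = (
    "Fails when a cuda build runs on a non cuda machine, "
    "see hud.pytorch.org/pr/150059"
)


def cpu_capability_requires_skip(env=None):
    """Same condition as the snippet under review, with an injectable env."""
    env = os.environ if env is None else env
    return env.get("ATEN_CPU_CAPABILITY") in ("default", "avx2")


def make_suite(env=None):
    """Build a small suite whose test class is skipped when the condition holds."""

    @unittest.skipIf(cpu_capability_requires_skip(env), SKIP_REASON)
    class ExampleTests(unittest.TestCase):
        def test_add(self):
            self.assertEqual(1 + 1, 2)

        def test_mul(self):
            self.assertEqual(2 * 3, 6)

    return unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)


def run(env):
    """Run the suite quietly and return the unittest result object."""
    with open(os.devnull, "w") as sink:
        runner = unittest.TextTestRunner(stream=sink)
        return runner.run(make_suite(env))


if __name__ == "__main__":
    result = run({"ATEN_CPU_CAPABILITY": "avx2"})
    # Each skipped test is reported individually, together with its reason.
    for test, reason in result.skipped:
        print(f"SKIPPED {test.id()}: {reason}")
```

When the same file is run under pytest, passing -rs adds the skip reasons to the short test summary.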
    @dtypes(torch.float)
    @unittest.skipIf(
        TEST_CUDA_MEM_LEAK_CHECK, "Leaking memory, see #150059 for example"
Could you include the full link here for the benefit of future devs?
atalman left a comment
lgtm. Thank you very much for this
@pytorchbot merge -f "previous run passed, most recent commit mostly just comment changes + 1 more test"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes).
@pytorchbot cherry-pick --onto release/2.7 --c regression
Cherry picking #150059
Disabling some tests to restore periodic
nogpu avx512 timeout:
https://hud.pytorch.org/pytorch/pytorch/commit/59f14d19aea4091c65cca2417c509e3dbf60c0ed#38492953496-box
profiler failure:
https://hud.pytorch.org/pytorch/pytorch/commit/7ae0ce6360b6e4f944906502d20da24c04debee5#38461255009-box
test_accelerator failure:
https://hud.pytorch.org/pytorch/pytorch/commit/87bfd66c3c7061db6d36d8daa62f08f507f90e39#39476723746-box
origin: 146098
test_overrides failure:
https://hud.pytorch.org/pytorch/pytorch/commit/bf752c36da08871d76a66fd52ad09f87e66fc770#39484562957-box
origin: 146098
inductor cpu repro:
https://hud.pytorch.org/pytorch/pytorch/commit/bb9c4260249ea0c57e87395eff5271fb479efb6a#38447525659-box
functorch eager transforms:
https://hud.pytorch.org/pytorch/pytorch/commit/8f858e226ba81fde41d39aa34f1fd4cb4a4ecc51#39488068620-box
https://hud.pytorch.org/pytorch/pytorch/commit/f2cea01f7195e59abd154b5551213ee3e38fa40d#39555064878
https://hud.pytorch.org/pytorch/pytorch/commit/b5281a4a1806c978e34c5cfa0befd298e469b7fd#39599355600
either 148288 or 148261?
https://hud.pytorch.org/hud/pytorch/pytorch/2ec9aceaeb77176c4bdeb2d008a34cba0cd57e3c/1?per_page=100&name_filter=periodic&mergeLF=true
Pull Request resolved: pytorch#150059
Approved by: https://github.com/ZainRizvi, https://github.com/atalman, https://github.com/malfet
…50059 (#150327)
* [CI] Disable some tests that are failing in periodic (#150059)
Disabling some tests to restore periodic
nogpu avx512 timeout:
https://hud.pytorch.org/pytorch/pytorch/commit/59f14d19aea4091c65cca2417c509e3dbf60c0ed#38492953496-box
profiler failure:
https://hud.pytorch.org/pytorch/pytorch/commit/7ae0ce6360b6e4f944906502d20da24c04debee5#38461255009-box
test_accelerator failure:
https://hud.pytorch.org/pytorch/pytorch/commit/87bfd66c3c7061db6d36d8daa62f08f507f90e39#39476723746-box
origin: 146098
test_overrides failure:
https://hud.pytorch.org/pytorch/pytorch/commit/bf752c36da08871d76a66fd52ad09f87e66fc770#39484562957-box
origin: 146098
inductor cpu repro:
https://hud.pytorch.org/pytorch/pytorch/commit/bb9c4260249ea0c57e87395eff5271fb479efb6a#38447525659-box
functorch eager transforms:
https://hud.pytorch.org/pytorch/pytorch/commit/8f858e226ba81fde41d39aa34f1fd4cb4a4ecc51#39488068620-box
https://hud.pytorch.org/pytorch/pytorch/commit/f2cea01f7195e59abd154b5551213ee3e38fa40d#39555064878
https://hud.pytorch.org/pytorch/pytorch/commit/b5281a4a1806c978e34c5cfa0befd298e469b7fd#39599355600
either 148288 or 148261?
https://hud.pytorch.org/hud/pytorch/pytorch/2ec9aceaeb77176c4bdeb2d008a34cba0cd57e3c/1?per_page=100&name_filter=periodic&mergeLF=true
Pull Request resolved: #150059
Approved by: https://github.com/ZainRizvi, https://github.com/atalman, https://github.com/malfet
* disable_CompiledOptimizerParityTests
* Update test/inductor/test_compiled_optimizers.py
---------
Co-authored-by: Catherine Lee <[email protected]>
Co-authored-by: Nikita Shulga <[email protected]>
test_torch.py::TestTorchDeviceTypeCUDA::test_storage_use_count_cuda:
Added in #150059
Fails in debug mode
[GH job link](https://github.com/pytorch/pytorch/actions/runs/15856606665/job/44706020831) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/4491326fb0c0e67eca1598ae33c41cdfced2cd33)
inductor/test_inductor_freezing.py::FreezingGpuTests::test_cpp_wrapper_cuda:
[GH job link](https://github.com/pytorch/pytorch/actions/runs/15856606665/job/44707119967) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/4491326fb0c0e67eca1598ae33c41cdfced2cd33)
started failing after moving to new cuda version #155234
I'll ping people if this gets merged
Pull Request resolved: #156731
Approved by: https://github.com/huydhn
[ez] Disable some failing periodic tests (#156731)
test_torch.py::TestTorchDeviceTypeCUDA::test_storage_use_count_cuda:
Added in #150059
Fails in debug mode
[GH job link](https://github.com/pytorch/pytorch/actions/runs/15856606665/job/44706020831) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/4491326fb0c0e67eca1598ae33c41cdfced2cd33)
inductor/test_inductor_freezing.py::FreezingGpuTests::test_cpp_wrapper_cuda:
[GH job link](https://github.com/pytorch/pytorch/actions/runs/15856606665/job/44707119967) [HUD commit link](https://hud.pytorch.org/pytorch/pytorch/commit/4491326fb0c0e67eca1598ae33c41cdfced2cd33)
started failing after moving to new cuda version #155234
I'll ping people if this gets merged
Pull Request resolved: #156731
Approved by: https://github.com/huydhn
(cherry picked from commit 2ff3280)
Co-authored-by: Catherine Lee <[email protected]>
    Tensor = torch.Tensor

    if os.getenv("ATEN_CPU_CAPABILITY") in ("default", "avx2"):
        # This test is not supported on ARM
@clee2000 Just came across this: The comment seems wrongly pasted from somewhere else.
I assume this test is supposed to be skipped when no GPU is available. So wouldn't checking TEST_CUDA be better?
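The suggestion above could look roughly like the sketch below, which keys the skip on CUDA availability instead of ATEN_CPU_CAPABILITY. HAS_CUDA is a hypothetical stand-in for PyTorch's TEST_CUDA flag, computed here from torch.cuda.is_available() with a fallback when torch is not installed, so the file stays self-contained. Whether this condition matches the failure the PR works around is an assumption, and CudaOnlyTests and run_cuda_only_tests are illustrative names.

```python
import os
import unittest

try:
    import torch

    # Stand-in for TEST_CUDA: True only when a CUDA device is usable.
    HAS_CUDA = torch.cuda.is_available()
except ImportError:
    # Assumption for this sketch: without torch there is no CUDA to test.
    HAS_CUDA = False


@unittest.skipUnless(
    HAS_CUDA, "Requires a CUDA device, see hud.pytorch.org/pr/150059"
)
class CudaOnlyTests(unittest.TestCase):
    def test_device_visible(self):
        # Only reached when HAS_CUDA is True, so torch is importable here.
        self.assertGreaterEqual(torch.cuda.device_count(), 1)


def run_cuda_only_tests():
    """Run the class quietly and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(CudaOnlyTests)
    with open(os.devnull, "w") as sink:
        return unittest.TextTestRunner(stream=sink).run(suite)


if __name__ == "__main__":
    result = run_cuda_only_tests()
    for test, reason in result.skipped:
        print(f"SKIPPED {test.id()}: {reason}")
```

Compared with exiting the process, the decorator keeps the test visible in the report as skipped, with the reason attached.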
Disabling some tests to restore periodic
nogpu avx512 timeout:
https://hud.pytorch.org/pytorch/pytorch/commit/59f14d19aea4091c65cca2417c509e3dbf60c0ed#38492953496-box
profiler failure: https://hud.pytorch.org/pytorch/pytorch/commit/7ae0ce6360b6e4f944906502d20da24c04debee5#38461255009-box
test_accelerator failure:
https://hud.pytorch.org/pytorch/pytorch/commit/87bfd66c3c7061db6d36d8daa62f08f507f90e39#39476723746-box
origin: 146098
test_overrides failure:
https://hud.pytorch.org/pytorch/pytorch/commit/bf752c36da08871d76a66fd52ad09f87e66fc770#39484562957-box
origin: 146098
inductor cpu repro:
https://hud.pytorch.org/pytorch/pytorch/commit/bb9c4260249ea0c57e87395eff5271fb479efb6a#38447525659-box
functorch eager transforms:
https://hud.pytorch.org/pytorch/pytorch/commit/8f858e226ba81fde41d39aa34f1fd4cb4a4ecc51#39488068620-box
https://hud.pytorch.org/pytorch/pytorch/commit/f2cea01f7195e59abd154b5551213ee3e38fa40d#39555064878
https://hud.pytorch.org/pytorch/pytorch/commit/b5281a4a1806c978e34c5cfa0befd298e469b7fd#39599355600
either 148288 or 148261?
https://hud.pytorch.org/hud/pytorch/pytorch/2ec9aceaeb77176c4bdeb2d008a34cba0cd57e3c/1?per_page=100&name_filter=periodic&mergeLF=true
cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @chenyang78 @kadeng @muchulee8 @amjames @chauhang @aakhundov