[ROCm] Skip pointwise associative scan tests due to regression #135995
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/135995
Note: Links to docs will display an error until the docs builds have been completed.
✅ You can merge normally! (1 Unrelated Failure)
As of commit fa7b212 with merge base 7ed0563:
FLAKY - The following job failed but was likely due to flakiness present on trunk:
This comment was automatically generated by Dr. CI and updates every 15 minutes.
@malfet Another PR to skip unit tests and get the ROCm CI signal to green while we investigate a fix in parallel.
@pytorchbot merge -f "Skipping these tests gets rocm workflow signal to green. Discussions ongoing on proper fix in parallel"
Merge started
Your change will be merged immediately since you used the force (-f) flag, bypassing any CI checks (ETA: 1-5 minutes). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team
…ch#135995)

pytorch#133012 caused a regression on ROCm causing pointwise scan tests to fail

```
ERROR: test_pointwise_associative_scan_tuple_reverse_True_combine_mode_pointwise_cuda
ERROR: test_pointwise_associative_scan_tuple_reverse_False_combine_mode_pointwise_cuda
ERROR: test_pointwise_associative_scan_complex_pytree_reverse_True_combine_mode_pointwise_cuda
ERROR: test_pointwise_associative_scan_complex_pytree_reverse_False_combine_mode_pointwise_cuda
ERROR: test_pointwise_associative_scan_binary_operator_reverse_True_combine_mode_pointwise_cuda
ERROR: test_pointwise_associative_scan_binary_operator_reverse_False_combine_mode_pointwise_cuda
```

Skipping temporarily while triage is underway.

Full log: https://ossci-raw-job-status.s3.amazonaws.com/log/30067645445

```
  File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/_inductor/graph.py", line 1020, in call_function
    out = lowerings[target](*args, **kwargs)  # type: ignore[index]
  File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/_inductor/lowering.py", line 363, in wrapped
    out = decomp_fn(*args, **kwargs)
  File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/_inductor/lowering.py", line 6245, in associative_scan
    raise RuntimeError("Unable to generate code for associative_scan op")
torch._inductor.exc.LoweringException: RuntimeError: Unable to generate code for associative_scan op
```

NOTE: even "eager" backend fails

```
  File "/opt/conda/envs/py_3.8/lib/python3.8/site-packages/torch/_higher_order_ops/associative_scan.py", line 338, in associative_scan_op_dense
    raise NotImplementedError("associative_scan is not implemented for eager")
NotImplementedError: associative_scan is not implemented for eager
```

Pull Request resolved: pytorch#135995
Approved by: https://github.com/malfet
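For readers unfamiliar with the op under test, here is a minimal pure-Python sketch of what an inclusive associative scan computes, including the `reverse` flag that appears in the failing test names. It is not PyTorch's implementation: the function name `reference_associative_scan` is hypothetical, and the handling of `reverse` (flip the input, scan, flip the result) is an assumed convention chosen for illustration.

```python
from operator import add


def reference_associative_scan(combine_fn, xs, reverse=False):
    """Inclusive scan of xs under an associative binary combine_fn.

    Hypothetical reference helper for illustration only; it runs
    sequentially and makes no claim about how PyTorch lowers the op.
    """
    # Assumed convention: a reverse scan flips the input, scans, and flips back.
    items = list(reversed(xs)) if reverse else list(xs)
    out = []
    for x in items:
        # The first element passes through; later ones combine with the running value.
        out.append(x if not out else combine_fn(out[-1], x))
    return list(reversed(out)) if reverse else out


forward = reference_associative_scan(add, [1, 2, 3, 4])
backward = reference_associative_scan(add, [1, 2, 3, 4], reverse=True)
```

With addition as the combine function, the forward call yields running sums from the left and the reverse call yields running sums from the right.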
@pytorchbot cherry-pick --onto release/2.5 -c critical
Cherry picking #135995
Command Details for Dev Infra team
Raised by workflow job
The cherry-pick PR is at #136557 |
* [ROCm] skip test_fp8_cast_and_t on non-MI300 machines (#135917)

  Fixes #ISSUE_NUMBER

  Pull Request resolved: #135917
  Approved by: https://github.com/malfet

  (cherry picked from commit 6cdc70b)

* Skip pointwise associative scan tests due to regression (changes based on PR #135995)

* Cherry-pick fix from #135702

---------

Co-authored-by: Prachi Gupta <[email protected]>
Co-authored-by: Jithun Nair <[email protected]>
#133012 caused a regression on ROCm that makes the pointwise associative scan tests fail.
Skipping temporarily while triage is underway.
Full log: https://ossci-raw-job-status.s3.amazonaws.com/log/30067645445
NOTE: even the "eager" backend fails.
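The kind of temporary, platform-conditional skip described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the change made in this PR: the `EXAMPLE_IS_ROCM` environment variable, the `PointwiseScanTests` class, and the placeholder test body are all hypothetical. In a real PyTorch checkout, one way to detect a ROCm build is to check whether `torch.version.hip` is not `None`.

```python
import os
import unittest

# Assumption: ROCm detection is stubbed with a hypothetical environment
# variable so the sketch runs without PyTorch installed.
IS_ROCM = os.environ.get("EXAMPLE_IS_ROCM", "0") == "1"


class PointwiseScanTests(unittest.TestCase):
    # Skip only on the affected platform and record why, so the skip is
    # easy to find and revert once the regression is fixed.
    @unittest.skipIf(IS_ROCM, "regression on ROCm, see #135995")
    def test_pointwise_scan_placeholder(self):
        # Placeholder body; the real tests exercise associative_scan.
        self.assertEqual(sum([1, 2, 3]), 6)


def run_suite():
    """Load and run the example test case, returning the unittest result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PointwiseScanTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A skipped test still counts as run and does not fail the suite, which is what lets the CI signal go green while the underlying regression is triaged.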
cc @jeffdaily @sunway513 @jithunnair-amd @pruthvistony @ROCmSupport @dllehr-amd @hongxiayang