[Intel GPU] Avoid atomic add for XPU device in scatter_add by deterministic mode #137966
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/137966
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 1b26d16 with merge base 565a794.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
May I know what's the impact? Does it lead to any case failure?
Please seek CI approval before scheduling CIFlow labels.
Please add test cases. |
Yes, the UT is covered in https://github.com/intel/torch-xpu-ops/blob/main/test/xpu/test_scatter_gather_ops_xpu.py.
Force-pushed 3d7a311 to fc26d80, then fc26d80 to 1b26d16.
Signed-off-by: Cheng Penghui <[email protected]>
@pytorchbot merge
Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
[Intel GPU] Avoid atomic add for XPU device in scatter_add by deterministic mode (pytorch#137966)

The "scatter_add" op in deterministic mode is not implemented for the XPU device; the UT reports that "scatter_add_kernel" does not have a deterministic implementation. Just like the CUDA implementation, we need to check _deterministic_algorithms in the scatter_add op for the XPU device. The UT is in https://github.com/intel/torch-xpu-ops/blob/main/test/xpu/test_scatter_gather_ops_xpu.py; we reused the [PyTorch UT code](https://github.com/pytorch/pytorch/blob/96b30dcb25c80513769dae2a8688aec080b00117/test/test_scatter_gather_ops.py#L233). The UT case is currently [skipped in the torch-xpu-ops test](https://github.com/intel/torch-xpu-ops/blob/4fa7921f1e9a0bf300d25da9b8758524f2751092/test/xpu/skip_list_common.py#L731) and will be enabled once this PR is merged.

Pull Request resolved: pytorch#137966
Approved by: https://github.com/EikanWang, https://github.com/guangyey, https://github.com/ezyang
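As a rough sketch of the dispatch pattern described above (a toy illustration in plain Python, not the actual PyTorch/ATen source — all names here are hypothetical): when deterministic algorithms are requested, the op routes to a fixed-order accumulation path instead of the parallel atomic-add kernel.

```python
# Toy sketch of a deterministic-mode dispatch for scatter_add along dim 0.
# Names (scatter_add, scatter_add_serial, scatter_add_atomic) are illustrative.

def scatter_add_serial(out, index, src):
    # Deterministic path: indices are processed in a fixed program order,
    # so repeated runs produce bitwise-identical results.
    for i, idx in enumerate(index):
        out[idx] += src[i]
    return out

def scatter_add_atomic(out, index, src):
    # Stand-in for the parallel atomic-add kernel; on a real device the
    # order in which concurrent atomic updates land is unspecified.
    for i, idx in enumerate(index):
        out[idx] += src[i]
    return out

def scatter_add(out, index, src, deterministic=False):
    # Mirrors checking the deterministic-algorithms flag, as the CUDA
    # path does, before choosing a kernel.
    if deterministic:
        return scatter_add_serial(out, index, src)
    return scatter_add_atomic(out, index, src)
```

In real PyTorch code the flag is toggled via `torch.use_deterministic_algorithms(True)`; this sketch only illustrates the branch the kernel takes, not the device implementation.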
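For context on why atomic adds must be avoided in deterministic mode (a generic illustration, independent of any device): floating-point addition is not associative, so the order in which concurrent atomic updates land can change the final sum from run to run.

```python
# Summing the same three values in two different orders gives two
# different floating-point results.
vals = [1e16, 1.0, -1e16]
left_to_right = (vals[0] + vals[1]) + vals[2]  # 1.0 is lost in 1e16
reordered = (vals[0] + vals[2]) + vals[1]      # 1.0 survives
print(left_to_right, reordered)  # 0.0 1.0
```

A fixed accumulation order, as in the deterministic path, makes the result reproducible at the cost of the parallelism that atomics provide.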