Skip to content

Conversation

@fengyuan14
Copy link
Collaborator

@fengyuan14 fengyuan14 commented Feb 29, 2024

As a follow-up to #114835 and #119682, we add limited ATen operators implementation for XPU. With this PR, the blocking issue for oneDNN operations and Inductor XPU backend will be resolved as the two components depend on these operations to support its basic features, respectively.

The added ATen operators include:

  • copy_, _to_copy, _copy_from_and_resize, , clone
  • view, view_as_real, view_as_complex,
  • as_strided, _reshape_alias, resize_, resize_as_,
  • add/add_, sub/sub_, mul/mul_, div/div_, abs,
  • empty, empty_strided,
  • fill_, zeros_.

@pytorch-bot
Copy link

pytorch-bot bot commented Feb 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/120891

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (5 Unrelated Failures)

As of commit 23aff01 with merge base c4486d3 (image):

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Copy link
Collaborator

@EikanWang EikanWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add test cases for XPU ops.

@EikanWang EikanWang added the ciflow/xpu Run XPU CI tasks label Feb 29, 2024
@fengyuan14 fengyuan14 marked this pull request as ready for review March 8, 2024 12:09
@fengyuan14 fengyuan14 requested a review from a team as a code owner March 8, 2024 12:09
@EikanWang EikanWang requested review from albanD, atalman and malfet March 10, 2024 01:01
Copy link
Contributor

@malfet malfet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not add additional submodules, but figure out some sort of a soft checkout mechanism

Also, this repository feels like a huge copy-n-paste of an already existing code (i.e. src/aten/sycl/Loops.h looks very similar to aten/src/ATen/cuda/Loops.cuh, which makes me wonder if some of this code can be better reused between the two somehow)

@fengyuan14
Copy link
Collaborator Author

fengyuan14 commented Mar 12, 2024

Also, this repository feels like a huge copy-n-paste of an already existing code (i.e. src/aten/sycl/Loops.h looks very similar to aten/src/ATen/cuda/Loops.cuh, which makes me wonder if some of this code can be better reused between the two somehow)

Hi, @malfet ,
Reasons of no sharing,

  1. Sharing code will impact other backends. At the first stage, we follow a conservative approach, won't share kernels.
  2. It takes non-trivial efforts to abstract device agnostic parts of kernels. Regarding performance, even we have same concurrency algorithm, HW specific divergence is inevitable, like different heuristic (shapes, broadcast, condition of vectorization, A32/A64 addressing ...), different register pressure (different number of General Programming Register. Small number of GRF causes register spill when using too many intermediate local variables in kernel), different native vectorization capability and so on. We have tired Abstract common helpers from CUDA implementation for code reusing in upcoming SYCL kernels #117234. The PR makes helpers used in Loops general. But it is not enough to help share all things in Loops among backends. It takes some time to make a kernel totally general.

Thanks.

@EikanWang
Copy link
Collaborator

EikanWang commented Mar 12, 2024

Let's not add additional submodules, but figure out some sort of a soft checkout mechanism

@malfet , thanks for your comments. May I know your concerns about the git submodule integration mechanism? Regarding the soft checkout mechanism, we might consider using CMake's ExternalProject or custom target as a solution. This would allow us to clone XPU Aten operations during the build process if USE_XPU is set to ON. Apart from the approaches, do you have any other suggestions or proposals?

@EikanWang EikanWang requested a review from malfet March 13, 2024 08:06
@drisspg drisspg added the triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module label Mar 13, 2024
@EikanWang EikanWang requested a review from albanD March 14, 2024 05:35
@fengyuan14 fengyuan14 requested a review from mruberry as a code owner March 15, 2024 12:01
Copy link
Collaborator

@EikanWang EikanWang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please update the test cases a little bit

@atalman
Copy link
Contributor

atalman commented Mar 18, 2024

@pytorchbot rebase

@pytorchmergebot
Copy link
Collaborator

@pytorchbot started a rebase job onto refs/remotes/origin/viable/strict. Check the current status here

@pytorchmergebot
Copy link
Collaborator

Successfully rebased fengyuan/out-of-tree-xpu-ops onto refs/remotes/origin/viable/strict, please pull locally before adding more changes (for example, via git checkout fengyuan/out-of-tree-xpu-ops && git pull --rebase)

@EikanWang EikanWang force-pushed the fengyuan/out-of-tree-xpu-ops branch from 3dea345 to 23aff01 Compare March 21, 2024 23:07
@EikanWang
Copy link
Collaborator

@pytorchbot merge

@pytorchmergebot
Copy link
Collaborator

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging
Check the merge workflow status
here

@EikanWang
Copy link
Collaborator

@pytorchbot merge -i

pytorch-bot bot pushed a commit that referenced this pull request Apr 22, 2024
As a follow-up to #114835 and #119682, we add limited ATen operators implementation for XPU. With this PR, the blocking issue for oneDNN operations and Inductor XPU backend will be resolved as the two components depend on these operations to support its basic features, respectively.

The added ATen operators include:

- `copy_`, `_to_copy`, `_copy_from_and_resize`, , `clone`
- `view`, `view_as_real`, `view_as_complex`,
- `as_strided`, `_reshape_alias`, `resize_`, `resize_as_`,
- `add`/`add_`, `sub`/`sub_`, `mul`/`mul_`, `div`/`div_`, `abs`,
- `empty`, `empty_strided`,
- `fill_`, `zeros_`.

Co-authored-by: Wang, Eikan <[email protected]>
Pull Request resolved: #120891
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/gujinghui, https://github.com/atalman
pytorch-bot bot pushed a commit that referenced this pull request Apr 22, 2024
)"

This reverts commit 148a8de.

Reverted #120891 on behalf of https://github.com/huydhn due to Sorry for reverting your change but I need to revert it to resolve a conflict in trunk #121794 (comment).  Please help reland the change after ([comment](#120891 (comment)))
pytorch-bot bot pushed a commit that referenced this pull request Apr 22, 2024
As a follow-up to #114835 and #119682, we add limited ATen operators implementation for XPU. With this PR, the blocking issue for oneDNN operations and Inductor XPU backend will be resolved as the two components depend on these operations to support its basic features, respectively.

The added ATen operators include:

- `copy_`, `_to_copy`, `_copy_from_and_resize`, , `clone`
- `view`, `view_as_real`, `view_as_complex`,
- `as_strided`, `_reshape_alias`, `resize_`, `resize_as_`,
- `add`/`add_`, `sub`/`sub_`, `mul`/`mul_`, `div`/`div_`, `abs`,
- `empty`, `empty_strided`,
- `fill_`, `zeros_`.

Co-authored-by: Wang, Eikan <[email protected]>
Pull Request resolved: #120891
Approved by: https://github.com/EikanWang, https://github.com/jgong5, https://github.com/gujinghui, https://github.com/atalman
@github-actions github-actions bot deleted the fengyuan/out-of-tree-xpu-ops branch April 22, 2024 01:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ciflow/inductor ciflow/rocm Trigger "default" config CI on ROCm ciflow/trunk Trigger trunk jobs on your pull request ciflow/xpu Run XPU CI tasks Merged open source Reverted topic: not user facing topic category triaged This issue has been looked at a team member, and triaged and prioritized into an appropriate module

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.