
Conversation


@etaf etaf commented Sep 26, 2024

Stack from ghstack (oldest at bottom):

[AOTI] Introduce an extensibility mechanism for the C shim codegen to make it easy to produce C shims for out-of-tree op kernels as well. Add a C shim for XPU.

Motivation

The current C shim codegen only produces C wrappers for ops registered in aten/src/ATen/native/native_functions.yaml. For a given backend, however, some out-of-tree ops may instead be registered externally, for example in third_party/torch-xpu-ops/yaml/native_functions.yaml. In that case, the existing codegen cannot extend the already-produced in-tree C shims with entries for those out-of-tree ops.

Design

To extend a backend's C shim with additional out-of-tree ops, this PR adds a boolean option, --aoti-extend, to indicate that the codegen is extending the C shim from out-of-tree.
The generated C shim is stored in the extend subdirectory, for example:

torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp

Example usage:
python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim
--xpu: generate the C shim for XPU.
--aoti-extend: the ops being generated are out-of-tree ops (defined in third_party/torch-xpu-ops/yaml/native_functions.yaml) that extend the in-tree ops (defined in aten/src/ATen/native/native_functions.yaml).
--update-aoti-c-shim: always generate c_shim_xpu.h for the extended C shim.
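To illustrate the layout, here is a small, hypothetical Python sketch (not the actual torchgen code; `GENERATED_ROOT` and `shim_files` are illustrative names) of how --aoti-extend redirects the generated files into the extend/ subdirectory:

```python
from pathlib import Path

# Root of the generated shim tree, taken from the paths listed above.
GENERATED_ROOT = Path("torch/csrc/inductor/aoti_torch/generated")

def shim_files(backend: str, aoti_extend: bool) -> list[str]:
    # In-tree shims live directly under generated/; out-of-tree
    # extensions (--aoti-extend) are written to the extend/ subdirectory.
    out_dir = GENERATED_ROOT / "extend" if aoti_extend else GENERATED_ROOT
    return [str(out_dir / f"c_shim_{backend}.h"),
            str(out_dir / f"c_shim_{backend}.cpp")]

print(shim_files("xpu", aoti_extend=True))
```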

cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov

[ghstack-poisoned]

pytorch-bot bot commented Sep 26, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136742

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 78b994e with merge base 2ede4c9:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

etaf added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 6e0667a
Pull Request resolved: #136742
@etaf etaf marked this pull request as draft September 26, 2024 07:26
etaf added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 76a13be
Pull Request resolved: #136742
etaf added a commit that referenced this pull request Sep 26, 2024
ghstack-source-id: 4bf1ea9
Pull Request resolved: #136742
etaf added a commit that referenced this pull request Sep 27, 2024
ghstack-source-id: 935caca
Pull Request resolved: #136742
@etaf etaf added the "topic: not user facing" label Sep 27, 2024
etaf added a commit that referenced this pull request Sep 27, 2024
ghstack-source-id: 48dd63b
Pull Request resolved: #136742
etaf added a commit that referenced this pull request Sep 28, 2024
ghstack-source-id: 900e9a0
Pull Request resolved: #136742
etaf added a commit that referenced this pull request Oct 1, 2024
ghstack-source-id: 8eb7278
Pull Request resolved: #136742
@etaf etaf added the "ciflow/xpu" (Run XPU CI tasks) label Oct 8, 2024
etaf added 2 commits October 8, 2024 13:40
etaf added a commit that referenced this pull request Oct 9, 2024
ghstack-source-id: e10ddd4
Pull Request resolved: #136742
etaf added a commit that referenced this pull request Oct 9, 2024
ghstack-source-id: 2dc831f
Pull Request resolved: #136742
func_group_mapping: dict[OperatorName, NativeFunctionsGroup],
dispatch_key: DispatchKey,
backend_indices: dict[DispatchKey, BackendIndex],
aoti_extend: bool,
Refine the variable name a little bit.

torchgen/gen.py Outdated
parser.add_argument(
"--aoti-extend",
action="store_true",
help="Update AOTInductor C shim after adding an entry to inductor_fallback_ops in torchgen/aoti/fallback_ops.py. "
Update the help message.
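For context, here is a minimal, self-contained sketch of how the three flags could be declared with argparse (the help strings are illustrative, not the final wording under review):

```python
import argparse

parser = argparse.ArgumentParser(prog="torchgen.gen")
parser.add_argument("--xpu", action="store_true",
                    help="generate the C shim for the XPU backend")
parser.add_argument("--aoti-extend", action="store_true",
                    help="extend an existing in-tree C shim with out-of-tree ops; "
                         "output goes to the extend/ subdirectory")
parser.add_argument("--update-aoti-c-shim", action="store_true",
                    help="(re)generate the C shim header")

# argparse maps dashes to underscores in the resulting namespace.
args = parser.parse_args(["--xpu", "--aoti-extend", "--update-aoti-c-shim"])
print(args.xpu, args.aoti_extend, args.update_aoti_c_shim)
```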

else:
# for the extended out-of-tree kernels, we don't need to
# create duplicate C shim wrappers for other dispatch keys
if aoti_extend:
Suggested change
if aoti_extend:
if aoti_extend and dispatch_key == "XPU":


The flag aoti_extend, now renamed to extend_aoti_c_shim, is by design used for extending the C shim of an existing in-tree/partially-in-tree backend, e.g. XPU. Fully out-of-tree backends can generate their C shims normally without this flag, so we don't need to specialize this as if aoti_extend and dispatch_key == "XPU".
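A hypothetical sketch of the guard's intent (identifiers are illustrative, not the actual torchgen code): in extend mode, wrappers are emitted only for the backend being extended, whichever backend that is, so no hard-coded "XPU" check is needed:

```python
def should_emit_shim(dispatch_key: str, extended_backend: str,
                     extend_aoti_c_shim: bool) -> bool:
    # Normal in-tree codegen: every requested dispatch key gets a shim.
    if not extend_aoti_c_shim:
        return True
    # Extend mode: only the backend being extended gets new wrappers;
    # other dispatch keys already have in-tree shims and are skipped.
    return dispatch_key == extended_backend

print(should_emit_shim("XPU", "XPU", extend_aoti_c_shim=True))
```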

etaf added 7 commits October 30, 2024 03:43

jansel commented Nov 5, 2024

@desertfire should handle this review

@jansel jansel removed their request for review November 5, 2024 19:22
)

set(GENERATED_CXX_TORCH_XPU
"${TORCH_SRC_DIR}/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp"
Missing extend?

NVM: after reading a few more lines, I realized you actually have two parts of c_shim_xpu.

@etaf etaf Nov 7, 2024

@desertfire yes, as I described in the PR description above, there are

torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp

The extend part is generated from the out-of-tree torch-xpu-ops, which calls python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim --aoti-install-dir=torch/include/torch/csrc/inductor/aoti_torch/generated/extend/, and the generated extend/c_shim_xpu.cpp is built into libtorch_xpu_ops.a. So extend/c_shim_xpu.cpp is added in the CMake files of torch-xpu-ops, not in the PyTorch CMake files.

etaf added 4 commits November 6, 2024 22:48

etaf commented Nov 9, 2024

@pytorchbot merge

@pytorchmergebot

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status.

atalman pushed a commit to atalman/pytorch that referenced this pull request Nov 11, 2024
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742)

Pull Request resolved: pytorch#136742
Approved by: https://github.com/EikanWang, https://github.com/desertfire
ghstack dependencies: pytorch#139025
Ryo-not-rio pushed a commit to Ryo-not-rio/pytorch that referenced this pull request Dec 2, 2024
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742)

pobin6 pushed a commit to pobin6/pytorch that referenced this pull request Dec 5, 2024
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742)

@github-actions github-actions bot deleted the gh/etaf/47/head branch December 10, 2024 02:12
fmo-mt pushed a commit to fmo-mt/pytorch that referenced this pull request Dec 11, 2024
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742)


Labels

ciflow/inductor
ciflow/trunk (Trigger trunk jobs on your pull request)
ciflow/xpu (Run XPU CI tasks)
Merged
module: inductor
open source
suppress-bc-linter (Suppresses the failures of API backward-compatibility linter (Lint/bc_linter))
topic: not user facing (topic category)

Projects

Status: Done

Development

Successfully merging this pull request may close these issues.

8 participants