-
Notifications
You must be signed in to change notification settings - Fork 26.3k
[AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. #136742
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/136742
Note: Links to docs will display an error until the docs builds have been completed. ✅ No FailuresAs of commit 78b994e with merge base 2ede4c9 ( This comment was automatically generated by Dr. CI and updates every 15 minutes. |
torchgen/gen_aoti_c_shim.py
Outdated
| func_group_mapping: dict[OperatorName, NativeFunctionsGroup], | ||
| dispatch_key: DispatchKey, | ||
| backend_indices: dict[DispatchKey, BackendIndex], | ||
| aoti_extend: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refine the variable name a little bit.
torchgen/gen.py
Outdated
| parser.add_argument( | ||
| "--aoti-extend", | ||
| action="store_true", | ||
| help="Update AOTInductor C shim after adding an entry to inductor_fallback_ops in torchgen/aoti/fallback_ops.py. " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the help message.
torchgen/gen_aoti_c_shim.py
Outdated
| else: | ||
| # for the extend out-of-tree kernels, we don't need to | ||
| # duplicatly create C shim wrappers for other dispatch keys | ||
| if aoti_extend: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if aoti_extend: | |
| if aoti_extend and diapatch_key == "XPU": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flag aoti_extend, which now is refined as extend_aoti_c_shim, is by design used for extending the c shim of an existing in-tree/partially-in-tree backend, eg, XPU, those totally out-of-tree backend can generate their c shim normally without this flag. So here we don't need to specify if aoti_extend and diapatch_key == "XPU"
|
@desertfire should handle this review |
| ) | ||
|
|
||
| set(GENERATED_CXX_TORCH_XPU | ||
| "${TORCH_SRC_DIR}/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing extend?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NVM: after reading a few more lines, I realized you actually have two parts of c_shim_xpu,
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@desertfire yes, as I descripted in the PR description above, there are
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h
torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp
The extend part is generated from out-of-tree torch-xpu-ops, which call python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim --aoti-insall-dir=torch/include/torch/csrc/inductor/aoti_torch/generated/extend/, and the generated extend/c_shim_xpu.cpp is built into libtorch_xpu_ops.a. So extend/c_shim_xpu.cpp is added in cmake file of torch-xpu-ops, not in pytorch cmake file.
|
@pytorchbot merge |
Merge startedYour change will be merged once all checks pass (ETA 0-4 Hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team |
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742) [AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU. ### Motivation Since the current c shim codegen will only produce C wrappers for Op's registered in `aten/src/ATen/native/native_functions.yaml`, for the same backend, when a portion of out-of-tree OP's are not registered in that file, but are registered externally. For example, `third_party/torch-xpu-ops/yaml/native_functions.yaml` , in this case, the existing codegen can't fulfill the need to do extensions for the c shims from the out-of-tree OPs for the in-tree that has already been produced. ### Design To extend the c shim with more OP for a backend from out-of-tree. The PR provided a bool option `--aoti-extend` to indicate the codegen is to extend c shim from out-of-tree. The generated c shim is stored in the `extend` subdirectory , for example: ``` torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp ``` example usage: `python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim ` `--xpu`: generate c shim for XPU `--aoti-extend `: this is an out-of-tree OPs(defined in `third_party/torch-xpu-ops/yaml/native_functions.yaml`) extend for in-tree ops(defined in `aten/src/ATen/native/native_functions.yaml`) `--update-aoti-c-shim`: always generate c_shim_xpu.h for the extend c_shim. Pull Request resolved: pytorch#136742 Approved by: https://github.com/EikanWang, https://github.com/desertfire ghstack dependencies: pytorch#139025
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742) [AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU. ### Motivation Since the current c shim codegen will only produce C wrappers for Op's registered in `aten/src/ATen/native/native_functions.yaml`, for the same backend, when a portion of out-of-tree OP's are not registered in that file, but are registered externally. For example, `third_party/torch-xpu-ops/yaml/native_functions.yaml` , in this case, the existing codegen can't fulfill the need to do extensions for the c shims from the out-of-tree OPs for the in-tree that has already been produced. ### Design To extend the c shim with more OP for a backend from out-of-tree. The PR provided a bool option `--aoti-extend` to indicate the codegen is to extend c shim from out-of-tree. The generated c shim is stored in the `extend` subdirectory , for example: ``` torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp ``` example usage: `python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim ` `--xpu`: generate c shim for XPU `--aoti-extend `: this is an out-of-tree OPs(defined in `third_party/torch-xpu-ops/yaml/native_functions.yaml`) extend for in-tree ops(defined in `aten/src/ATen/native/native_functions.yaml`) `--update-aoti-c-shim`: always generate c_shim_xpu.h for the extend c_shim. Pull Request resolved: pytorch#136742 Approved by: https://github.com/EikanWang, https://github.com/desertfire ghstack dependencies: pytorch#139025
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742) [AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU. ### Motivation Since the current c shim codegen will only produce C wrappers for Op's registered in `aten/src/ATen/native/native_functions.yaml`, for the same backend, when a portion of out-of-tree OP's are not registered in that file, but are registered externally. For example, `third_party/torch-xpu-ops/yaml/native_functions.yaml` , in this case, the existing codegen can't fulfill the need to do extensions for the c shims from the out-of-tree OPs for the in-tree that has already been produced. ### Design To extend the c shim with more OP for a backend from out-of-tree. The PR provided a bool option `--aoti-extend` to indicate the codegen is to extend c shim from out-of-tree. The generated c shim is stored in the `extend` subdirectory , for example: ``` torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp ``` example usage: `python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim ` `--xpu`: generate c shim for XPU `--aoti-extend `: this is an out-of-tree OPs(defined in `third_party/torch-xpu-ops/yaml/native_functions.yaml`) extend for in-tree ops(defined in `aten/src/ATen/native/native_functions.yaml`) `--update-aoti-c-shim`: always generate c_shim_xpu.h for the extend c_shim. Pull Request resolved: pytorch#136742 Approved by: https://github.com/EikanWang, https://github.com/desertfire ghstack dependencies: pytorch#139025
… make it easy to produce c shims for out-of-tree OP kernels as well. Add c_shim for XPU. (pytorch#136742) [AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU. ### Motivation Since the current c shim codegen will only produce C wrappers for Op's registered in `aten/src/ATen/native/native_functions.yaml`, for the same backend, when a portion of out-of-tree OP's are not registered in that file, but are registered externally. For example, `third_party/torch-xpu-ops/yaml/native_functions.yaml` , in this case, the existing codegen can't fulfill the need to do extensions for the c shims from the out-of-tree OPs for the in-tree that has already been produced. ### Design To extend the c shim with more OP for a backend from out-of-tree. The PR provided a bool option `--aoti-extend` to indicate the codegen is to extend c shim from out-of-tree. The generated c shim is stored in the `extend` subdirectory , for example: ``` torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/c_shim_xpu.cpp torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.h torch/include/torch/csrc/inductor/aoti_torch/generated/extend/c_shim_xpu.cpp ``` example usage: `python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim ` `--xpu`: generate c shim for XPU `--aoti-extend `: this is an out-of-tree OPs(defined in `third_party/torch-xpu-ops/yaml/native_functions.yaml`) extend for in-tree ops(defined in `aten/src/ATen/native/native_functions.yaml`) `--update-aoti-c-shim`: always generate c_shim_xpu.h for the extend c_shim. Pull Request resolved: pytorch#136742 Approved by: https://github.com/EikanWang, https://github.com/desertfire ghstack dependencies: pytorch#139025
Stack from ghstack (oldest at bottom):
[AOTI] Introduce an extensibility mechanism for the c shim codegen to make it easy to produce c shims for out-of-tree OP kernels as well. Add c shim for XPU.
Motivation
Since the current c shim codegen will only produce C wrappers for Op's registered in
aten/src/ATen/native/native_functions.yaml, for the same backend, when a portion of out-of-tree OP's are not registered in that file, but are registered externally. For example,third_party/torch-xpu-ops/yaml/native_functions.yaml, in this case, the existing codegen can't fulfill the need to do extensions for the c shims from the out-of-tree OPs for the in-tree that has already been produced.Design
To extend the c shim with more OP for a backend from out-of-tree.
The PR provided a bool option
--aoti-extendto indicate the codegen is to extend c shim from out-of-tree.The generated c shim is stored in the
extendsubdirectory , for example:example usage:
python -m torchgen.gen --source-path third_party/torch-xpu-ops/yaml/ --xpu --aoti-extend --update-aoti-c-shim--xpu: generate c shim for XPU--aoti-extend: this is an out-of-tree OPs(defined inthird_party/torch-xpu-ops/yaml/native_functions.yaml) extend for in-tree ops(defined inaten/src/ATen/native/native_functions.yaml)--update-aoti-c-shim: always generate c_shim_xpu.h for the extend c_shim.cc @voznesenskym @penguinwu @EikanWang @jgong5 @Guobing-Chen @XiaobingSuper @zhuhaozhe @blzheng @wenzhe-nrv @jiayisunx @ipiszy @yf225 @chenyang78 @kadeng @muchulee8 @ColinPeppler @amjames @desertfire @chauhang @aakhundov