Conversation

@Yongqi-Zhuo
Contributor

There are several problems with the current PyTorch CodeGen.
First, the emitted .pytorch.h file incorrectly exposes functions with internal linkage. This is inconsistent with the C codegen and causes compile errors when the header file is actually compiled. I added a check for internal linkage to fix this.
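To illustrate the kind of check meant here, the following is a minimal sketch that skips functions with internal linkage when collecting the names to expose in a generated header. `DemoLinkage`, `DemoFunc`, and `demo_exposed_names` are hypothetical stand-ins invented for this example; they are not Halide's actual types or the code in this PR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for a lowered function and its linkage kind.
enum class DemoLinkage { External, Internal };

struct DemoFunc {
    std::string name;
    DemoLinkage linkage;
};

// Collect only the functions that should get a wrapper in the emitted header.
// Functions with internal linkage are skipped, since they are not visible
// outside the generated object file.
std::vector<std::string> demo_exposed_names(const std::vector<DemoFunc> &funcs) {
    std::vector<std::string> exposed;
    for (const DemoFunc &f : funcs) {
        if (f.linkage == DemoLinkage::Internal) {
            continue;  // do not emit a wrapper for this one
        }
        exposed.push_back(f.name);
    }
    return exposed;
}

// Small self-check: one external and one internal function go in,
// only the external one comes out.
void demo_linkage_self_check() {
    std::vector<DemoFunc> funcs = {
        {"pipeline", DemoLinkage::External},
        {"pipeline_helper", DemoLinkage::Internal},
    };
    std::vector<std::string> exposed = demo_exposed_names(funcs);
    assert(exposed.size() == 1);
    assert(exposed[0] == "pipeline");
}
```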
Next are the problems with CUDA. Currently, when the generated .pytorch.h file is used, the linker complains about an undefined reference to a symbol related to halide_cuda_device_interface (the linker reports the mangled name). This is because the forward declaration was not in an extern "C" block.
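A minimal sketch of why the extern "C" block matters, using made-up names (`demo_device_interface_t`, `demo_cuda_device_interface`) rather than the real Halide runtime declarations. A forward declaration inside extern "C" makes the compiler reference the plain, unmangled symbol name, which is what a runtime exporting C symbols provides; without it, a C++ compiler would look for a mangled name and the link would fail.

```cpp
#include <cassert>

// Hypothetical stand-in for an opaque runtime type (not Halide's definition).
struct demo_device_interface_t {
    int id;
};

// Forward declaration with C linkage: the referenced symbol is the plain
// name "demo_cuda_device_interface", not a C++-mangled one.
extern "C" {
const demo_device_interface_t *demo_cuda_device_interface();
}

// Stand-in for the runtime's definition. In a real build this would live in
// a separate library that exports the unmangled symbol.
extern "C" const demo_device_interface_t *demo_cuda_device_interface() {
    static const demo_device_interface_t iface = {42};
    return &iface;
}

// Caller that only relies on the forward declaration above.
int demo_interface_id() {
    return demo_cuda_device_interface()->id;
}

void demo_extern_c_self_check() {
    assert(demo_interface_id() == 42);
}
```

If the forward declaration were moved outside the extern "C" block while the definition kept C linkage, most compilers would reject the conflicting linkage within a single file; across separate files the mismatch surfaces as the undefined reference described above.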
Moreover, there is a subtle problem when the generated pipeline is invoked by PyTorch (which I have described in Matrix): the Halide pipeline runs in a different CUDA stream than the PyTorch kernels. I later discovered that HalidePyTorchCudaHelpers.h was not included, as it is in apps/HelloPyTorch/setup.py, so the weakly-linked CUDA handles were not overridden. I fixed this by simply including this header in the codegen.

@steven-johnson
Contributor

Thanks -- looks like we haven't been testing this code on buildbots recently, added #7448 in hopes someone will address this.

@steven-johnson steven-johnson self-requested a review March 27, 2023 17:00
@steven-johnson steven-johnson merged commit 7976d05 into halide:main Mar 27, 2023
@steven-johnson steven-johnson added the release_notes For changes that may warrant a note in README for official releases. label Mar 27, 2023
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
3 participants