Description
🐛 Bug
The problem occurs when building multiple extensions and passing the same dictionary object as extra_compile_args to each of them.
To Reproduce
This is easy to reproduce with https://github.com/pytorch/extension-cpp/tree/master/cuda
Change setup.py to add a second, fake extension and pass extra_compile_args to both extensions.
This version of setup.py does not work, but the second version further down does:
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

# The same dict object is passed to both extensions.
args = {'nvcc': [], 'cxx': []}
setup(
    name='lltm_cuda',
    ext_modules=[
        CUDAExtension('lltm_cuda', [
            'lltm_cuda.cpp',
            'lltm_cuda_kernel.cu',
        ], extra_compile_args=args),
        CUDAExtension('lltm_cuda2', [
            'lltm_cuda.cpp',
            'lltm_cuda_kernel.cu',
        ], extra_compile_args=args),
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

This compiles, but outputs the following warning:
<command-line>:0:0: warning: "TORCH_EXTENSION_NAME" redefined
<command-line>:0:0: note: this is the location of the previous definition
Importing the first module then fails with this error:
In [1]: import lltm_cuda
---------------------------------------------------------------------------
ImportError Traceback (most recent call last)
<ipython-input-1-15b747285754> in <module>
----> 1 import lltm_cuda
ImportError: dynamic module does not define module export function (PyInit_lltm_cuda)
Second version, which passes a deep copy of the arguments to the second extension:

import copy

args = {'nvcc': [], 'cxx': []}
setup(
    name='lltm_cuda',
    ext_modules=[
        CUDAExtension('lltm_cuda', [
            'lltm_cuda.cpp',
            'lltm_cuda_kernel.cu',
        ], extra_compile_args=args),
        CUDAExtension('lltm_cuda2', [
            'lltm_cuda.cpp',
            'lltm_cuda_kernel.cu',
        ], extra_compile_args=copy.deepcopy(args)),  # independent inner lists
    ],
    cmdclass={
        'build_ext': BuildExtension
    })

This works fine, and both modules can be imported.
Expected behavior
Both versions of the setup script should behave the same.
Additional context
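The problem comes from the fact that a shallow copy of the dictionary is not enough: the compile args are nested one level down, in two lists, one for C++ and one for CUDA. A shallow copy creates a new dictionary but still shares those inner lists, so a flag appended for one extension also shows up in the other, which is consistent with the TORCH_EXTENSION_NAME redefinition warning above.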
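The difference between the two kinds of copy can be shown without PyTorch at all. The following is a minimal sketch using only the standard library; the flag names `-DEXAMPLE_DEEP` and `-DEXAMPLE_SHALLOW` are made up for illustration and are not flags that BuildExtension adds.

```python
import copy

args = {'nvcc': [], 'cxx': []}

# Deep copy: a new dict AND new inner lists.
deep = copy.deepcopy(args)
deep['cxx'].append('-DEXAMPLE_DEEP')        # hypothetical flag

# Shallow copy: a new dict, but the inner lists are the same objects.
shallow = copy.copy(args)
shallow['cxx'].append('-DEXAMPLE_SHALLOW')  # hypothetical flag

print(shallow is args)                # False: the outer dicts differ
print(shallow['cxx'] is args['cxx'])  # True: the inner list is shared
print(args['cxx'])                    # ['-DEXAMPLE_SHALLOW']: the append leaked
print(deep['cxx'])                    # ['-DEXAMPLE_DEEP']: isolated from args
```

Only the append made through the shallow copy is visible in the original `args`, which is the same aliasing that lets one extension's flags reach the other.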
The recommended fix is either to change this line
https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py#L373
to use copy.deepcopy instead of copy.copy, or to change this line https://github.com/pytorch/pytorch/blob/master/torch/utils/cpp_extension.py#L376 to make a second copy.copy of the selected inner list.
The first solution is less verbose but might be less safe, since we can't control the depth of the copy. I personally lean toward the first solution, as I don't see how a simple extra_compile_args dictionary could lead to unintentionally copying too much.
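The two options can be compared side by side in a small sketch. The helper names `isolate_deep` and `isolate_two_level` are hypothetical and are not the code in cpp_extension.py; they only illustrate that both strategies protect a dict of flag lists from later mutation.

```python
import copy

def isolate_deep(extra_compile_args):
    """Option 1 (hypothetical helper): copy every nesting level in one call."""
    return copy.deepcopy(extra_compile_args)

def isolate_two_level(extra_compile_args):
    """Option 2 (hypothetical helper): copy the container, then each inner list."""
    if isinstance(extra_compile_args, dict):
        return {key: copy.copy(flags)
                for key, flags in extra_compile_args.items()}
    # A plain list of flags has only one level to copy.
    return copy.copy(extra_compile_args)

shared = {'nvcc': ['-O2'], 'cxx': []}

for isolate in (isolate_deep, isolate_two_level):
    private = isolate(shared)
    # Mutate the private copy the way a build step might append a define.
    private['cxx'].append('-DEXAMPLE_FLAG')   # hypothetical flag
    private['nvcc'].append('-DEXAMPLE_FLAG')
    # The caller's dict is untouched in both cases.
    assert shared == {'nvcc': ['-O2'], 'cxx': []}
```

For a dictionary of flat string lists the two behave identically; they would only differ if the lists themselves held mutable objects, which the two-level copy would still share.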