python: fix and refactor RPATHs for compiled extensions 2#21149
python: fix and refactor RPATHs for compiled extensions 2#21149scheibelp merged 3 commits intospack:developfrom
Conversation
| # the first ld in the PATH is the Spack wrapper) or somewhere in the | ||
| # middle surrounded with spaces: | ||
| new_ldshared = config_ldshared.replace(" {0} ".format(config_cc), | ||
| " {0} ".format(new_cc)) |
There was a problem hiding this comment.
Could something go wrong if you replace an instance of config_cc that is not surrounded by spaces?
This sets LDSHARED if
- LDSHARED from sysconfigdata begins with CC from sysconfigdata
- CC from sysconfigdata appears in LDSHARED from sysconfigdata
In both of these cases it sets it to be the Spack compiler wrapper (+ extra options from LDSHARED)
#16527 sets LDSHARED if
- LDSHARED from sysconfigdata starts with the underlying compiler chosen by Spack for the dependent
In this case it sets it to be the Spack compiler wrapper (+ extra options from LDSHARED)
A few notes/questions:
- These both seem about as complicated to me.
- In this PR, how will it avoid choosing the Spack compiler wrapper when building with an external Python that used a different compiler?
There was a problem hiding this comment.
@scheibelp I'll try to explain how this is supposed to work with examples. Hopefully, I'll find answers to your questions there.
There are two variables of interest (for now) in the sysconfigdata file: CC and LDSHARED. The first one is used for compilation, the second one for linking. Both variables can be overridden with the respective environmental variables. Also, LDSHARED normally starts with the value of CC (with a few exceptions): https://github.com/python/cpython/blob/3554fa4abecfb77ac5fcaa5ce8310eeca5683960/configure.ac#L2579-L2707 ($(CC) is expanded later when the sysconfigdata file is generated: https://github.com/python/cpython/blob/1459fed92c4bf6bb0d18a4c5637e8f1d8662bed5/Lib/distutils/sysconfig.py#L305). Here are several examples of what the variables can be set to:
- System installation:
build_time_vars = { 'CC': 'gcc -pthread', 'LDSHARED': 'gcc -pthread -shared'}
- Another system installation:
build_time_vars = { 'CC': 'x86_64-linux-gnu-gcc -pthread', 'LDSHARED': 'x86_64-linux-gnu-gcc -pthread -shared -Wl,-O1 ' '-Wl,-Bsymbolic-functions -Wl,-z,relro'}
- Spack installation (after filtering the compiler wrappers):
build_time_vars = { 'CC': '/sw/gcc/gcc-6.3.0/bin/gcc -pthread' 'LDSHARED': '/sw/gcc/gcc-6.3.0/bin/gcc -pthread -shared ' '-L/scratch/local1/spack/opt/spack/linux-debian9-haswell/gcc-6.3.0/gdbm-1.18.1-npx4dgrnybfnuukx3hqcnrvd27pocxgk/lib ' '-L/scratch/local1/spack/opt/spack/linux-debian9-haswell/gcc-6.3.0/expat-2.2.10-yin3ravxvv2cwrlxuhkt33msje27l4xx/lib ' '-L/scratch/local1/spack/opt/spack/linux-debian9-haswell/gcc-6.3.0/bzip2-1.0.8-jwsh3efuidvovhri5xhr6it352vcdeei/lib ' # and so on }
- Manual installation with
./configure CFLAGS=-pthread LDFLAGS=-pthread:build_time_vars = { 'CC': 'gcc', 'LDSHARED': 'gcc -shared -pthread'}
- Another possible set of values to make a point later:
build_time_vars = { 'CC': 'gcc', 'LDSHARED': 'some/prefix/ld_so_aix gcc -bI:/some/prefix/python.exp ' '-L/some/path/with/gcc/substring'}
One thing to notice is that CC often contains an additional flag -pthread. It is needed to build Python but I'm not sure whether the extensions really need it. The thing is that the flag is always dropped when we build an extension with Spack because the latter sets CC to the compiler wrapper, which overrides the value from the sysconfigdata file. We can discuss whether we want to fix this somehow but let's assume that we don't for now.
We need to know the value of CC that will be set by Spack when building an extension. One way to do it is to adjust this code to the context of setup_dependent_build_environment:
new_cc = join_path(spack.paths.build_env_path,
dependent_spec.package.compiler.link_paths['cc'])This is what the original value of CC from the sysconfigdata file will be replaced with. Now we need to replace the old value of CC with the new one in the LDSHARED variable. Unfortunatly, we can's simply replace any occurence of the old value, which you can see in the 5th example above: if we replace 'gcc' with 'whatever', the -L flag will be corrupted. Therefore we have to replace ' gcc ' with ' whatever '. It might also be the case that CC is not a substring of LDSHARED. In this case, we do not need to modify the latter and export it.
The good thing about this solution is that it doesn't matter whether we work with an external Python or not (we don't backup the sysconfigdata file when installing Python with Spack). We simply try to make sure that the values of CC and LDSHARED are consistent. Yes, there might be a compiler mismatch but that's the user's choice to run something like spack install -v py-pyyaml%intel ^python%gcc or spack install -v py-pyyaml%[email protected] ^python%[email protected]. We give the user the freedom to do that (unlike in #16527).
A few more things:
- We might need to take care of
CXXandLDCXXSHAREDtoo. Could someone give me an example of a small extension that is supposed to use those variables? - With this solution, we drop
-pthreadnot only when compiling (which we always did) but also when linking extensions. My tests didn't show any problems with that but maybe I'm missing something. Anyway, if we decide to solve this, I think we should also solve this for the compilation. - Just in case. It does not make much sense to test this on MacOS because of this.
There was a problem hiding this comment.
we can't simply replace any occurence of the old value, which you can see in the 5th example above: if we replace 'gcc' with 'whatever', the -L flag will be corrupted. Therefore we have to replace ' gcc ' with ' whatever '.
I see - thanks for the example!
there might be a compiler mismatch but that's the user's choice to run something like spack install -v py-pyyaml%intel ^python%gcc or spack install -v py-pyyaml%[email protected] ^python%[email protected]. We give the user the freedom to do that (unlike in #16527).
OK got it. I wanted to confirm since I thought that was promoted as a good thing to do in $16527. I think we can handle it either way though.
We might need to take care of CXX and LDCXXSHARED too. Could someone give me an example of a small extension that is supposed to use those variables?
I don't think it qualifies as small, but py-numpy contains .cxx files.
Just in case. It does not make much sense to test this on MacOS because of this.
In the code you pointed out, it appears that MacOS respects LDSHARED if it is set in the environment and does not enter if (sys.platform == 'darwin'... if LDSHARED is set, so I think the behavior on MacOS would match what happens on other systems for Spack.
|
(originally I mistakenly posted this in #16527) I tried building |
|
This should now work for C++ extensions too (tested with |
|
(EDIT) Do you agree with the following commit message (if so, then I think this is ready to merge): |
|
@scheibelp yes, I agree with the message |
|
Thanks! |
…ck#21149) Python extensions use CC and LDSHARED from the sysconfig module to build. When Spack installs Python, it replaces the Spack compiler wrappers in these values with the underlying compilers (since these wrappers are not useful outside of the context of running Spack). In order to use the Spack compiler wrappers when building Python extensions with Spack, Spack sets the LDSHARED environment variable when running `Python.setup_py` (which overrides sysconfig). However, many Python extensions use an alternative method to build (namely PythonPackage.setup_py), which meant that LDSHARED was not set (and RPATHs were not inserted for dependencies). This commit makes the following changes: * Sets LDSHARED in the environment: this applies to all commands executed during the build, rather than for a single command invocation * Updates the logic to set LDSHARED: this replaces the compiler executable in LDSHARED with the Spack compiler wrapper. This means that for some externally-built instances of Python, Spack will now switch to using the Spack wrappers when building extensions. The behavior is expected to be the same for Spack- built instances of Python. * Performs similar modifications for LDCXXSHARED (to ensure RPATHs are included for C++ codes)
I think that the solution in #16527 is rather complex and still does not cover all possible cases. This PR offers a more simple solution with the same practical value.
Closes #16527
Closes #20765