Binary caching bugfix: symlink relocation#10073
Conversation
scheibelp
left a comment
There was a problem hiding this comment.
I have some concerns:
- I think this should be careful to only update symlinks that are absolute
- I think there are potential issues with relocating the symlinks after unpacking
- IMO this is missing some testing
The first two choices are not wrong but I bring up a few spots where there may be complications that result (and feed the third point).
Let me know if these points are unclear or disagreeable.
2dba489 to
51f9f79
Compare
51f9f79 to
d91a8f7
Compare
scheibelp
left a comment
There was a problem hiding this comment.
I have a few additional concerns
| script.write(spec.prefix) | ||
|
|
||
| # Create an absolute symlink | ||
| linkname = os.path.join(spec.prefix, "link_to_dummy.txt") |
There was a problem hiding this comment.
I'm concerned this test isn't fully exercising the possible failures related to symlink relocation because the binary cache is installed in the same location where it was created from. Ideally the test (or a new test) would choose a new location to install to. Ideally the test would also confirm that the symlinks point to where you expect (right now it just does the binary package and install).
I think you may be able to avoid some of the setup that occurs here by just checking binary_distribution.build_tarball.
There was a problem hiding this comment.
Because the symlinks are changed into the workdir by install_tree, it's not possible to have the sort of hidden bug related to the final destination not changing that you're worried about here.
If any of the symlinks are still pointing to the workdir after one install, the subsequent create command will fail. This test found a bug when I added linking to it through exactly that failure mode.
There was a problem hiding this comment.
I think your approach is sufficient for catching a number of bugs. Depending on the implementation though (or how it changes with future edits), I think it's possible that there are cases where unpacking into the same directory structure would succeed but unpacking into a different directory structure will fail. The test ensures that regardless of the implementation, the end desired result (successful relocation into a different root) succeeds.
Some system paths may be legitimate unrelocatable absolute symlinks Users can ignore the warning for such (admittedly rare) cases
|
@scheibelp I've addressed all review comments. |
scheibelp
left a comment
There was a problem hiding this comment.
I have some requests concerning documentation. In one case though there are path manipulations handled as string manipulations which I think needs some additional care.
| @@ -127,7 +128,18 @@ def write_buildinfo_file(prefix, workdir, rel=False): | |||
| # Check if the file contains a string with the installroot. | |||
| # This cuts down on the number of files added to the list | |||
| # of files potentially needing relocation | |||
There was a problem hiding this comment.
The comment above needs an update -- you could work it into the branches of the if for more clarity
| relocate.make_binary_relative(cur_path_names, orig_path_names, | ||
| old_path, allow_root) | ||
| orig_path_names = list() | ||
| cur_path_names = list() |
There was a problem hiding this comment.
This is minor and bikesheddy but I really really prefer [] for this. I use list() when I want to create a list from a sequence but [] for new empty list.
There was a problem hiding this comment.
I have the same preference, but I kept it this way to mirror what we already have elsewhere in this function. LMK if you prefer I change it.
| from spack.relocate import needs_binary_relocation, needs_text_relocation | ||
| from spack.relocate import strings_contains_installroot | ||
| from spack.relocate import get_patchelf, relocate_text | ||
| from spack.relocate import get_patchelf, relocate_text, relocate_links |
There was a problem hiding this comment.
this is minor, but when there are this many imports, I would just import spack.relocate as rel or something like that, and prefix the usages in the code with rel.
There was a problem hiding this comment.
I generally prefer to import them directly, but only for flake line-length reasons. I can double check whether we'll run into line-length problems if I change this.
scheibelp
left a comment
There was a problem hiding this comment.
A few requests: one for documentation; I think symlinks are recorded for relocation even when they are relative-ized; and also I think usage of realpath may have some corner cases that don't need to be resolved here but could benefit from a TODO.
| dirs[:] = [d for d in dirs if d not in blacklist] | ||
| for filename in files: | ||
| path_name = os.path.join(root, filename) | ||
| if os.path.islink(path_name): |
There was a problem hiding this comment.
Should this only be done if rel == False? Otherwise when rel == True, all the symlinks will be relativized when the tarball is built (see make_package_relative in build_tarball) so they won't need to be relocated.
There was a problem hiding this comment.
This is how make_package_relative knows which links to make relative. The same thing is done with RPATHS that are made relative.
There was a problem hiding this comment.
Ah ok it was throwing me off that they were stored in the binary cache since the relative-izing was handled before creating it. However, there is a guard relocate_package which only uses the 'relocate_links' entry if rel == True. Even though it is entirely handled before creating the cache, it could still be worthwhile to keep 'relocate_links' around to indicate what was changed.
|
Regarding #10073 (comment), I think that is important but can be handled in a later PR. |
|
Thanks! |
From #9747:
#9747 caused other binary relocation bugs and was reverted for v0.12.0. This is an alternative solution that avoids the bugs from #9747.
Instead of copying files for absolute symlinks, we relocate the source path of the symlink. This involves computing the new path, removing the link, and creating a new link with the new path.
@scottwittenburg @gartung @scheibelp @tgamblin