Skip to content

Binary caching bugfix: symlink relocation#10073

Merged
scheibelp merged 17 commits intodevelopfrom
bugfix/symlink-relocation
Jan 11, 2019
Merged

Binary caching bugfix: symlink relocation#10073
scheibelp merged 17 commits intodevelopfrom
bugfix/symlink-relocation

Conversation

@becker33
Copy link
Copy Markdown
Member

From #9747:

Binary caches of packages with absolute symlinks had broken symlinks. From what I can tell, tar doesn't support any notion of matching source/destination roots when unpacking an archive with absolute symlinks. Therefore, this commit just makes a copy of any file that is a symlink while creating a binary cache of a package.

#9747 caused other binary relocation bugs and was reverted for v0.12.0. This is an alternative solution that avoids the bugs from #9747.

Instead of copying files for absolute symlinks, we relocate the source path of the symlink. This involves computing the new path, removing the link, and creating a new link with the new path.

@scottwittenburg @gartung @scheibelp @tgamblin

Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some concerns:

  • I think this should be careful to only update symlinks that are absolute
  • I think there are potential issues with relocating the symlinks after unpacking
  • IMO this is missing some testing

The first two choices are not wrong but I bring up a few spots where there may be complications that result (and feed the third point).

Let me know if these points are unclear or disagreeable.

@becker33 becker33 force-pushed the bugfix/symlink-relocation branch from 2dba489 to 51f9f79 Compare December 12, 2018 19:11
@becker33 becker33 force-pushed the bugfix/symlink-relocation branch from 51f9f79 to d91a8f7 Compare December 12, 2018 19:27
Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a few additional concerns

script.write(spec.prefix)

# Create an absolute symlink
linkname = os.path.join(spec.prefix, "link_to_dummy.txt")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm concerned this test isn't fully exercising the possible failures related to symlink relocation because the binary cache is installed in the same location where it was created from. Ideally the test (or a new test) would choose a new location to install to. Ideally the test would also confirm that the symlinks point to where you expect (right now it just does the binary package and install).

I think you may be able to avoid some of the setup that occurs here by just checking binary_distribution.build_tarball.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the symlinks are changed into the workdir by install_tree, it's not possible to have the sort of hidden bug related to the final destination not changing that you're worried about here.

If any of the symlinks are still pointing to the workdir after one install, the subsequent create command will fail. This test found a bug when I added linking to it through exactly that failure mode.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your approach is sufficient for catching a number of bugs. Depending on the implementation though (or how it changes with future edits), I think it's possible that there are cases where unpacking into the same directory structure would succeed but unpacking into a different directory structure will fail. The test ensures that regardless of the implementation, the end desired result (successful relocation into a different root) succeeds.

@gartung
Copy link
Copy Markdown
Member

gartung commented Jan 9, 2019

@becker33 @tgamblin Can this be merged? I touch the same files and added prefix to the make_package_relative arguments. I would like to resolve any merge conflicts.

@becker33
Copy link
Copy Markdown
Member Author

becker33 commented Jan 9, 2019

@scheibelp I've addressed all review comments.

@becker33 becker33 removed the request for review from tgamblin January 9, 2019 22:57
Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have some requests concerning documentation. In one case though there are path manipulations handled as string manipulations which I think needs some additional care.

Copy link
Copy Markdown
Member

@tgamblin tgamblin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @becker33: I had some minor comments but this PR LGTM!

@@ -127,7 +128,18 @@ def write_buildinfo_file(prefix, workdir, rel=False):
# Check if the file contains a string with the installroot.
# This cuts down on the number of files added to the list
# of files potentially needing relocation
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above needs an update -- you could work it into the branches of the if for more clarity

relocate.make_binary_relative(cur_path_names, orig_path_names,
old_path, allow_root)
orig_path_names = list()
cur_path_names = list()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is minor and bikesheddy but I really really prefer [] for this. I use list() when I want to create a list from a sequence but [] for new empty list.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same preference, but I kept it this way to mirror what we already have elsewhere in this function. LMK if you prefer I change it.

from spack.relocate import needs_binary_relocation, needs_text_relocation
from spack.relocate import strings_contains_installroot
from spack.relocate import get_patchelf, relocate_text
from spack.relocate import get_patchelf, relocate_text, relocate_links
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is minor, but when there are this many imports, I would just import spack.relocate as rel or something like that, and prefix the usages in the code with rel.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I generally prefer to import them directly, but only for flake line-length reasons. I can double check whether we'll run into line-length problems if I change this.

Copy link
Copy Markdown
Member

@scheibelp scheibelp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few requests: one for documentation; I think symlinks are recorded for relocation even when they are relative-ized; and also I think usage of realpath may have some corner cases that don't need to be resolved here but could benefit from a TODO.

dirs[:] = [d for d in dirs if d not in blacklist]
for filename in files:
path_name = os.path.join(root, filename)
if os.path.islink(path_name):
Copy link
Copy Markdown
Member

@scheibelp scheibelp Jan 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this only be done if rel == False? Otherwise when rel == True, all the symlinks will be relativized when the tarball is built (see make_package_relative in build_tarball) so they won't need to be relocated.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how make_package_relative knows which links to make relative. The same thing is done with RPATHS that are made relative.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok it was throwing me off that they were stored in the binary cache since the relative-izing was handled before creating it. However, there is a guard relocate_package which only uses the 'relocate_links' entry if rel == True. Even though it is entirely handled before creating the cache, it could still be worthwhile to keep 'relocate_links' around to indicate what was changed.

@scheibelp
Copy link
Copy Markdown
Member

Regarding #10073 (comment), I think that is important but can be handled in a later PR.

@scheibelp scheibelp merged commit c63c4a0 into develop Jan 11, 2019
@scheibelp
Copy link
Copy Markdown
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants