Skip to content

Don't format local path when caching a local archive#13408

Merged
tgamblin merged 1 commit intospack:developfrom
scheibelp:bugfix/dont-format-local-path
Oct 23, 2019
Merged

Don't format local path when caching a local archive#13408
tgamblin merged 1 commit intospack:developfrom
scheibelp:bugfix/dont-format-local-path

Conversation

@scheibelp
Copy link
Copy Markdown
Member

@scheibelp scheibelp commented Oct 23, 2019

Fixes #13404

fd58c98 formats the Stage's archive_path in Stage.archive (as part of web.push_to_url). This is not needed and if the formatted differs from the original path (for example if the archive file name contains a URL query suffix), then the copy fails.

This removes the formatting that occurs in web.push_to_url. Admittedly the archive path chosen by the stage should strip off strings like ?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Flibuuid%2F&ts=1433881396&use_mirror=iweb from libuuid-1.0.3.tar.gz?r=http%3A%2F%2Fsourceforge.net%2Fprojects%2Flibuuid%2F&ts=1433881396&use_mirror=iweb, but that should be handled when the archive names are chosen (i.e. in Stage.expected_archive_files).

See also: #11117 (comment)

…: the path as given is already confirmed to exist before the funciton is called
@scheibelp
Copy link
Copy Markdown
Member Author

@opadron @tgamblin

If one of you can review that would be excellent.

@tgamblin
Copy link
Copy Markdown
Member

@odoublewen

@tgamblin
Copy link
Copy Markdown
Member

This fixes #13404 for me. FWIW, the sourceforge URLs like the one listed there can be truncated to not include the query string (from "?" onwards in the URL), and it'll result in a much nicer archive name in the stage directory.

We might want to do that in general, just so that we do not have garbage URLs in Spack, but it is good that we can handle things when people use them.

@tgamblin
Copy link
Copy Markdown
Member

tgamblin commented Oct 23, 2019

@scheibelp: I'm not sure that we can generically strip off the query string in the URL all the time, but we might want to change the fetch strategy to pass -J / --remote-header-name to curl in addition to -O. In at least this case, that would result in a nicer filename after download. I am not sure how well -J will work across all the URLs in Spack but we could look into that later.

@tgamblin tgamblin merged commit f2ddffb into spack:develop Oct 23, 2019
@tgamblin
Copy link
Copy Markdown
Member

Thanks @scheibelp!



def push_to_url(local_path, remote_path, **kwargs):
def push_to_url(local_file_path, remote_path, **kwargs):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand how this change addresses #13404, but it seems to imply that there is something broken with how file:// URLs are parsed. It looks like urllib is stripping out the part of the filename that looks like query arguments, and that will still be a problem if we ever have to work with similar file paths, either as part of something internal to Spack (as in this case), or if a user ever hands us such a file path.

I'd like to take a closer look at this and possibly follow up with @scheibelp if I have any questions.

jrmadsen pushed a commit to jrmadsen/spack that referenced this pull request Oct 30, 2019
…ed to it (spack#13408)

fd58c98 formats the `Stage`'s `archive_path` in `Stage.archive` (as part of `web.push_to_url`). This is not needed and if the formatted differs from the original path (for example if the archive file name contains a URL query suffix), then the copy fails.

This removes the formatting that occurs in `web.push_to_url`.

We should figure out a way to handle bad cases like this *and* to have nicer filenames for downloaded files.  One option that would work in this particular case would be to also pass `-J` / `--remote-header-name` to `curl`.  We'll need to do follow-up work to determine if we can use `-J` everywhere.

See also: spack#11117 (comment)
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Oct 31, 2019
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Oct 31, 2019
scottwittenburg added a commit to scottwittenburg/spack that referenced this pull request Nov 6, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

sourceforge download filenames corrupted after #11117

3 participants