Fix r- package downloads from list_url by glennpj · Pull Request #29250 · spack/spack

glennpj · 2022-02-28T21:51:52Z

When the cran attribute is set, a url attribute is derived. This was
using the first version found in the array of versions. The expectation
was that if that version was not present at the URL, it would fall
back to list_url. That was not happening. Instead, the version in the
derived url was always associated with the url and was not falling back
to list_url. Thus, if the most recent version of a package in spack was
older than what was in CRAN, the download would fail, because the
archive had been moved. Using an older version for the derived url
would work but sometimes there is only one version present in spack
until the package is updated. Instead, use a fake version that would
never exist in the wild so that spack does not lock what it thinks is
the most current version to the url attribute.

Fixes spack#26977
Fixes spack#29204
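As a rough illustration of the description above, here is a minimal sketch of deriving a url and list_url from a cran name. It is not Spack's actual implementation: the helper names `derived_url` and `derived_list_url`, the `FAKE_VERSION` value, and the `known_versions` list are all hypothetical. Only the URL layout is taken from the listings later in this thread.

```python
# Hypothetical sketch; names are illustrative and not Spack's actual API.
CRAN_BASE = "https://cloud.r-project.org/src/contrib"

# Placeholder chosen so it never equals a real release (an assumption for this sketch).
FAKE_VERSION = "0.0.0-placeholder"


def derived_url(cran, version):
    """Build the top-level (current release) URL for a CRAN name and version."""
    return "{0}/{1}_{2}.tar.gz".format(CRAN_BASE, cran, version)


def derived_list_url(cran):
    """Build the archive listing URL where superseded releases are kept."""
    return "{0}/Archive/{1}/".format(CRAN_BASE, cran)


# Assumed package state: the package file only knows versions older than CRAN's newest.
known_versions = ["0.3.1", "0.3.0"]

# Old behavior: the url is pinned to the first known version, which CRAN may
# already have moved under Archive/.
old_url = derived_url("ellipsis", known_versions[0])

# Workaround: the url carries a version that no known version will ever equal.
workaround_url = derived_url("ellipsis", FAKE_VERSION)

print(old_url)
print(workaround_url)
print(derived_list_url("ellipsis"))
```

With the placeholder in place, no real version is tied to the top-level URL, so every real version has to be resolved some other way, which is the effect the workaround relies on.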

glennpj · 2022-02-28T21:55:47Z

This change will probably impact url parsing stats.

adamjstewart · 2022-02-28T21:57:23Z

This seems more like a workaround than a solution. If a version isn't found at the specified URL, we should always fall back to list_url. @trws could this be caused by your recent spack checksum changes?

glennpj · 2022-02-28T22:30:35Z

> This seems more like a workaround than a solution. If a version isn't found at the specified URL, we should always fall back to list_url.

That was the expectation, but I cannot say for sure that the fallback worked in the past within the window where CRAN had a newer version than the newest one in spack, with the url pointing to it. Generally speaking, the url pattern points to an older version when a new version is added. That was fairly consistent in the r- packages. With the introduction of the cran attribute, the url was always set to point to the most recent version that spack knew about. Prior to the cran attribute, that scenario is not something that would have been tested. Now that the cran attribute has been added to all CRAN packages, the problem will be more prevalent. My goal with this PR is to get r- package downloads working ASAP. I understand that there could be a more systemic problem that needs to be dealt with.

glennpj · 2022-02-28T22:36:07Z

> @trws could this be caused by your recent spack checksum changes?

I do not think so because #26977 was prior to that. At the time of that issue there were only a few r- packages using the cran attribute.

trws · 2022-02-28T23:12:53Z

It's possible. I tried to reproduce the original behavior whenever the generated version didn't match anything on the page, but maybe there's a corner case where the specified archive URL is inaccurate?

The behavior I intended, before I go look back at it, was to treat the archive URLs specified by a package as accurate, and not change them based on what's on the page. If the URL specified by the package is wrong, that might be different behavior than before because the old behavior was something like "use the last plausibly matching version found on the page." Let me poke at this for a minute, I didn't test this directly outside of checksum and create.

glennpj · 2022-02-28T23:37:40Z

Doing some more poking: if I set the url attribute to an older version, fetching fails for that version as well. So the issue is not that the url points at the newest version in the package file. Rather, any version that the url is set to cannot be fetched, because the fallback is broken. I see this with tcsh as well, so this is not specific to r- packages.

So, yes, this PR is a workaround to get r- packages working. It leverages the cran attribute to fake the version, ensuring that no real version equals what is in the url attribute. I am going to try to gather more information and create an issue specific to the failure to fall back to list_url.

@adamjstewart Does this sound okay?

1. Get r- packages working for now, because more will break as time goes on, and there are a lot of them.
2. Have an issue focused on url/list_url, not specific to r- packages, as this is not specific to R.
3. Revert this PR once things are fixed?

trws · 2022-02-28T23:41:15Z

Ok, I found at least one bug that's causing part of this problem. Have a look here:

spack/lib/spack/spack/cmd/checksum.py

Lines 77 to 85 in 6eef12c

    
    # And ensure the specified version URLs take precedence, if available
    try:
        explicit_dict = {}
        for v in pkg.versions:
            if not v.isdevelop():
                explicit_dict[v] = pkg.url_for_version(v)
        url_dict.update(explicit_dict)
    except spack.package.NoURLError:
        pass

Those lines overwrite the selections from my updates in web, to incorrect values for these packages. For ellipsis, before removing those:

==> [2022-02-28-15:37:02.702619] Found 8 versions of r-ellipsis:

  0.3.2    https://cloud.r-project.org/src/contrib/ellipsis_0.3.2.tar.gz
  0.3.1    https://cloud.r-project.org/src/contrib/ellipsis_0.3.1.tar.gz
  0.3.0    https://cloud.r-project.org/src/contrib/ellipsis_0.3.0.tar.gz
  0.2.0.1  https://cloud.r-project.org/src/contrib/ellipsis_0.2.0.1.tar.gz
  0.2.0    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.2.0.tar.gz
  0.1.0    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.1.0.tar.gz
  0.0.2    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.0.2.tar.gz
  0.0.1    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.0.1.tar.gz

After removing:

  0.3.2    https://cloud.r-project.org/src/contrib/ellipsis_0.3.2.tar.gz
  0.3.1    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.3.1.tar.gz
  0.3.0    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.3.0.tar.gz
  0.2.0.1  https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.2.0.1.tar.gz
  0.2.0    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.2.0.tar.gz
  0.1.0    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.1.0.tar.gz
  0.0.2    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.0.2.tar.gz
  0.0.1    https://cloud.r-project.org/src/contrib/Archive/ellipsis/ellipsis_0.0.1.tar.gz

That still doesn't resolve the specific version issue though, which uses a completely different codepath than what I touched. I don't see anything in here that looks like a fallback though. Still looking.
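The overwrite shown in the two listings above can be reproduced in isolation with plain dictionaries. This is only a sketch: `url_for_version` here is a stand-in that substitutes the version into the top-level URL pattern, and `package_versions` is an assumption based on which rows differ between the listings. Neither is taken from Spack itself.

```python
CONTRIB = "https://cloud.r-project.org/src/contrib"
ARCHIVE = CONTRIB + "/Archive/ellipsis"

# URLs as found by scraping (the "after removing" listing, abbreviated).
url_dict = {
    "0.3.2": CONTRIB + "/ellipsis_0.3.2.tar.gz",
    "0.3.1": ARCHIVE + "/ellipsis_0.3.1.tar.gz",
    "0.3.0": ARCHIVE + "/ellipsis_0.3.0.tar.gz",
    "0.2.0.1": ARCHIVE + "/ellipsis_0.2.0.1.tar.gz",
    "0.2.0": ARCHIVE + "/ellipsis_0.2.0.tar.gz",
}

# Assumption: these are the versions declared in the package file.
package_versions = ["0.3.2", "0.3.1", "0.3.0", "0.2.0.1"]


def url_for_version(version):
    """Stand-in for pattern substitution: always points at the top-level directory."""
    return "{0}/ellipsis_{1}.tar.gz".format(CONTRIB, version)


explicit_dict = {v: url_for_version(v) for v in package_versions}

merged = dict(url_dict)
merged.update(explicit_dict)  # explicit entries replace the scraped ones

# Versions whose scraped Archive/ URL was replaced by a top-level URL.
overwritten = sorted(v for v in url_dict if merged[v] != url_dict[v])
print(overwritten)
```

In this sketch only the archived versions that the package declares change; 0.3.2 is unaffected because both sources agree on it, and 0.2.0 is unaffected because it is not declared.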

glennpj · 2022-02-28T23:57:39Z

Here is some more info. I created a branch at the commit just prior to #28989. That gets fetching from list_url working again for r- packages and tcsh.

I also realized that #26977 is actually a different problem, and thus not fixed by this PR workaround.

glennpj · 2022-03-01T00:02:52Z

@trws The bug you found is #25831, which was also discussed a bit in #26977.

trws · 2022-03-01T00:32:44Z

Yeah, I see it. This one is caused by the fact that list_url is never actually checked directly, so there is no fallback to it as such. Instead, there is a fallback to a dynamic fetch strategy that is set up to use fetch_remote_versions, which uses list_url internally. My changes made the URLs provided by the package get prioritized over versions found by web scraping in this one corner case. PR forthcoming for both issues.
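To make the intended fallback concrete, here is a small simulation of trying the pattern-derived URL first and then a URL discovered through the list_url listing. It is a sketch under stated assumptions, not Spack's code: the functions below (including this local `fetch_remote_versions`) are hypothetical stand-ins, and the "server" is just a set of URLs.

```python
# Hypothetical sketch; none of these functions are Spack's actual API.

def fetch_remote_versions(list_url, remote_index):
    """Stand-in for scraping list_url: returns {version: url} found there."""
    return dict(remote_index.get(list_url, {}))


def candidate_urls(version, pattern_url, list_url, remote_index):
    """Return URLs to try for `version`: pattern-derived first, scraped second."""
    candidates = [pattern_url.format(version=version)]
    scraped = fetch_remote_versions(list_url, remote_index)
    if version in scraped and scraped[version] not in candidates:
        candidates.append(scraped[version])
    return candidates


def fetch(version, pattern_url, list_url, remote_index, available):
    """Try each candidate in order; `available` simulates what the server has."""
    for url in candidate_urls(version, pattern_url, list_url, remote_index):
        if url in available:
            return url
    raise LookupError("no URL found for version " + version)


pattern_url = "https://cloud.r-project.org/src/contrib/ellipsis_{version}.tar.gz"
list_url = "https://cloud.r-project.org/src/contrib/Archive/ellipsis/"
archived = list_url + "ellipsis_0.3.1.tar.gz"

# Simulated state: 0.3.2 is current, 0.3.1 has been moved to the archive.
remote_index = {list_url: {"0.3.1": archived}}
available = {pattern_url.format(version="0.3.2"), archived}

result = fetch("0.3.1", pattern_url, list_url, remote_index, available)
print(result)
```

The point of the ordering is that a stale pattern-derived URL costs one failed attempt rather than a failed download, as long as the scraped candidate is still consulted afterwards.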

glennpj · 2022-03-07T01:14:46Z

Closing in favor of #29258.

spackbot-app bot added the build-systems label Feb 28, 2022

glennpj mentioned this pull request Feb 28, 2022

Fetching not up-to-date versions fails #29204

Closed

3 tasks

trws mentioned this pull request Mar 1, 2022

R versions #29258

Merged

alalazo assigned trws and adamjstewart Mar 4, 2022

glennpj closed this Mar 7, 2022

glennpj deleted the r_downloads branch March 7, 2022 01:15

glennpj wants to merge 1 commit into spack:develop from glennpj:r_downloads
