Conversation
|
|
||
|
|
||
| class RBiocgenerics(Package): | ||
| class RBiocgenerics(RPackage): |
There was a problem hiding this comment.
This was the only package that was even slightly different from the rest. I confirmed that the changes I made allow it to still install correctly.
|
The R build system is remarkably uniform. For example, almost every package has a url = "https://cran.r-project.org/src/contrib/Rcpp_0.12.6.tar.gz"
list_url = "https://cran.r-project.org/src/contrib/Archive/Rcpp"Due to the presence of capital letters, I can't simply add: url = "https://cran.r-project.org/src/contrib/{0}_{1}.tar.gz".format(self.name, self.something_that_gets_me_the_latest_version)
list_url = "https://cran.r-project.org/src/contrib/Archive/{0}".format(self.name)to the EDIT: Actually, if I added a |
I totally agree with the changes, almost all the R packages have the same structure except for the dependencies and links to the source files, so in my opinion there is no any risk with the change. In fact, this change could make easier to create new R packages if we combine it properly with what you mentioned in your second comment, given that it reduces the cumbersome part of copy the same code once and again and make the scrapy part manually. |
Regarding to this part, to take the capitalized name from the given URL seems to me a quick and good option. In addition, I think that this process could be even more automatic for new R packages creating a new kind of
In my opinion, I wouldn't remove these links from the package files. I think that it consistently maintains the R package format equal to the rest of Spack packages. In fact, some R packages (like r-irkernel) may have other different source as github. |
alalazo
left a comment
There was a problem hiding this comment.
This avoids a lot of duplicated code. I like it! 😄
|
@JavierCVilla I think all of those are good ideas. @citibeth and I are suggesting similar improvements to the creation of Python packages in #2709, #2718, #2749, and #2750. I currently have a lot of PRs open that affect |
I say this not because I don't think it should be done in this PR but mostly because if a single one of the 173 files I've already changed is edited in the meantime, I'll need to continue to rebase and resolve any conflicts. |
|
I have a proposal brewing, will try to write it up by tonight... |
|
I think this is safe to merge. It affects a lot of packages, so I would like to get it merged sooner rather than later. There are a lot of fancy things we can add to |
|
@glennpj: any comments? This looks good to me. Thanks @adamjstewart! |
|
This looks good to me. My only comment is on the BiocGenerics package. That is a snowflake, as @adamjstewart noted. That is a BioConductor package and versions are tied to bioconductor suite versions, which in turn are tied to R versions. The reason for the bioc- prefix was to indicate the version of bioconductor that the package worked in. It is not the version of biocgenerics, which is different, currently 0.20.0. That said, most people using bioconductor packages do not care about the versions of the individual packages, but rather the version of the bioconductor suite, so Adam's changes for the version should work just fine. I am not as sure about the dependency specs though. Bioconductor is really picky about the R version and the versions are not always in sync, as they appear in the package.py file. For instance, the current version of bioconductor is 3.4 and that works with R-3.3.1. I think specifying R version ranges in the package.py may have spack potentially getting bioconductor packages out of sync with bioconductor "supported" versions of R. I believe biocgenerics is the only bioconductor package in spack at the moment. It is a core bioconductor package. |
|
Thanks for the knowledge @glennpj! My users don't use R very heavily so I admittedly don't have much experience with it.
Hmm, this is quite different from every other package in Spack. Bioconductor is a collection of R packages? Perhaps we could use real versions for each package and create a
All I did was move the constraints from |
citibeth
left a comment
There was a problem hiding this comment.
I realize this is cleaning up an old problem.... but is it possible to get rid of git= download methods and replace them with checksummable URLs? I found three places in this commit where that could be done.
| version('bioc-3.3', | ||
| version('3.3', | ||
| git='https://github.com/Bioconductor-mirror/BiocGenerics.git', | ||
| branch='release-3.3') |
There was a problem hiding this comment.
Can you replace these downloads with a checksummable URL? Something like this:
url = "https://github.com/Biocondutor-mirror/BiocGenerics/tarball/release-3.3"
version('3.3', '...')```
| @@ -35,8 +35,6 @@ class RIrkernel(Package): | |||
| version('master', git='https://github.com/IRkernel/IRkernel.git', | |||
| tag='0.7') | |||
|
|
|||
There was a problem hiding this comment.
Can you replace these downloads with a checksummable URL? Something like this:
url = "https://github.com/IRKernel/IRKernel/tarball/0.7"
version('0.7', '...')```
In general I agree with @adamjstewart here; and certainly, putting it in the individual packages is not the place to go. But I'm left wondering what the "right" approach is? Apparently, Bioconductor is a repository of 2600 packages. We're back to the same problem of dealing with external repos. @JavierCVilla can you expand upon how you are intending to use the
That looks good to me. |
|
I would like to leave any changes to biocgenerics for another PR. It is well outside the scope of this PR, and from what it sounds like the exact version number used is very sensitive. I definitely don't know enough about it to trust myself to make that change. @tgamblin It sounds like @JavierCVilla and @glennpj are ok with all of the changes in this PR. Once this is merged, either one of them can continue tuning the biocgenerics package. |
|
@citibeth: I'm tempted to include the needed package here and let the people who use R and Bioconductor work out the versioning. Including one package from it shouldn't be a huge deal, and if they want to version-lock it, they can |
|
Thanks @glennpj! |
|
I know the bioconductor issues are out of scope for this ticket but I would like to capture one more important item for future reference. The bioconductor project does not keep older versions of individual packages around, only the current version is ever available. They recently starting keeping the versions associated with a particular version of the suite in git with branches. That is the only way to make sure that an installed version of a package can be reinstalled at some later date. For example, whichever version of biocgenerics is used for bioconductor-3.3 will always be available in the 'release-3.3' branch of the bioconductor git mirror. |
|
@glennpj: We're setting up an S3 bucket at LLNL to archive versions of Spack packages. We will probably add it as a mirror in |
GitHub is good at making tarballs on demand. Therefore, the practice described above is equivalent to keeping old tarballs around for re-installation later. If you have your source on GitHub, manually making tarballs is a waste of time (IMHO); you get exactly the same effect (including checksummability) by setting a tag or (non-moving) branch, and forming URLs to access that branch. On my projects, I don't even bother generating tarballs for this reason. That's the principle behind my suggested changes above: If it were me, I'd be happy to create the Spack package to build directly from the GitHub-generated tarballs, and to ignore the "latest" release (which changes all the time anyway). |
I'm not sure what I think of Spack becoming the ONLY source for old tarballs that the authors have, for one reason or another, decided to make disappear. |
|
We don't have to keep them forever, but this is not unlike what Linux distributions do. |
|
Also we don't have to be the only source -- as the project gains ground we may find that others are willing to provide mirrors. |
|
Well... some discretion might be in line here. If a project is just broken
and doesn't keep old tarballs, then I believe we should do so. If a
project specifically removes an old tarball because of a security problem,
then I don't think we should keep it.
…On Sat, Jan 7, 2017 at 9:42 PM, Todd Gamblin ***@***.***> wrote:
Also we don't have to be the only source -- as the project gains ground we
may find that others are willing to provide mirrors.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#2761 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AB1cd4nVMDl39h1CMJdKEzYHGx9R73l0ks5rQE0bgaJpZM4LcLpc>
.
|
|
I'm not sure how easy it will be to tell the difference. If it's something like libpng where we know there is a security vulnerability in older versions, then we should remove those versions. But as we get more and more packages in Spack, it will be more difficult to track down. I think the mirror should only provide versions present in Spack |
I'm actually not sure this is the way to go. I'd rather just keep the versions and warn the user about vulnerabilities. If we remove old versions, we make it hard to reproduce an old install that maybe used a vulnerable version. I think adding something to mark versions vulnerable or insecure (like preferred=True) might be better... |
Same as #2709 but for R packages.
@glennpj @JavierCVilla You two seem to contribute the most R packages. Do these changes look safe to you?