Improved support for Go based packages #14061
hartzell wants to merge 18 commits into spack:develop from
Conversation
Looks like a False Negative test failure. Can I rerun them w/out pushing a meaningless commit?
Closing/reopening will do that. Also, I (as well as other Spack core devs) can restart the tests individually, so I have done that in this case to speed things along (generally, though, the CI doesn't take long to re-run from scratch and will probably happen faster than pinging a core dev).
adamjstewart left a comment
Left a few comments/suggestions. Overall, great work wrangling the Go build environment! My main question is why there are so many Go packages in Spack that don't use the GoPackage base class. Maybe we could do something like PerlPackage does where it supports both Makefile.PL and Build.PL build methods.
One more question. Should we preface all Go packages with
I don't think so, any more than we have rust-ripgrep (hey, we don't have ripgrep!) or c-gcc or c-c++-fortran-openblas (just made that up...). Given the way that Go works, there will never be any Go library packages (they'll all be resources defined within their consumers) and while it's interesting to call out the implementation language in a Hacker News link I don't see any point here.
Mostly because only the simplest of projects (e.g. build a single executable) can get away with it (e.g. The

It's sophisticated enough (e.g. build tags, build constraints,

But, like C/C++/Fortran/etc projects, there are things like running markdown through

Plus, some people Just Like Makefiles(TM); more seriously there might be a go-centric way to build a project but the authors just reach for the tool they know. I thought about dragging bits of

Go's story has also been evolving. The language itself has a crazy-good backward compatibility promise and story, but the path of the build tooling has been bumpier. In the early days it did a really good job of building things the way they do inside Google (mono-repo centric, etc...) but wasn't really a good fit for the unix-y way of doing things. GOPATH is the canonical example here.
ps. There might be additional consolidation opportunities for applications that build with go, but I'd prefer to wait and a) see how projects adapt-to/adopt modules and b) get a handful of real use-cases for further generalization. Otherwise I think YAGNI.
@adamjstewart -- any thoughts about the
pps. It seems like there should be a place in the doc tree to have a general discussion of issues related to writing package definitions for applications written in , but I couldn't find one. I guess that python/perl/... discussions fall easily into {Python,Perl}Package and the details of using C end up in the compilers section while the actual package definition falls into e.g. MakefilePackage or ... For now I've tucked the discussion into GoPackage, but I'm wondering if there's a need (although YAGNI...) for another docs section?
It's been years since we added a new directive. I see why it's useful here, so I have no objections to adding it. I'll let @scheibelp @becker33 @alalazo review that part. |
My last change to improve exception handling works for
Seems that having erroneous packages in the mock repository leads to bad things happening. I've asked for feedback/help on the Google group, input welcome...
Applications that use Go should not download dependencies at build
time. This commit adds documentation and tooling to that end. It
also cleans up existing packages, continuing to support existing
versions where possible and removing them when necessary.
It:

- Adds a new package directive, `import_resources`, which simplifies
  importing large numbers of resources. `import_resources` reads a
  JSON file containing an array of package definitions.
- Adds `GoPackage`, which provides support for simple Go applications:
  - adds `GoPackage`;
  - hooks it into `spack create`; and
  - documents it in the "Build Systems" section.
- Documents, in the "Build Systems" documentation for GoPackage, how
  to handle:
  - easy applications that use Go modules *and* vendor their
    dependencies;
  - slightly more complicated applications that use Go modules *but do
    not* vendor their dependencies (use `go mod download` and
    `modules2tuple` to generate resources that are imported via
    `import_resources`); and
  - difficult applications that need to be handled on a one-off basis.
- Adds/updates some existing packages to use the New World Order:
  - [x] direnv -- Had a vendor dir, but it wasn't being used; the fix
    is slightly tricksy.
    - No change in supported versions.
  - [x] docui -- A new package that uses GoPackage; it does not have
    vendored dependencies so it declares resources via
    `import_resources`.
    - New application, no previously supported versions.
  - [x] fzf -- The versions that we have use Glide and don't have a
    vendor dir; their Makefile populates vendor at install time.
    Newer releases (0.18.0, 0.19.0) use go.mod. This drops the older
    releases as unsupportable, adds 0.1[89].0, and cleans up the
    package.
    - Deletes versions '0.17.5', '0.17.4', '0.17.3', '0.17.1', '0.17.0-2',
      '0.17.0', '0.16.11', '0.16.10', '0.16.9', '0.16.8'.
    - Adds versions '0.19.0', '0.18.0'.
  - [x] git-lfs -- It vendored its dependencies but they weren't being
    used. Set GO111MODULE and GOFLAGS so that they are.
    - No change in supported versions.
  - [x] go-md2man -- Has vendored its dependencies, but we weren't
    using them; switched to GoPackage and now we are.
    - No change in supported versions.
  - [x] hugo -- Well mannered, uses modules and `go build` but no
    vendored dependencies; switched to GoPackage and
    `import_resources`.
    - No change in supported versions.
  - [x] lazydocker -- A new package that uses GoPackage and has
    vendored dependencies.
    - New application, no previously supported versions.
  - [x] lazygit -- A new package that uses GoPackage and has vendored
    dependencies.
    - New application, no previously supported versions.
  - [x] modules2tuple -- For now, my fork of the FreeBSD tool, which
    can generate resource statements; it uses go.mod and has no
    dependencies.
    - When the dust settles my changes will be merged upstream and
      this package should be updated to point there.
    - New application, no previously supported versions.
  - [x] rclone -- Well behaved, uses go.mod and has vendored its
    dependencies; switched to GoPackage.
    - No change in supported versions.
  - [x] singularity -- releases from 3.2.0 onward seem to be well behaved
    wrt modules. I've taken the liberty to drop 3.1.1 because
    making it work is messy.
    - Deletes '3.1.1'.
  - [x] skopeo -- Recent versions support modules and have vendored
    things. I've constrained go to @1.11 and disabled the older
    versions...
    - Deletes '0.1.37' and '0.1.36'.
  - [x] umoci -- They have a vendor dir and @0.4.3: have go.mod, but
    they don't appear to use it in their build. This touches up their
    Makefile so that go runs in module mode and makes the install do
    the right thing. BUT, only for 0.4.3:.
    - Deletes '0.4.2', '0.4.1', and '0.4.0'.
As applications adopt the Go modules system and become easier to deal
with, the community will need to decide how to deprecate and remove
support for old, non-conforming versions.
Closes spack#13023
Closes spack#13973
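To make the `import_resources` idea concrete, here is a small, standalone sketch of the kind of JSON array such a directive might read. The field names (`name`, `url`, `sha256`, `when`, `destination`) are my assumptions for illustration, not the PR's actual schema, and the module names/URLs are examples:

```python
import json
import tempfile

# Hypothetical example of a resources file: an array of definitions,
# each saying what to fetch and where it lands in the vendor tree.
# (Schema and entries are illustrative assumptions.)
example = [
    {"name": "github.com/spf13/cobra",
     "url": "https://github.com/spf13/cobra/archive/v0.0.5.tar.gz",
     "sha256": "0" * 64,
     "when": "@0.19.0",
     "destination": "vendor/github.com/spf13/cobra"},
    {"name": "github.com/mattn/go-runewidth",
     "url": "https://github.com/mattn/go-runewidth/archive/v0.0.4.tar.gz",
     "sha256": "1" * 64,
     "when": "@0.19.0",
     "destination": "vendor/github.com/mattn/go-runewidth"},
]

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(example, f)
    path = f.name

# A directive implementation would turn each entry into a resource()
# call; here we just load the array back and sanity-check it.
with open(path) as f:
    entries = json.load(f)

assert isinstance(entries, list)
print(len(entries))               # 2
print(entries[0]["destination"])  # vendor/github.com/spf13/cobra
```

The point of the flat array is that a tool like modules2tuple can generate it mechanically from go.mod/go.sum data, one entry per module.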
- don't assume order in lists, use set()'s
- touch up ReST in gopackage docs
...still need to:
- add an Important Files section
- address build_args
- add a section with links to additional reading.
Avoid footshooting if/when a user overrides setup_build_environment and this ends up not being set....
- Not everything involved in a test needs a "test" prefix. Do what the other test repos do and use evocative names that begin with "import-resources".
- Produce more useful exception messages, in particular for bad JSON errors, and update tests to test for them.
updated to pick up new singularity version and latest changes.
scheibelp left a comment
I have some initial comments but I think I have a more fundamental question that would make sense to answer before you dive into addressing the individual comments (which most likely arises from my unfamiliarity with Go and its ecosystem):
I initially wondered why not use go get -d to fetch everything up-front. It looks like GoFetchStrategy already does go get -d, but doesn't handle the archiving of downloaded resources for offline installations; I assume that go get -d isn't repeatable in the sense that different dependencies may be retrieved (e.g. if they are updated between two invocations of go get -d) so checksums would be useless, but this could be similar to git (which also doesn't use checksums).
The only package that appears to use GoFetchStrategy is the-platinum-searcher, so I'm curious if there are other issues with go get -d. If GoFetchStrategy were updated to handle archival, do you think it would be easier to make Go packages in Spack (vs. using the approach you have provided here)?
```python
from spack import *


class ImportResourcesZero(Package):
```
Is this to test the robustness of the system, or is there a pragmatic use for empty resource files? I suppose if someone integrated modules2tuple into an automatic pipeline that it could generate empty resource files?
```python
    # json error message never says "JSON", be helpful...
    raise ResourcesFileError(
        "Unable to load resources file, bad JSON: {0}".format(e))
except Exception as e:
```
I'm thinking the only two exceptions that would be raised here are JSON issues, and SpackErrors (other types of errors would be bugs in Spack). Furthermore, the SpackErrors will already be formatted to our satisfaction, so I'm inclined to say there is no need to go out of our way to catch exceptions other than the JSONDecodeError.
So you're suggesting that I should delete the except Exception as e block? Anything else I need to do to pass the rest of the errors up (my python hat's over on the coatrack...)?
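A minimal sketch of the narrower handling being suggested: catch only the JSON decode error and let everything else propagate. `ResourcesFileError` here is a stand-in for the PR's exception class; note that Python 3's `json.JSONDecodeError` subclasses `ValueError`, which is what `json` raises on Python 2, so catching `ValueError` covers both:

```python
import json


class ResourcesFileError(Exception):
    """Stand-in for the PR's error class."""


def parse_resources(text):
    try:
        return json.loads(text)
    except ValueError as e:  # json.JSONDecodeError on py3; plain ValueError on py2
        # json's error message never says "JSON", be helpful...
        raise ResourcesFileError(
            "Unable to load resources file, bad JSON: {0}".format(e))
    # No blanket `except Exception`: anything else is a bug (or an
    # already-formatted SpackError) and should propagate unchanged.


print(parse_resources('[1, 2, 3]'))  # [1, 2, 3]

try:
    parse_resources('not json')
except ResourcesFileError as e:
    print("caught: {0}".format(type(e).__name__))  # caught: ResourcesFileError
```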
```python
version('2.0', 'hash2')
version('1.0', 'hash1')

import_resources("resources.json")
```
I'm going to have to brainstorm more on how best to handle this:
Any test that ends up iterating over packages will fail because of this test package (because this is executed at package definition time). See https://travis-ci.org/spack/spack/jobs/633540326?utm_medium=notification&utm_source=github_status for an example.
Spack tests generally step around this issue by testing the underlying directive functions directly (see patch.py and variant.py test modules). You can also define an entirely separate mock repo (which would probably be useful for us to test other package failures).
> I'm going to have to brainstorm more on how best to handle this:
I'll wait and see what the storm brings.
```python
version('0.18.0', sha256='5406d181785ea17b007544082b972ae004b62fb19cdb41f25e265ea3cc8c2d9d')

depends_on('[email protected]:')
import_resources('resources-0.19.0.json')
```
Is there a reason you don't include a when constraint? I assume when='@0.19.0' would be correct. To be clear, it's not technically wrong to make this unqualified.
The when constraints are included in the json definitions, which are then fed into the resource parser. I fixed the bad example in the GoPackage docs above.
**Summary**

I've written up a couple responses to this and/but am having a tough time coming up with something brief/simple (cue obligatory Einstein quote). I can't seem to say less, but I can definitely say more, so if something's not clear, please ask....

This PR tries to meet Spack's goals (repeatability via signatures; caching and airgapped builds) using Spack's techniques and tooling. The things required to build a package have standard Spack signatures and are fetched, cached, and unpacked into the build tree using existing Spack machinery. It interacts with the Go build process via the

In contrast to using our existing fetch machinery, it's clearly possible to write an additional fetcher that uses

But,

Furthermore, a

I think that this is both "simple" and "easy", in Hickey's sense; while there may be many resources they're discrete (hence "simple") and the approach uses existing Spack tooling (hence "easy"). The one new bit of code, which automates resource definition creation, can share code with a sibling project (although if we have to Invent It Here, someone certainly can). It's not magical or lazy but I think that it strikes the correct balance between the effort it requires from package authors and the effort (additional code, cognitive/support load) it requires from the Spack tooling.

It (and #13973) also opens the door to supporting other languages without explicitly building Spack packages for everything in their PyPI/CPAN equivalent. pypi-poet does/did something similar for Python applications in Homebrew (that ship has sailed in Spack, but it's still an interesting thought experiment, see also #8364). It might still work well for us with Lua, Haskell, Rust, or ....

**Details**

Some relevant Go factoids:
**Repeatability**

A design for packaging Go-based applications in Spack needs to decide how to ensure that the things being used during the build are what they're expected to be. There seem to be three solutions:
The first choice isn't great, particularly given this PR as an existence proof of a better alternative.

The second choice requires no additional work of package authors. The standard

The third choice (which I went with) is to find some manageable unit, calculate Spack signatures for it and use it in the build process. This requires additional work by the package author/maintainer. It can use existing Spack mechanisms. It doesn't require additional conversations with our skeptics (or coworkers). The unit to be managed is an open question (see below).

**Caching**

Spack generally supports "air gapped" builds and packaging for Go-based applications should too. While a system that grabbed the dependencies at build time might meet Spack's repeatability goal (see above), it wouldn't allow offline builds.

Spack has machinery for fetching and caching application source code and additional resources; this tooling is used by nearly all of the packaged applications. Bug fixes and improvements are localized and benefit all of the packages that rely on the standard machinery.

The Go module system also supports caching via "proxies" that satisfy the requests made by

That leaves the question of what unit to work with, how to fetch it, and how to wire it into the build:
**Etc...**

While neither the structure of the things in

See the parent RFC (#13023) for even more details, including a review of how other package managers support Go-based applications.
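The proxy mechanism mentioned above is a real Go feature worth spelling out: a module cache previously filled by `go mod download` can be served back to the toolchain as a `file://` GOPROXY, which gives fully offline builds. The cache path below is just an example:

```shell
# Example only: a directory where a prior `go mod download` (run with
# GOPATH pointed here) stashed its module cache.
SPACK_GO_CACHE=/tmp/spack-go-cache

# The download cache under pkg/mod/cache/download has exactly the
# layout a GOPROXY serves, so it can be reused as a file-based proxy.
export GOPROXY="file://$SPACK_GO_CACHE/pkg/mod/cache/download"
echo "$GOPROXY"
```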
ps. Thanks for the review, I'll follow up on the individual comments.
These are docs and consistency changes. There are a couple more comments to address (exception handling and testing).
FWIW: here's a link to a discussion about deprecating GOPATH on the go-nuts mailing list.
```rst
The best approach to packaging a Go-based project for Spack depends on
the details of the project. ``GoPackage`` is a good fit for simple
situations but another package, e.g. ``MakefilePackage``, might be
```
Would it generally be advantageous to combine the logic from GoPackage and MakefilePackage? Is the issue with doing this the possibility of MRO conflicts?
If yes/yes: I could think about how to combine GoPackage with other build systems.
I don't think so, at least immediately. If something uses a Makefile, then adding a dependency on a go compiler, declaring resources via modules2tuple (or by hand if you're a masochist), and using MakefilePackage seems like a nice, simple strategy. I'm open to something more complicated but don't see any need so far. At some point down the road it might also make sense to make the go compiler "virtual", satisfiable via go or gccgo or tinygo or one of the LLVM-based compilers. But not until there's a real use case.
FWIW, here's the current (including this PR) breakdown of build techniques:
| APP | STRATEGY |
|---|---|
| hugo | go build |
| rclone | go build |
| lazygit | go build |
| docui | go build |
| go-md2man | go build |
| lazydocker | go build |
| modules2tuple | go build |
| glow | go build |
| git-lfs | makefile |
| skopeo | makefile |
| umoci | makefile |
| singularity | makefile |
| fzf | makefile |
| direnv | makefile, w/ weird custom logic |
Is go.mod support required transitively? Or can dependencies exclude go.mod files and thereby introduce nondeterminism into a build? If go.mod ensures determinism for all transitive dependencies, then I think it would be worth trusting it. Responding more generally to #14061 (comment), there are two benefits of pursuing an automatic solution if it is available (even if it only applies to a subset of Go packages):
It sounds like there is at least one automatic solution that handles repeatability and caching/offline-installation, although it requires go.mod support. It looks like go.mod is expected to be the way forward (i.e. how all packages are distributed), although I don't know e.g. what percentage of commonly-used packages still need to be updated to use go modules. If that sounds correct then I don’t think
is a problem. For packages without go.mod support, I think your approach is ideal; it is perhaps the only way to ensure determinism, in particular across differing Spack instances (one Spack instance could cache the dependencies for a Go package and create the same result for a user on that particular instance, but there is no mechanism to ensure that a user with another Spack instance would use the same Go dependencies). However, if an automatic go.mod-based solution is possible, then it may be worthwhile to generate go.mod files directly vs. resource.json files.
Edit: I wish I had posted my second followup comment before this one, perhaps read it first.
An application uses Go modules to transitively describe its dependencies, allowing it to be built deterministically. Using Go modules is the officially supported mechanism for determinism. The details are messy and are considered private to the toolchain's implementation. The toolchain provides no support for integrating with external package managers. This PR, in a nutshell, trusts the modules data and uses it to build a set of resources which provide determinism and caching, uses those resources to construct a "vendor" dir, and configures the toolchain to use that vendor dir.
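The nutshell above (resources unpacked into a vendor tree the toolchain is pointed at) can be illustrated with a tiny standalone sketch. The module names are examples, and the modules.txt handling is deliberately simplified; the real file written by `go mod vendor` has a richer format:

```python
import os
import tempfile

# Example resource list: (module path, version) pairs, mirroring the
# PR's described behavior of one discrete resource per module.
resources = [
    ("github.com/spf13/cobra", "v0.0.5"),
    ("github.com/mattn/go-runewidth", "v0.0.4"),
]

stage = tempfile.mkdtemp()
vendor = os.path.join(stage, "vendor")

for module, version in resources:
    # Spack's resource machinery would unpack each fetched tarball here.
    os.makedirs(os.path.join(vendor, module))

# `go mod vendor` also records what it vendored; this is a simplified
# stand-in for that bookkeeping.
with open(os.path.join(vendor, "modules.txt"), "w") as f:
    for module, version in resources:
        f.write("# {0} {1}\n".format(module, version))

print(sorted(os.listdir(os.path.join(vendor, "github.com"))))  # ['mattn', 'spf13']
```

With the tree in place, the build just runs in module mode with the vendor directory enabled, never touching the network.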
Perhaps Spack's packaging support should require the least possible effort on the part of package authors, but that's not the only interesting metric. The code base needs to be approachable and maintainable, the security/determinism guarantees need to be simple and easy to explain, etc. One of the things that I really like about Go is its emphasis on simplicity (simplicity is complicated, and made easy), even when it requires writing more (albeit simpler) code. One classic example is Go's eschewing exception handling in favor of returning (and checking) explicit error values. This PR's implementation seems to be the simplest (explicit, use of stable interfaces, minimal additional code) of the options.
I'm not sure what solution you're referring to, but please keep reading. Go modules are the future; it's not worth adding Spack infrastructure for applications that don't use modules, they're one-offs or backwards-compatibility problems and should be handled as such. There are two issues: how to use the Go modules system to provide trustworthy (deterministic) builds and how to cache the dependencies. Solutions include:
Do you have a solution that isn't one of the options that I've outlined? Given the discussion above, how do you think an implementation should work, end-to-end?
Perhaps I can try to summarize what I hear you suggesting/asking for. Given two implementations:
In both cases someone (the user or Spack) will need to ensure that proper flags/vars are set. PR reviewers will need to ensure that nothing slips through. I prefer the first. I think that it's clean, simple, adds a single new general feature that might be useful for other languages (but proof, pudding, etc) and takes advantage of an existing tool. I believe you're arguing for the second. For what it's worth, one could build the second around the vendor directory support.
A variation would be to build tooling that created a resource for the vendor directory, with signatures and etc.
Regarding #14061 (comment)
No: I think you outlined what I was hoping for with:
On that note, is there a reason to doubt Go's determinism guarantees? Are there cases where it is officially nondeterministic? To me it seems like using modules2tuple places a similar trust in Go
Would go.mod +

Regarding #14061 (comment)
Yes. Or at least: I want to explore whether there are advantages to the second.
It would: a GoFetchStrategy could be updated to specifically retrieve all the artifacts associated with a go.mod file. It would also be responsible for storing these artifacts in the cache, and furthermore later on it would be responsible for understanding that it doesn't need to retrieve an artifact that is mentioned in a go.mod because it is already in the cache. If that functions correctly, then only two things have to be true for a Spack Go package to get reproducible and/or offline installs:
As more Go packages include a go.mod file, the amount of work required for the Spack package creator will be very small (in the alternative I am proposing, they don't have to use modules2tuple, although they may want to use it if it generates go.mod files and the Go package doesn't include one).
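To make the alternative concrete, here is a rough standalone sketch of the responsibilities such a fetcher would have. This is not real Spack code: the class name, method names, and interface are my invention for illustration, and the real `FetchStrategy` base class is not reproduced here. The environment handling reflects how the Go toolchain of that era worked (the module cache lived under `$GOPATH/pkg/mod`):

```python
import os


class GoModFetchStrategy(object):
    """Hypothetical sketch of a go.mod-aware fetch strategy.

    fetch() would run `go mod download` with the module cache pointed
    into Spack's cache directory; an archive step would then tar that
    cache up so later (possibly offline) installs never hit the network.
    """

    def __init__(self, stage_dir, cache_dir):
        self.stage_dir = stage_dir  # where the package source (and go.mod) lives
        self.cache_dir = cache_dir  # Spack-managed location for fetched modules

    def build_env(self):
        # Force module mode and point the module cache ($GOPATH/pkg/mod)
        # into Spack's cache so every download is captured.
        env = dict(os.environ)
        env["GO111MODULE"] = "on"
        env["GOPATH"] = self.cache_dir
        return env

    def fetch_command(self):
        # Command that would be run in the package's stage directory.
        return ["go", "mod", "download"]


s = GoModFetchStrategy("/tmp/stage", "/tmp/spack-go-cache")
print(s.fetch_command())        # ['go', 'mod', 'download']
print(s.build_env()["GOPATH"])  # /tmp/spack-go-cache
```

Offline-install support would additionally require checking the cache before running the fetch command, as described above.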
NOTE: I've been loosely claiming that the json resources include a "signature". I just reviewed what I actually wrote in
Not that I'm aware of. I have at least as much faith in it as I do in Spack's signatures, based simply on the eye-share and the amount of money behind the effort. I, personally, would be happy to depend on it. For the purposes of this PR, however:
It doesn't. In this workflow the Spack package author runs
The Spack package author could/should review the contents of the vendor dir they create and confirm that the I believe that this parallels the
In contrast, a

All of that said, I'm OK with this approach. But, I doubt that I'm a good speaker for the more security-conscious members of the Spack community.
We need to ensure the compiler's running in module mode and we need to invoke the use of the vendor dir. This can be achieved via environment variables and/or command line switches and varies slightly with compiler versions. I don't have the details at hand, but it's doable. My main point with this bit was that even if we set things up the way we need them, a sufficiently foolish package can screw it up. This is true for both this PR's implementation or one that uses a new GoFetchStrategy. It doesn't matter how much care we put into building the vendor dir if the
That'd be great. If the powers-that-be approve of this approach then it's the way to go.
If an application doesn't use Go modules, then there's no supported, systematic mechanism for deterministic builds. NOTE: I've been loosely claiming that the json resources include a "signature". I just reviewed what I actually wrote and it's referring to commits in a Git repository, which falls under our "we trust git commit hashes" promise.
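The module-mode/vendor setup discussed above can be sketched with the standard Go environment variables. These variables are real (GO111MODULE and GOFLAGS exist since Go 1.11; `-mod=vendor` is a standard build flag), though as noted, the exact values needed vary slightly across Go versions:

```shell
# Force module mode even inside $GOPATH (needed for Go 1.11-1.15,
# where GO111MODULE defaults to "auto"), and make every `go build`
# use the in-tree vendor directory instead of the network.
export GO111MODULE=on
export GOFLAGS=-mod=vendor

# A package's build step would then be a plain `go build`; here we
# just show the values the toolchain will see.
echo "GO111MODULE=$GO111MODULE GOFLAGS=$GOFLAGS"
```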
W.R.T. support in the
@hartzell what's the status of this PR? Anything I can do to help get it merged?
Thanks @hartzell for preparing this PR. I have a few concerns regarding scenario 1 "dependencies are resources":
Sorry if asking something obvious.
Closing since this has been stale for a long while. Feel free to reopen if you want to continue working on it. |
TODO:

- `spack create` screws up when it encounters the common Go application practice of attaching tarballs containing pre-built binaries to each GitHub release. Life would be easier if we resolved "create (via find_versions_of_archive()) does wrong thing for Go projects on GitHub with GoPackage" #13940.