Conversation
Love it... As discussed in Slack. IMHO the first iteration should be dead stupid and simple, e.g. (need to think about this more):
As a second step each environment has
@haampie this is awesome! Is there a way we could integrate this into the existing mirror/buildcache commands without adding another Spack command just for OCI caches?
Exactly what I want to do! Wanna discuss the API and play around with it? I've also created an example for GitHub Actions here: https://github.com/haampie/spack-oci-buildcache-example
Awesome! Would be happy to chat and play around with this! I've wanted to create a https://nixery.dev-like system for Spack for a while, and this looks like a great start to that effort.
Yeah, it's much like that. It would be really interesting to generate manifests dynamically, which is what they do, I guess.
Regarding signing, if we consider
So many things are similar to our current
Also note that the signature may use a fancy method, but ultimately it's as strong as

* @scottwittenburg do we use the

I also wanna point out that their README says:

This is entirely true; we also have these issues:

We might not need to use / bootstrap

```console
# Create a file
$ echo "This is a file" > hello

# Create private and public keys
$ openssl ecparam -genkey -name secp384r1 -out private.pem
$ openssl ec -in private.pem -pubout -out public.pem

# Sign it
$ openssl dgst -sign private.pem hello > hello.sig

# Verify
$ openssl dgst -verify public.pem -signature hello.sig hello
Verified OK

# Modify, then verify should fail
$ echo "This data is tampered with!" > hello
$ openssl dgst -verify public.pem -signature hello.sig hello
Verification failure
```

Then container runtimes that somehow understand cosign will be able to verify too?
Not immediately impressed with sigstore for
I don't think the
RE how we do use the
Out of curiosity, did you try
This is really awesome!
This isn't a full review, but I think it hits on some of the major points, and I can finish going through the rest later.
Signing
I think leaving signing for later is fine. I thought about it and given the different trust model for registries (most users trust the registry or the uploader for an "official" part of the registry), this is very usable as-is. There is a lot to think about for signing as you outlined above.
Secrets
See requests below about secrets.
Factoring
There are clearly some parts where the abstraction is leaky -- stage and the buildcache code seem to really need to know what type of mirror they're talking to (and stage just skips OCI ones?)
Layout
I keep going back and forth on this:
On the one hand, I kind of don't care about what GitHub shows, as long as I can push and pull from the buildcache. On the other hand, it'd be nice if the UI were more intelligible to a passer-by. So I have some sort of stream-of-consciousness questions here:
- Does it make sense the way every buildcache is an image, and the packages are versions of that image?
- I could imagine people wanting to use this not just as a buildcache but as a way to version a spack-built container image over time, but this overloads the tags for package versions. If you wanted to version a stack over time, how would you do that? Another tag? I guess it's ok for image versions to coexist with the package versions, but they're not very discoverable.
- I could see a version of this where every Spack package ends up being an actual GitHub packages package in the web UI (i.e., here: https://github.com/haampie?tab=packages). The versioning becomes more straightforward that way... but pulling an entire buildcache would be really complicated, right? Every "container" there corresponds to a GitHub repo right? So we really need to fit the entire buildcache within one OCI container image in the registry? That makes me think this PR is doing the right thing.
- It seems like the UI for any file shows a `docker pull` command: https://github.com/haampie/spack-oci-buildcache-example/pkgs/container/spack-oci-buildcache-example/116359870?tag=index.spack. Is that just "what you get" when using OCI? I was wondering if maybe ORAS content-type tags would help. I looked at this: https://www.kenmuse.com/blog/universal-packages-on-github-with-oras/ but it doesn't really show you what that looks like in GH packages -- just what it looks like from the ORAS CLI. Curious whether @ChristianKniep has insights on this.
Anyway, all of those are just my personal observations, but one last question that matters more for review:
- How do we manage the OCI buildcache layout over time? Suppose I decide to change the way Spack OCI buildcaches are handled. How do current versions of Spack know that an OCI registry is too new, and how do new versions of Spack know that it is too old? I think we at least need a feature like this (something we've omitted in a few other places in Spack at our peril) to allow this feature to adapt over time.
- How is the buildcache format here, for OCI registries, tied to the regular filesystem buildcache format? Is that documented somewhere? If one changes, what happens to the other? How does that evolve over time? Probably there needs to be some dev documentation on how we've architected mirror backends (which is really what this is -- a new mirror backend that doesn't quite fit the traditional filesystem mold)
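The layout-versioning concern above could be handled with an explicit version marker checked on both read paths. A minimal sketch, assuming a hypothetical manifest annotation key and version scheme (neither exists in Spack today):

```python
# Hypothetical sketch of a layout-version handshake for the OCI buildcache.
# The annotation key "org.spack.buildcache.layout" and the version numbers
# are assumptions for illustration, not an implemented Spack feature.

SUPPORTED_LAYOUT_VERSIONS = (1,)


def check_layout_version(manifest: dict) -> int:
    """Reject manifests written with a layout version we don't understand."""
    annotations = manifest.get("annotations", {})
    version = int(annotations.get("org.spack.buildcache.layout", 1))
    if version not in SUPPORTED_LAYOUT_VERSIONS:
        raise ValueError(
            f"buildcache layout v{version} is not supported by this Spack "
            f"(supported: {SUPPORTED_LAYOUT_VERSIONS})"
        )
    return version


manifest = {"annotations": {"org.spack.buildcache.layout": "1"}}
assert check_layout_version(manifest) == 1
```

An old Spack reading a newer registry would then fail with a clear error instead of silently misinterpreting the contents.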
Ok those are my thoughts -- otherwise, this looks really good and I think it's close to complete. See below for further more detailed questions.
```rst
.. code-block:: console

   $ spack mirror add --oci-username username --oci-password password my_registry oci://example.com/my_image
```
I am not sure we should be storing cleartext passwords in config YAML. We've so far avoided having to do this... is there another way to do it for these registries?
Passwords are going to end up in weird places (like spack.yaml) and get pushed to public repos, and this is harder to control/understand with our layered config system.
I am not sure what a good way to do this is, but I looked around -- the two ideas that stood out to me were:
- Make it so that values of secret fields can be grabbed from env vars instead of encoded in the YAML, and make the value of the field an env var.
- Something like the `!secret` annotation here: https://www.home-assistant.io/docs/configuration/secrets/, where the value is `!secret name_of_secret` and `name_of_secret` is a key in some separate `secrets.yaml` that you can easily exclude from a git repo.
- Some variant of (2) where the user doesn't have to mark something `!secret` -- we could do it in the schema.

I haven't really dug into this -- do you know of better approaches?
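Idea (2) could be sketched as a small substitution pass over the loaded config. This is a hypothetical illustration; the `!secret NAME` string convention and the separate secrets mapping are assumptions, not anything Spack implements:

```python
# Sketch of the "!secret" idea: recursively replace "!secret NAME" values
# with entries from a separate secrets mapping (e.g. loaded from a
# git-ignored secrets.yaml). All names here are illustrative.


def resolve_secrets(config, secrets):
    """Return a copy of config with '!secret NAME' strings substituted."""
    if isinstance(config, dict):
        return {k: resolve_secrets(v, secrets) for k, v in config.items()}
    if isinstance(config, list):
        return [resolve_secrets(v, secrets) for v in config]
    if isinstance(config, str) and config.startswith("!secret "):
        return secrets[config[len("!secret "):]]
    return config


config = {
    "mirrors": {
        "my_registry": {"push": {"access_pair": ["user", "!secret gh_token"]}}
    }
}
resolved = resolve_secrets(config, {"gh_token": "s3cr3t"})
assert resolved["mirrors"]["my_registry"]["push"]["access_pair"] == ["user", "s3cr3t"]
```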
```python
mirror_urls = [
    url_util.join(mirror.fetch_url, rel_path)
    for mirror in spack.mirror.MirrorCollection(source=True).values()
    if not mirror.fetch_url.startswith("oci://")
]
```
This seems nasty -- shouldn't I be able to use an OCI mirror here? Why does stage care?
`oci://` isn't tree-based like http / ftp / s3 / etc., so `url_util.join(fetch_url, rel_path)` makes no sense.
lib/spack/spack/util/crypto.py (outdated)

```diff
-def checksum(hashlib_algo, filename, **kwargs):
+def checksum_fp(hashlib_algo, fp, *, block_size=2**20):
```
suggest renaming to `checksum_stream` (or something more intuitive than `fp`?); it also needs a docstring
sprinkled some typehints around too
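For context, a minimal sketch of what such a stream-based checksum helper could look like; the actual function in `spack.util.crypto` may differ in name and signature:

```python
import hashlib
import io
from typing import BinaryIO, Callable

# Minimal sketch of a block-wise checksum over a file object, along the
# lines discussed above. Not Spack's actual implementation.


def checksum_stream(
    hashlib_algo: Callable, fp: BinaryIO, *, block_size: int = 2**20
) -> str:
    """Hash a binary file object in fixed-size blocks; return the hex digest."""
    hasher = hashlib_algo()
    while True:
        block = fp.read(block_size)
        if not block:
            break
        hasher.update(block)
    return hasher.hexdigest()


digest = checksum_stream(hashlib.sha256, io.BytesIO(b"hello"))
assert digest == hashlib.sha256(b"hello").hexdigest()
```

Reading in blocks keeps memory use bounded even for multi-gigabyte tarballs.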
```python
# TODO: refactor this to some "nice" place.
if parsed.scheme == "oci":
    ref = spack.oci.image.ImageReference.from_string(mirror[len("oci://") :]).with_tag(
        spack.oci.image.default_tag(spec)
    )

    # Fetch the manifest
    try:
        response = spack.oci.opener.urlopen(
            urllib.request.Request(
                url=ref.manifest_url(),
                headers={"Accept": "application/vnd.oci.image.manifest.v1+json"},
            )
        )
    except Exception:
        continue

    # Download the config = spec.json and the relevant tarball
    try:
        manifest = json.loads(response.read())
        spec_digest = spack.oci.image.Digest.from_string(manifest["config"]["digest"])
        tarball_digest = spack.oci.image.Digest.from_string(
            manifest["layers"][-1]["digest"]
        )
    except Exception:
        continue

    with spack.oci.oci.make_stage(
        ref.blob_url(spec_digest), spec_digest, keep=True
    ) as local_specfile_stage:
        try:
            local_specfile_stage.fetch()
            local_specfile_stage.check()
        except Exception:
            continue
        local_specfile_stage.cache_local()

    with spack.oci.oci.make_stage(
        ref.blob_url(tarball_digest), tarball_digest, keep=True
    ) as tarball_stage:
        try:
            tarball_stage.fetch()
            tarball_stage.check()
        except Exception:
            continue
        tarball_stage.cache_local()

    return {
        "tarball_stage": tarball_stage,
        "specfile_stage": local_specfile_stage,
        "signature_verified": False,
    }

else:
```
I don't really like the two code paths here (I think you don't either)... is there a better way to factor this code? At the very least I'd separate the methods, or maybe stick the OCI part in the spack.oci module, but I'm not sure. It seems like there should be one single place for each mirror implementation... but it's kind of split now between build caches and source mirroring code.
I'm thinking it would be easier to have them share code if they're more structurally similar, which requires an overhaul of the buildcache file structure.
I ran into a race condition in GitLab CI, where two pipelines concurrently built the same spec but produced a different tarball (so a different shasum); the tarballs were uploaded from A first, then B, whereas spec.json was uploaded from B first, then A, so that spec.json listed the wrong shasum.
That can be avoided by actually making standard buildcache tarballs content-addressable, which mimics the OCI structure. Then in the same situation as above we'd end up with two uploaded tarballs (one "dangling") and a single (overwritten) spec.json, always listing a valid tarball shasum.
The assumption being that two builds of the same dag hash always produce an equivalent (but not necessarily bit-wise identical) tarball.
So, if buildcaches work like that, their (spec.json, tarball) pairs are pretty close to (manifest, layer) pairs from the OCI spec.
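The content-addressed layout described above can be sketched as follows; the path scheme and spec.json field names are illustrative, not the actual buildcache format:

```python
import hashlib

# Sketch of a content-addressed buildcache layout: the tarball lives under
# its own sha256, and spec.json records that digest. Paths and field names
# are assumptions for illustration.


def tarball_path(tarball_bytes: bytes) -> str:
    sha = hashlib.sha256(tarball_bytes).hexdigest()
    return f"blobs/sha256/{sha}"


def spec_json_entry(tarball_bytes: bytes) -> dict:
    return {
        "binary_cache_checksum": {
            "hash_algorithm": "sha256",
            "hash": hashlib.sha256(tarball_bytes).hexdigest(),
        }
    }


a, b = b"tarball from runner A", b"tarball from runner B"
# Concurrent pushes of non-identical tarballs land at distinct paths, so
# whichever spec.json upload wins still references a blob that exists.
assert tarball_path(a) != tarball_path(b)
assert spec_json_entry(b)["binary_cache_checksum"]["hash"] in tarball_path(b)
```

This is exactly the (manifest, layer) relationship in OCI: the manifest names layers by digest, so an overwritten manifest can never point at a blob that was clobbered.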
Decided in meeting that we should get OCI buildcache mostly as-is initially, then migrate the existing buildcache format to be content-addressed. So this can be a nice refactor later on, but doesn't prevent getting this in now.
I'm following the OCI Distribution Spec v1.0, and that only defines one endpoint for listing things: https://github.com/opencontainers/distribution-spec/blob/v1.0/spec.md#content-discovery, so tags are the only way to reindex. The ORAS spec is not finalized, so I'm not using it. My impression was that ORAS needs support in the registry w.r.t. garbage collection of dangling blobs, potentially an OCI spec v1 registry could delete blobs still required under ORAS conventions (but I have to double check this).
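Concretely, that content-discovery endpoint is `GET /v2/<name>/tags/list`, returning a JSON body with a `tags` array. A sketch of how reindexing could filter its response (the `.spack` tag suffix convention is from this PR; the filtering helper itself is hypothetical):

```python
import json

# Sketch: parse a /v2/<name>/tags/list response and keep per-spec tags.
# The spec_tags helper is illustrative, not Spack's actual reindex code.

sample_response = json.dumps(
    {
        "name": "haampie/spack-test",
        "tags": ["index.spack", "python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack"],
    }
)


def spec_tags(response_body: str):
    """Keep tags that look like per-spec buildcache entries."""
    tags = json.loads(response_body)["tags"]
    return [t for t in tags if t.endswith(".spack") and t != "index.spack"]


assert spec_tags(sample_response) == [
    "python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack"
]
```

Each surviving tag can then be resolved to a manifest, whose config blob holds the spec.json used to rebuild the index.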
Not entirely following :p isn't this an issue with current buildcaches too? If you concretize --fresh the same environment at different times, but push to the same buildcache, you get tons of different flavors of the same package, and there's no way to know which packages are new/current.
Haven't really checked this, but the `docker pull` command does work ;p replacing it with
I could add an annotation in the manifest file, but other things such as the tag name structure are hard to change, so we just have to get it right.
It's very different; I can document that. That's also the main reason the code is split: I cannot reuse stuff with computed URLs (for spec.json / tarball) that assume a filesystem-like structure, since it's only using an OCI API.
@haampie: LGTM! Can you please also add some documentation?
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs.

With this you can configure an OCI registry as a buildcache:

```console
$ spack mirror add my_registry oci://user/image                              # Dockerhub
$ spack mirror add my_registry oci://ghcr.io/haampie/spack-test              # GHCR
$ spack mirror set --push --oci-username ... --oci-password ... my_registry  # set login credentials
```

which should result in this config:

```yaml
mirrors:
  my_registry:
    url: oci://ghcr.io/haampie/spack-test
    push:
      access_pair: [<username>, <password>]
```

It can be used like any other registry:

```
spack buildcache push my_registry [specs...]
```

It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image:

```console
$ spack buildcache push --base-image ubuntu:23.04 my_registry python
Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack

$ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack
```

which should really be a game changer for sharing binaries.

Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command.

An end-to-end example of how to use this in GitHub Actions is here: **https://github.com/haampie/spack-oci-buildcache-example**

TODO:

- [x] Generate environment modifications in config so PATH is set up
- [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification)
- [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API)
- [x] Add logic to use object storage in an OCI registry in `spack install`.
- [x] Make the user pick the base image for generated OCI images.
- [x] Update buildcache install logic to deal with absolute paths in tarballs
- [x] Merge with `spack buildcache` command
- [x] Merge spack#37441 (included here)
- [x] Merge spack#39077 (included here)
- [x] spack#39187 + spack#39285
- [x] spack#39341
- [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images

NOTE:

1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://`, which I'd like to avoid, given that classical docker v1 registries are not supported.
2. This is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure.

CAVEATS:

1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools.
2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache.
3. The `push -j ...` flag only works for OCI buildcache, not for others.
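The `[image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack` tags above follow a `<name>-<version>-<dag hash>.spack` pattern. A sketch of how such a tag could be formed; the real `spack.oci.image.default_tag` may differ in details:

```python
# Illustrative sketch of the "<name>-<version>-<dag hash>.spack" tag
# convention seen in the PR description; not Spack's actual implementation.


def default_tag(name: str, version: str, dag_hash: str) -> str:
    """Build a per-spec image tag from the spec's name, version, and DAG hash."""
    return f"{name}-{version}-{dag_hash}.spack"


tag = default_tag("python", "3.11.2", "65txfcpqbmpawclvtasuog4yzmxwaoia")
assert tag == "python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack"
```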
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. 
- [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. 
- [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. 
- [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. 
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs.

With this you can configure an OCI registry as a buildcache:

```console
$ spack mirror add my_registry oci://user/image # Dockerhub
$ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR
$ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials
```

which should result in this config:

```yaml
mirrors:
  my_registry:
    url: oci://ghcr.io/haampie/spack-test
    push:
      access_pair: [<username>, <password>]
```

It can be used like any other registry:

```console
$ spack buildcache push my_registry [specs...]
```

It will upload the Spack tarballs in parallel, as well as manifest + config files, so that the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image:

```console
$ spack buildcache push --base-image ubuntu:23.04 my_registry python
Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack

$ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack
```

which should really be a game changer for sharing binaries.

Further, all content-addressable blobs that are downloaded and verified are cached in Spack's download cache. This makes repeated `push` commands faster, as well as a `push` followed by a separate `update-index` command.

An end-to-end example of how to use this in GitHub Actions is here: **https://github.com/haampie/spack-oci-buildcache-example**

TODO:

- [x] Generate environment modifications in config so PATH is set up
- [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification)
- [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API)
- [x] Add logic to use object storage in an OCI registry in `spack install`
- [x] Make the user pick the base image for generated OCI images
- [x] Update buildcache install logic to deal with absolute paths in tarballs
- [x] Merge with `spack buildcache` command
- [x] Merge spack#37441 (included here)
- [x] Merge spack#39077 (included here)
- [x] spack#39187 + spack#39285
- [x] spack#39341
- [x] Not a blocker: spack#35737 fixes the correctness of the run env for the generated container images

NOTE:

1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "OCI-type mirror". `skopeo` uses `docker://`, which I'd like to avoid, given that classical Docker v1 registries are not supported.
2. This is currently `https`-only, given that basic auth is used to log in. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure.

CAVEATS:

1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools.
2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really want to push an individual package without its deps to, say, `pr-xyz`, while its deps reside in some `develop` buildcache.
3. The `push -j ...` flag only works for the OCI buildcache, not for others.
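To make the "buildcache tarball = image layer" idea concrete: an OCI image manifest is just a small JSON document that references a config blob and a list of layer blobs by sha256 digest, so a gzipped Spack tarball can be referenced directly as a layer. Below is a minimal illustrative sketch (not Spack's actual implementation; the blob contents are placeholders) that assembles such a manifest using the standard OCI media types:

```python
import hashlib
import json


def descriptor(media_type: str, blob: bytes) -> dict:
    """Build an OCI content descriptor: media type, sha256 digest, and size."""
    return {
        "mediaType": media_type,
        "digest": "sha256:" + hashlib.sha256(blob).hexdigest(),
        "size": len(blob),
    }


# Placeholder bytes standing in for a gzipped spack buildcache tarball
# and for an image config document.
layer_blob = b"...gzipped spack tarball bytes..."
config_blob = json.dumps({"architecture": "amd64", "os": "linux"}).encode()

manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": descriptor("application/vnd.oci.image.config.v1+json", config_blob),
    "layers": [descriptor("application/vnd.oci.image.layer.v1.tar+gzip", layer_blob)],
}

print(json.dumps(manifest, indent=2))
```

Because every blob is addressed by its digest, a registry deduplicates identical layers across images, which is what makes "one layer per spec, shared across images" cheap.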
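On the first caveat, signing the manifest is attractive because manifests reference the config and tarball blobs by content digest, so one signature over the manifest bytes transitively covers everything it points to. A small sketch of that property (illustration only: digests as in the OCI spec, with a plain hash standing in for a real signature):

```python
import hashlib
import json


def digest(blob: bytes) -> str:
    """Content digest in OCI notation."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def make_manifest(layer: bytes) -> bytes:
    # The manifest references the layer blob by its content digest,
    # as an OCI image manifest does.
    doc = {"layers": [{"digest": digest(layer), "size": len(layer)}]}
    return json.dumps(doc, sort_keys=True).encode()

original = make_manifest(b"spack tarball")
signed_digest = digest(original)  # a real signer would sign these manifest bytes

# Tampering with the layer changes its digest, hence the manifest bytes,
# so a signature over the original manifest would no longer verify.
tampered = make_manifest(b"tampered tarball")
assert digest(tampered) != signed_digest
```

This is why conventional image-signing tools operate on manifests rather than clearsigning individual files inside the image.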