OCI buildcache#38358

Merged
tgamblin merged 3 commits into spack:develop from haampie:feature/oci-buildcache
Oct 27, 2023

Conversation

@haampie
Member

@haampie haampie commented Jun 13, 2023

Credits to @ChristianKniep for advocating the idea of OCI image layers
being identical to spack buildcache tarballs.

With this you can configure an OCI registry as a buildcache:

```console
$ spack mirror add my_registry oci://user/image # Dockerhub
$ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR
$ spack mirror set --push --oci-username ... --oci-password ... my_registry  # set login credentials
```

which should result in this config:

```yaml
mirrors:
  my_registry:
    url: oci://ghcr.io/haampie/spack-test
    push:
      access_pair: [<username>, <password>]
```

It can be used like any other registry

```
spack buildcache push my_registry [specs...]
```

It will upload the Spack tarballs in parallel, as well as manifest + config
files such that the binaries are compatible with `docker pull` or `skopeo copy`.
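To make the manifest + config pieces concrete, here is a sketch of an OCI image manifest in which the spec.json plays the role of the image config and the buildcache tarball is a single layer, following the OCI image spec. The blob contents, sizes, and the exact layer media type are illustrative; they are not necessarily what Spack emits.

```python
import hashlib
import json

def sha256_digest(data: bytes) -> str:
    """Return an OCI-style digest string for a blob."""
    return "sha256:" + hashlib.sha256(data).hexdigest()

# Stand-ins for the real artifacts: the spec.json and the package tarball.
spec_json = json.dumps({"spec": {"name": "zlib"}}).encode()
tarball = b"compressed package contents"

# The spec.json plays the role of the image config; the buildcache
# tarball is pushed as a single layer, both referenced by content hash.
manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "size": len(spec_json),
        "digest": sha256_digest(spec_json),
    },
    "layers": [
        {
            "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
            "size": len(tarball),
            "digest": sha256_digest(tarball),
        }
    ],
}
```

Because every blob is referenced by digest, a `docker pull` client can verify each piece independently, which is exactly what makes the layer/tarball equivalence work.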

In fact, a base image can be added to get a runnable image:

```console
$ spack buildcache push --base-image ubuntu:23.04 my_registry python
Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack

$ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack
```

which should really be a game changer for sharing binaries.

Further, all content-addressable blobs that are downloaded and verified
will be cached in Spack's download cache. This should make repeated
push commands faster, as well as push followed by a separate
update-index command.

An end to end example of how to use this in Github Actions is here:

https://github.com/haampie/spack-oci-buildcache-example

TODO:

  - [x] Generate environment modifications in config so PATH is set up
  - [x] Enrich config with Spack's spec json (this is allowed in the OCI specification)
  - [x] When ^ is done, add logic to create an index in say <image>:index by fetching all config files (using OCI distribution discovery API)
  - [x] Add logic to use object storage in an OCI registry in spack install.
  - [x] Make the user pick the base image for generated OCI images.
  - [x] Update buildcache install logic to deal with absolute paths in tarballs
  - [x] Merge with spack buildcache command
  - [x] Merge spack#37441 (included here)
  - [x] Merge spack#39077 (included here)
  - [x] spack#39187 + spack#39285
  - [x] spack#39341
  - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images

NOTE:

  1. oci:// is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". skopeo uses docker:// which I'd like to avoid, given that classical docker v1 registries are not supported.
  2. this is currently https-only, given that basic auth is used to login. I could be convinced to allow http, but I'd prefer not to, given that for a spack buildcache push command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure.

CAVEATS:

  1. Signing is not implemented in this PR. gpg --clearsign is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools
  2. spack.binary_distribution.push is not yet implemented for the OCI buildcache, only spack buildcache push is. This is because I'd like to always push images + deps to the registry, so that it's docker pull-able, whereas in spack ci we really wanna push an individual package without its deps to say pr-xyz, while its deps reside in some develop buildcache.
  3. The push -j ... flag only works for OCI buildcache, not for others

@ChristianKniep
Contributor

ChristianKniep commented Jun 13, 2023

Love it... As discussed in Slack. IMHO the first iteration should be dead stupid and simple, e.g. (need to think about this more):

  1. each package creates an OCI layer with SPACK_ROOT=/
  2. a dummy manifest and config which do not result in a runnable container

As a second step each environment has

  1. A manifest that assembles the layers from a multi-stage Dockerfile
FROM pkg1:<hash> AS pkg-1
FROM pkg2:<hash> AS pkg-2
FROM ubuntu:22.04
COPY --from=pkg-1 / /opt/spack/linux-os-arch/pkg1-<hash>
COPY --from=pkg-2 / /opt/spack/linux-os-arch/pkg2-<hash>
  2. A config that sets the PATH
  3. maybe an additional layer that creates a spack view

@spackbot-app spackbot-app bot added the tests General test capability(ies) label Jun 15, 2023
@haampie haampie force-pushed the feature/oci-buildcache branch from df1806b to 7265c81 Compare June 29, 2023 10:30
@haampie haampie force-pushed the feature/oci-buildcache branch from 7265c81 to 7fbe172 Compare June 29, 2023 10:32
@spackbot-app spackbot-app bot added the stage label Jun 29, 2023
@haampie haampie force-pushed the feature/oci-buildcache branch from 63a23a5 to 7ce47e5 Compare July 1, 2023 10:56
@alecbcs
Member

alecbcs commented Jul 3, 2023

@haampie this is awesome! Is there a way we could integrate this into the existing mirror/buildcache commands without adding another Spack command just for OCI caches?

@haampie
Member Author

haampie commented Jul 3, 2023

Exactly what I want to do! Wanna discuss the API and play around with it? I've also created an example for Github actions here: https://github.com/haampie/spack-oci-buildcache-example

@alecbcs
Member

alecbcs commented Jul 3, 2023

Awesome! Would be happy to chat and play around with this! I've wanted to create a https://nixery.dev like system for Spack for a while and this looks like a great start to that effort.

@haampie
Member Author

haampie commented Jul 3, 2023

Yeah, it's much like that. It would be really interesting to generate manifests dynamically, which I guess is what they do.

@haampie
Member Author

haampie commented Jul 5, 2023

Regarding signing, if we consider sigstore/cosign, here's a high-level overview of what I think it does (just reading the docs):

  • We create an image tag image:zlib-1.2.3-spackhash
  • Creating a tag is the same as uploading a manifest file
  • This manifest refers to spec.json (as "config") and the tarball (as "layer") by a secure (sha256) content hash
  • The manifest's own content hash is computed; let's call it H := sha256_digest(manifest image:zlib-1.2.3-spackhash)
  • cosign creates a new tag image:sha256-{H}.sig, and stores signatures in annotations

So many things are similar to our current spec.json.sig stuff, except that:

  1. With cosign we sign the manifest that references both spec.json and the tarball by content-hash. Whereas in the current buildcache we (clear)sign the spec.json directly, which references the tarball content-hash. So, with this, there's one level of indirection.
  2. This means we do a couple extra requests:
    1. fetch the image:zlib-1.2.3-spackhash manifest
    2. compute the content hash, and fetch the corresponding image:sha256-<manifest hash>.sig manifest
    3. verify the signature, and when it's OK:
    4. fetch the spec.json "config", verify shasum*
    5. fetch the tarball "layer", verify shasum
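The lookup in steps 1–2 boils down to deriving a tag name from the manifest's digest. A minimal sketch of that convention (assuming the manifest bytes are hashed exactly as uploaded; the `sha256-<digest>.sig` format is cosign's documented tag scheme):

```python
import hashlib

def cosign_signature_tag(manifest_bytes: bytes) -> str:
    """Tag under which cosign stores the signature manifest:
    the sha256 digest of the signed manifest, ':' replaced by '-',
    with a .sig suffix."""
    digest = hashlib.sha256(manifest_bytes).hexdigest()
    return f"sha256-{digest}.sig"

# Stand-in for the fetched image:zlib-1.2.3-spackhash manifest bytes.
manifest_bytes = b'{"schemaVersion": 2}'
sig_tag = cosign_signature_tag(manifest_bytes)
```

This also makes the point below explicit: the signature is *located* purely via sha256, so the whole chain is only as strong as that hash.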

Also note that the signature may use a fancy method, but ultimately it's as strong as sha256, since we have to locate the signature by that hash (I mean, if someone obtained push access and could create a sha256 collision, they could replace the signature). That suggests to me that a stronger signature algorithm for the signature bits is not particularly helpful; ultimately we trust in sha256...

* @scottwittenburg do we use the spec.json file at all on spack install, other than extracting the tarball shasum from it? It's not used to guard against spack dag hash collisions? (e.g. remote /abc... corresponds to zlib, but you meant to install zstd with the same hash)


I also wanna point out that their README says:

The naming convention and read-modify-write update patterns we use to store things in a registry are a bit, well, "hacky". I think they're the best (only) real option available today, but if the registry API changes we can improve these.

This is entirely true, we also have these issues:

  1. reindex: the reduction of all spec.json to index.json is the same read-modify-write issue; it has races.
  2. Given that the OCI api only supports listing tags, each package is a tag, and yeah, you have to rely on naming conventions like <image>:<name>-<version>-<hash> triplets for individual packages, and <image>:index for Spack's buildcache index, and now on top of it cosign's convention for storing signatures in <image>:sha256-<manifest hash>.sig
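Those naming conventions amount to plain string assembly. A sketch (illustrative only: Spack's actual `default_tag` may differ in detail, and the example tags in this PR carry a `.spack` suffix):

```python
def package_tag(name: str, version: str, dag_hash: str) -> str:
    """<name>-<version>-<hash> triplet, as used for individual packages."""
    return f"{name}-{version}-{dag_hash}"

def parse_package_tag(tag: str):
    """Invert package_tag, splitting from the right, since package
    names themselves may contain '-' (e.g. py-numpy)."""
    rest, _, dag_hash = tag.rpartition("-")
    name, _, version = rest.rpartition("-")
    return name, version, dag_hash

tag = package_tag("python", "3.11.2", "65txfcpq")
```

The fragility is visible here: nothing but convention distinguishes a package tag from `index` or from cosign's `sha256-<digest>.sig` tags, so every consumer has to agree on the parsing rules.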

We might not need to use / bootstrap cosign, but instead follow their conventions, and use OpenSSL with elliptic curves internally:

# Create a file
$ echo "This is a file" > hello

# Create private and public keys
$ openssl ecparam -genkey -name secp384r1 -out private.pem
$ openssl ec -in private.pem -pubout -out public.pem

# Sign it
$ openssl dgst -sign private.pem hello > hello.sig

# Verify
$ openssl dgst -verify public.pem -signature hello.sig hello
Verified OK

# Modify, then verify should fail
$ echo "This data is tampered with!" > hello
$ openssl dgst -verify public.pem -signature hello.sig hello
Verification failure

Then container runtimes that somehow understand cosign will be able to verify too?

@haampie
Member Author

haampie commented Jul 5, 2023

Not immediately impressed with sigstore for ~~two~~ ~~three~~ four reasons:

  1. On first use they want you to agree to certain terms and conditions:

    The sigstore service, hosted by sigstore a Series of LF Projects, LLC, is provided pursuant to the Hosted Project Tools Terms of Use, available at https://lfprojects.org/policies/hosted-project-tools-terms-of-use/.
    Note that if your submission includes personal data associated with this signed artifact, it will be part of an immutable record.
    This may include the email address associated with the account with which you authenticate your contractual Agreement.
    This information will be used for signing this artifact and will be stored in public transparency logs and cannot be removed later, and is subject to the Immutable Record notice at https://lfprojects.org/policies/hosted-project-tools-immutable-records/.
    
  2. I'm happy to use standards (which sigstore doesn't claim to be, but anyways...), but this data format is just very awkward? See below:

    The generated manifest looks like this:

    {
      "schemaVersion": 2,
      "mediaType": "application/vnd.oci.image.manifest.v1+json",
      "config": {
        "mediaType": "application/vnd.oci.image.config.v1+json",
        "size": 248,
        "digest": "sha256:41b007e7265e9c9d9949067f47e924b41353f5b93fa4cf908ba5114813b77e6b"
      },
      "layers": [
        {
          "mediaType": "application/vnd.dev.cosign.simplesigning.v1+json",
          "size": 260,
          "digest": "sha256:f0e4c41b3ab83e240bfd1bebc9c51f819425335035ad58a72f82c503f0fca2c8",
          "annotations": {
            "dev.cosignproject.cosign/signature": "MEQCIApTfLGmItW+jvEZ1Zrj+Y7mKcp6OX4qpZ88eOezp47rAiB6l0hRcM6XT9PVFyxDU9fm3J2LFSBnyGXtVlFZuOlzQg==",
            "dev.sigstore.cosign/bundle": "{\"SignedEntryTimestamp\":\"MEUCICSyloVDj1MrZz6Kzqo3pTwJNHma4VC1XZ27IXt89Kd7AiEAuLO2YxcVwJ0UZRwFO+VSoUemH4ByGRXpeVaOkb0C2cI=\",\"Payload\":{\"body\":\"eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiaGFzaGVkcmVrb3JkIiwic3BlYyI6eyJkYXRhIjp7Imhhc2giOnsiYWxnb3JpdGhtIjoic2hhMjU2IiwidmFsdWUiOiJmMGU0YzQxYjNhYjgzZTI0MGJmZDFiZWJjOWM1MWY4MTk0MjUzMzUwMzVhZDU4YTcyZjgyYzUwM2YwZmNhMmM4In19LCJzaWduYXR1cmUiOnsiY29udGVudCI6Ik1FUUNJQXBUZkxHbUl0VytqdkVaMVpyaitZN21LY3A2T1g0cXBaODhlT2V6cDQ3ckFpQjZsMGhSY002WFQ5UFZGeXhEVTlmbTNKMkxGU0JueUdYdFZsRlp1T2x6UWc9PSIsInB1YmxpY0tleSI6eyJjb250ZW50IjoiTFMwdExTMUNSVWRKVGlCUVZVSk1TVU1nUzBWWkxTMHRMUzBLVFVacmQwVjNXVWhMYjFwSmVtb3dRMEZSV1VsTGIxcEplbW93UkVGUlkwUlJaMEZGU1RNeVJGZExlVzk1ZGxsRk0xZFhNaTlXVVM5U2VscEpkbVpoZHdwak9XWk5XbTlST0d3MmFHRTBVV3AyYm1Kak4wMXFNVGh5VTFBcmIzQTJOaXRKV2xSRk1sUmliMVoxYjJ0clowcFRLM1U0VEdWWFNITm5QVDBLTFMwdExTMUZUa1FnVUZWQ1RFbERJRXRGV1MwdExTMHRDZz09In19fX0=\",\"integratedTime\":1688561322,\"logIndex\":26254264,\"logID\":\"c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d\"}}"
          }
        }
      ]
    }
    

    So, the dev.sigstore.cosign/bundle annotation itself is a JSON string, which is

    {
      "SignedEntryTimestamp": "MEUCICSyloVDj1MrZz6Kzqo3pTwJNHma4VC1XZ27IXt89Kd7AiEAuLO2YxcVwJ0UZRwFO+VSoUemH4ByGRXpeVaOkb0C2cI=",
      "Payload": {
        "body": "eyJhcGlWZXJzaW9uIjoiMC4wLjEiLCJraW5kIjoiaGFzaGVkcmVrb3JkIiwic3BlYyI6eyJkYXRhIjp7Imhhc2giOnsiYWxnb3JpdGhtIjoic2hhMjU2IiwidmFsdWUiOiJmMGU0YzQxYjNhYjgzZTI0MGJmZDFiZWJjOWM1MWY4MTk0MjUzMzUwMzVhZDU4YTcyZjgyYzUwM2YwZmNhMmM4In19LCJzaWduYXR1cmUiOnsiY29udGVudCI6Ik1FUUNJQXBUZkxHbUl0VytqdkVaMVpyaitZN21LY3A2T1g0cXBaODhlT2V6cDQ3ckFpQjZsMGhSY002WFQ5UFZGeXhEVTlmbTNKMkxGU0JueUdYdFZsRlp1T2x6UWc9PSIsInB1YmxpY0tleSI6eyJjb250ZW50IjoiTFMwdExTMUNSVWRKVGlCUVZVSk1TVU1nUzBWWkxTMHRMUzBLVFVacmQwVjNXVWhMYjFwSmVtb3dRMEZSV1VsTGIxcEplbW93UkVGUlkwUlJaMEZGU1RNeVJGZExlVzk1ZGxsRk0xZFhNaTlXVVM5U2VscEpkbVpoZHdwak9XWk5XbTlST0d3MmFHRTBVV3AyYm1Kak4wMXFNVGh5VTFBcmIzQTJOaXRKV2xSRk1sUmliMVoxYjJ0clowcFRLM1U0VEdWWFNITm5QVDBLTFMwdExTMUZUa1FnVUZWQ1RFbERJRXRGV1MwdExTMHRDZz09In19fX0=",
        "integratedTime": 1688561322,
        "logIndex": 26254264,
        "logID": "c0d23d6ad406973f9559f3ba2d1ca01f84147d8ffc5b8445c224f98b9591801d"
      }
    }

    where the payload body is a base64 encoded string, which decodes to (you guessed it...) JSON:

    {
      "apiVersion": "0.0.1",
      "kind": "hashedrekord",
      "spec": {
        "data": {
          "hash": {
            "algorithm": "sha256",
            "value": "f0e4c41b3ab83e240bfd1bebc9c51f819425335035ad58a72f82c503f0fca2c8"
          }
        },
        "signature": {
          "content": "MEQCIApTfLGmItW+jvEZ1Zrj+Y7mKcp6OX4qpZ88eOezp47rAiB6l0hRcM6XT9PVFyxDU9fm3J2LFSBnyGXtVlFZuOlzQg==",
          "publicKey": {
            "content": "LS0tLS1CRUdJTiBQVUJMSUMgS0VZLS0tLS0KTUZrd0V3WUhLb1pJemowQ0FRWUlLb1pJemowREFRY0RRZ0FFSTMyRFdLeW95dllFM1dXMi9WUS9SelpJdmZhdwpjOWZNWm9ROGw2aGE0UWp2bmJjN01qMThyU1Arb3A2NitJWlRFMlRib1Z1b2trZ0pTK3U4TGVXSHNnPT0KLS0tLS1FTkQgUFVCTElDIEtFWS0tLS0tCg=="
          }
        }
      }
    }

    which in turn specifies a base64 encoded public key

    -----BEGIN PUBLIC KEY-----
    MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEI32DWKyoyvYE3WW2/VQ/RzZIvfaw
    c9fMZoQ8l6ha4Qjvnbc7Mj18rSP+op66+IZTE2TboVuokkgJS+u8LeWHsg==
    -----END PUBLIC KEY-----
    

    that is: a base64 encoded public key, inside a base64 encoded json string, inside a json encoded string, inside a json value.
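A round trip through the same wrapping makes the nesting concrete. The field names below follow the example above; the key contents and payload are synthetic dummies:

```python
import base64
import json

# Innermost layer: a PEM public key (contents elided).
public_key_pem = "-----BEGIN PUBLIC KEY-----\n...\n-----END PUBLIC KEY-----\n"

# hashedrekord body: JSON that base64-encodes the key inside it.
body = {
    "apiVersion": "0.0.1",
    "kind": "hashedrekord",
    "spec": {
        "signature": {
            "publicKey": {
                "content": base64.b64encode(public_key_pem.encode()).decode()
            }
        }
    },
}

# The bundle annotation is itself a JSON string whose Payload.body is
# the base64-encoded JSON above.
bundle = json.dumps(
    {"Payload": {"body": base64.b64encode(json.dumps(body).encode()).decode()}}
)

# Unwrap: JSON string -> base64 body -> JSON -> base64 -> PEM.
payload_body = json.loads(bundle)["Payload"]["body"]
decoded_body = json.loads(base64.b64decode(payload_body))
recovered_pem = base64.b64decode(
    decoded_body["spec"]["signature"]["publicKey"]["content"]
).decode()
```

Four decode steps to reach one PEM key, which is the awkwardness being complained about.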

    and then there's a separate json file (the "layer") that contains more signature details and references back to the original manifest that was signed (but I think that can be ignored):

    {
      "critical": {
        "identity": {
          "docker-reference": "ghcr.io/haampie/spack-oci-buildcache-example"
        },
        "image": {
          "docker-manifest-digest": "sha256:1cb102d01cc3443eeb9feb40b18ba64e1e6f2fffe96a9d6abfc5c311fc6f2765"
        },
        "type": "cosign container image signature"
      },
      "optional": null
    }
  3. The binary is like 90MB?

  4. Their generated private keys are in a custom (?) format. Of course it involves base64 encoded json 😏 which contains the details about the password encryption of the private key

    {
        "kdf": {
          "name": "scrypt",
          "params": {
            "N": 32768,
            "r": 8,
            "p": 1
          },
          "salt": "..."
        },
        "cipher": {
          "name": "nacl/secretbox",
          "nonce": "..."
        },
        "ciphertext": "..."
      }

@scottwittenburg
Contributor

do we use the spec.json file at all on spack install, other than extracting the tarball shasum from it? It's not used to guard against spack dag hash collisions? (e.g. remote /abc... corresponds to zlib, but you meant to install zstd with the same hash)

I don't think the spec.json is used during install to guard against dag hash collisions. IIRC, spack assumes the probability of that is so low that it ignores the possibility. But you or @tgamblin might know better.

RE how we do use the spec.json: It contains (as you suggested) the tarball checksum, as well as some other metadata that binary_distribution.py cares about: the buildcache layout version number, and the buildinfo dictionary which contains the original install path and a couple other bits useful for relocation.
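For illustration, the fields described above can be sketched like this. This is an approximation only; the field names are illustrative, and `binary_distribution.py` defines the real schema:

```python
# Approximate shape of a buildcache spec.json -- illustrative field
# names, not Spack's exact schema.
spec_json = {
    "spec": {"name": "zlib", "version": "1.2.13", "hash": "65txfcpq"},
    # Checksum used to verify the downloaded tarball.
    "binary_cache_checksum": {"hash_algorithm": "sha256", "hash": "ab" * 32},
    # Lets Spack detect too-old / too-new buildcache layouts.
    "buildcache_layout_version": 1,
    # Original install path info, used for relocation.
    "buildinfo": {"relative_prefix": "linux-ubuntu22.04-x86_64/gcc-12/zlib-1.2.13-65txfcpq"},
}

def tarball_checksum(spec_dict: dict) -> str:
    """Extract the checksum against which the tarball is verified."""
    return spec_dict["binary_cache_checksum"]["hash"]
```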

@haampie haampie force-pushed the feature/oci-buildcache branch 2 times, most recently from c283aa6 to 07fa21c Compare July 19, 2023 16:20
@haampie haampie force-pushed the feature/oci-buildcache branch from 4f58240 to 28bbd3e Compare July 24, 2023 13:16
@haampie haampie force-pushed the feature/oci-buildcache branch from e2b43b0 to 61d4d57 Compare July 25, 2023 11:19
@haampie haampie marked this pull request as ready for review July 27, 2023 16:34
@spackbot-app spackbot-app bot added the documentation Improvements or additions to documentation label Jul 30, 2023
@haampie haampie force-pushed the feature/oci-buildcache branch from f7e7ac3 to 1617dfb Compare August 1, 2023 11:06
@haampie
Member Author

haampie commented Aug 31, 2023

Out of curiosity, did you try zfs?

@tgamblin tgamblin self-assigned this Sep 6, 2023
@tgamblin tgamblin self-requested a review September 6, 2023 09:47
Member

@tgamblin tgamblin left a comment


This is really awesome!

This isn't a full review, but I think it hits on some of the major points, and I can finish going through the rest later.

Signing

I think leaving signing for later is fine. I thought about it and given the different trust model for registries (most users trust the registry or the uploader for an "official" part of the registry), this is very usable as-is. There is a lot to think about for signing as you outlined above.

Secrets

See requests below about secrets.

Factoring

There are clearly some parts where the abstraction is leaky -- stage and the buildcache code seem to really need to know what type of mirror they're talking to (and stage just skips OCI ones?)

Layout

I keep going back and forth on this:

https://github.com/haampie/spack-oci-buildcache-example/pkgs/container/spack-oci-buildcache-example

On the one hand, I kind of don't care about what GitHub shows, as long as I can push and pull from the buildcache. On the other hand, it'd be nice if the UI were more intelligible to a passer-by. So I have some sort of stream-of-consciousness questions here:

  1. Does it make sense the way every buildcache is an image, and the packages are versions of that image?
  2. I could imagine people wanting to use this not just as a buildcache but as a way to version a spack-built container image over time, but this overloads the tags for package versions. If you wanted to version a stack over time, how would you do that? Another tag? I guess it's ok for image versions to coexist with the package versions, but they're not very discoverable.
  3. I could see a version of this where every Spack package ends up being an actual GitHub packages package in the web UI (i.e., here: https://github.com/haampie?tab=packages). The versioning becomes more straightforward that way... but pulling an entire buildcache would be really complicated, right? Every "container" there corresponds to a GitHub repo right? So we really need to fit the entire buildcache within one OCI container image in the registry? That makes me think this PR is doing the right thing.
  4. It seems like the UI for any file shows a docker pull command: https://github.com/haampie/spack-oci-buildcache-example/pkgs/container/spack-oci-buildcache-example/116359870?tag=index.spack. Is that just "what you get" when using OCI? I was wondering if maybe ORAS content-type tags would help. I looked at this: https://www.kenmuse.com/blog/universal-packages-on-github-with-oras/ but it doesn't really show you what that looks like in GH packages -- just what it looks like from the ORAS CLI. Curious whether @ChristianKniep has insights on this.

Anyway, all of those are just my personal observations, but one last question that matters more for review:

  1. How do we manage the OCI buildcache layout over time? Suppose I decide to change the way Spack OCI buildcaches are handled. How do current versions of Spack know that an OCI registry is too new, and how do new versions of Spack know that it is too old? I think we at least need a feature like this (something we've omitted in a few other places in Spack at our peril) to allow this feature to adapt over time.
  2. How is the buildcache format here, for OCI registries, tied to the regular filesystem buildcache format? Is that documented somewhere? If one changes, what happens to the other? How does that evolve over time? Probably there needs to be some dev documentation on how we've architected mirror backends (which is really what this is -- a new mirror backend that doesn't quite fit the traditional filesystem mold)

Ok those are my thoughts -- otherwise, this looks really good and I think it's close to complete. See below for further more detailed questions.


.. code-block:: console

$ spack mirror add --oci-username username --oci-password password my_registry oci://example.com/my_image
Member


I am not sure we should be storing cleartext passwords in config YAML. We've so far avoided having to do this... is there another way to do it for these registries?

Passwords are going to end up in weird places (like spack.yaml) and get pushed to public repos, and this is harder to control/understand with our layered config system.

I am not sure what a good way to do this is, but I looked around -- the two ideas that stood out to me were:

  1. Make it so that values of secret fields are grabbed from env vars instead of being encoded in the YAML, i.e. make the value of the field an env var name.
  2. Something like the !secret annotation here: https://www.home-assistant.io/docs/configuration/secrets/, where the value is !secret name_of_secret and name_of_secret is a key in some separate secrets.yaml that you can easily exclude from a git repo.
  3. Some variant of (2) where the user doesn't have to make something !secret -- we could do it in the schema.

I haven't really dug into this -- do you know of better approaches?

mirror_urls = [
url_util.join(mirror.fetch_url, rel_path)
for mirror in spack.mirror.MirrorCollection(source=True).values()
if not mirror.fetch_url.startswith("oci://")
Member


This seems nasty -- shouldn't I be able to use an OCI mirror here? Why does stage care?

Member Author


oci:// isn't tree based like http / ftp / s3 / etc, so url_util.join(fetch_url, rel_path) makes no sense.



def checksum(hashlib_algo, filename, **kwargs):
def checksum_fp(hashlib_algo, fp, *, block_size=2**20):
Member


suggest renaming to checksum_stream (or something more intuitive than fp?), also needs a docstring

Member Author


sprinkled some typehints around too

Comment on lines +1599 to +1686
# TODO: refactor this to some "nice" place.
if parsed.scheme == "oci":
    ref = spack.oci.image.ImageReference.from_string(mirror[len("oci://") :]).with_tag(
        spack.oci.image.default_tag(spec)
    )

    # Fetch the manifest
    try:
        response = spack.oci.opener.urlopen(
            urllib.request.Request(
                url=ref.manifest_url(),
                headers={"Accept": "application/vnd.oci.image.manifest.v1+json"},
            )
        )
    except Exception:
        continue

    # Download the config = spec.json and the relevant tarball
    try:
        manifest = json.loads(response.read())
        spec_digest = spack.oci.image.Digest.from_string(manifest["config"]["digest"])
        tarball_digest = spack.oci.image.Digest.from_string(
            manifest["layers"][-1]["digest"]
        )
    except Exception:
        continue

    with spack.oci.oci.make_stage(
        ref.blob_url(spec_digest), spec_digest, keep=True
    ) as local_specfile_stage:
        try:
            local_specfile_stage.fetch()
            local_specfile_stage.check()
        except Exception:
            continue
        local_specfile_stage.cache_local()

    with spack.oci.oci.make_stage(
        ref.blob_url(tarball_digest), tarball_digest, keep=True
    ) as tarball_stage:
        try:
            tarball_stage.fetch()
            tarball_stage.check()
        except Exception:
            continue
        tarball_stage.cache_local()

    return {
        "tarball_stage": tarball_stage,
        "specfile_stage": local_specfile_stage,
        "signature_verified": False,
    }

else:
Member


I don't really like the two code paths here (I think you don't either)... is there a better way to factor this code? At the very least I'd separate the methods, or maybe stick the OCI part in the spack.oci module, but I'm not sure. It seems like there should be one single place for each mirror implementation... but it's kind of split now between build caches and source mirroring code.

Member Author

@haampie haampie Oct 6, 2023


I'm thinking it would be easier to have them share code if they're more structurally similar, which requires an overhaul of the buildcache file structure.

I ran into a race condition in Gitlab CI, where two pipelines concurrently built the same spec, but produced a different tarball (so different shasum); the tarballs were uploaded from A first, then B, whereas spec.json was uploaded from B first, then A, so that spec.json listed the wrong shasum.

That can be avoided by actually making standard buildcache tarballs content-addressable, which mimics OCI structure. Then in the same situation as above we'd end up with two uploaded tarballs (one "dangling"), and a single (overwritten) spec.json, always listing a valid tarball shasum.

The assumption being that two builds of the same dag hash always produce an equivalent (but not necessarily bit-wise identical) tarball.

So, if buildcaches work like that, their (spec.json, tarball) pairs are pretty close to (manifest, layer) pairs from the OCI spec.
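A minimal sketch of that ordering on a plain filesystem (the layout and helper name here are hypothetical, not Spack's actual code): the blob goes in under its own hash first, and the small metadata file is overwritten last, so a losing concurrent writer can at worst leave a dangling blob, never a spec.json pointing at a missing or mismatched tarball.

```python
import hashlib
import json
import pathlib
import tempfile

def push_content_addressed(root: pathlib.Path, dag_hash: str, tarball: bytes) -> pathlib.Path:
    """Hypothetical content-addressed layout: blob first, metadata last."""
    digest = hashlib.sha256(tarball).hexdigest()
    blob = root / "blobs" / "sha256" / digest
    blob.parent.mkdir(parents=True, exist_ok=True)
    blob.write_bytes(tarball)  # step 1: store the tarball under its own hash
    spec = root / "specs" / f"{dag_hash}.spec.json"
    spec.parent.mkdir(parents=True, exist_ok=True)
    # Step 2: overwrite the metadata; whichever writer lands last, the
    # checksum it records always matches a blob that already exists.
    spec.write_text(json.dumps({"checksum": f"sha256:{digest}"}))
    return blob

root = pathlib.Path(tempfile.mkdtemp())
# Two concurrent builds of the same dag hash producing different bytes:
a = push_content_addressed(root, "65txfcpq", b"tarball from runner A")
b = push_content_addressed(root, "65txfcpq", b"tarball from runner B")
recorded = json.loads((root / "specs" / "65txfcpq.spec.json").read_text())
```

Whichever order the writes interleave in, the recorded checksum always names a tarball that is physically present, which is exactly the OCI (manifest, layer) invariant.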

Member


Decided in meeting that we should get OCI buildcache mostly as-is initially, then migrate the existing buildcache format to be content-addressed. So this can be a nice refactor later on, but doesn't prevent getting this in now.

@haampie
Member Author

haampie commented Sep 6, 2023

Does it make sense the way every buildcache is an image, and the packages are versions of that image?

I'm following the OCI Distribution Spec v1.0, and that only defines one endpoint for listing things: https://github.com/opencontainers/distribution-spec/blob/v1.0/spec.md#content-discovery, so tags are the only way to reindex. The ORAS spec is not finalized, so I'm not using it. My impression was that ORAS needs support in the registry w.r.t. garbage collection of dangling blobs, potentially an OCI spec v1 registry could delete blobs still required under ORAS conventions (but I have to double check this).

I could imagine people wanting to use this not just as a buildcache but as a way to version a spack-built container image over time, but this overloads the tags for package versions. If you wanted to version a stack over time, how would you do that? Another tag? I guess it's ok for image versions to coexist with the package versions, but they're not very discoverable.

Not entirely following :p isn't this an issue with current buildcaches too? If you concretize --fresh the same environment at different times, but push to the same buildcache, you get tons of different flavors of the same package, and there's no way to know which packages are new/current.

It seems like the UI for any file shows a docker pull command:

Haven't really checked this, but the docker pull command does work ;p replacing it with spack (buildcache) install /xyz would also be nice, but would likely error in the majority of the cases right now, cause it requires the same os & compilers on the user's system

How do we manage the OCI buildcache layout over time?

I could add an annotation in the manifest file, but other things such as the tag name structure are hard to change, so we just have to get it right.

How is the buildcache format here, for OCI registries, tied to the regular filesystem buildcache format?

It's very different; I can document that. That's also the main reason the code is split: I cannot reuse stuff with computed urls (for spec.json / tarball) that assume a filesystem-like structure, since this only uses the OCI API.

@tgamblin tgamblin added this to the v0.21.0 milestone Oct 17, 2023
@haampie haampie force-pushed the feature/oci-buildcache branch from 891ec99 to 3922d4e Compare October 23, 2023 07:33
@haampie haampie force-pushed the feature/oci-buildcache branch 2 times, most recently from f5b7546 to a41fd76 Compare October 23, 2023 11:04
@haampie haampie force-pushed the feature/oci-buildcache branch from a41fd76 to 7b6033d Compare October 23, 2023 11:20
@haampie haampie requested a review from tgamblin October 23, 2023 12:39
@tgamblin tgamblin merged commit 195f965 into spack:develop Oct 27, 2023
@tgamblin
Member

@haampie: LGTM! Can you please also add some documentation?

victoria-cherkas pushed a commit to victoria-cherkas/spack that referenced this pull request Oct 30, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers
being identical to spack buildcache tarballs.

With this you can configure an OCI registry as a buildcache:

```console 
$ spack mirror add my_registry oci://user/image # Dockerhub

$ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR

$ spack mirror set --push --oci-username ... --oci-password ... my_registry  # set login credentials
```

which should result in this config:

```yaml
mirrors:
  my_registry:
    url: oci://ghcr.io/haampie/spack-test
    push:
      access_pair: [<username>, <password>]
```

It can be used like any other registry

```
spack buildcache push my_registry [specs...]
```

It will upload the Spack tarballs in parallel, as well as manifest + config
files s.t. the binaries are compatible with `docker pull` or `skopeo copy`.

In fact, a base image can be added to get a _runnable_ image:

```console
$ spack buildcache push --base-image ubuntu:23.04 my_registry python
Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack

$ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack
```

which should really be a game changer for sharing binaries.

Further, all content-addressable blobs that are downloaded and verified
will be cached in Spack's download cache. This should make repeated
`push` commands faster, as well as `push` followed by a separate
`update-index` command.
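This caching works because OCI blobs are content-addressed: a blob's name is the sha256 of its bytes, so a local copy can be re-verified without touching the network. A rough sketch of that idea (the helper name and flat cache layout are illustrative, not Spack's actual download-cache implementation):

```python
import hashlib
from pathlib import Path

def cached_fetch(digest: str, fetch, cache_dir: Path) -> bytes:
    """Return the blob for `digest`, calling `fetch()` only on a cache miss."""
    algo, _, expected = digest.partition(":")
    if algo != "sha256":
        raise ValueError(f"unsupported digest algorithm: {algo}")
    path = cache_dir / expected
    if path.exists():
        data = path.read_bytes()
        if hashlib.sha256(data).hexdigest() == expected:
            return data  # verified cache hit, no network round trip
    data = fetch()  # cache miss: download the blob
    if hashlib.sha256(data).hexdigest() != expected:
        raise ValueError(f"digest mismatch for {digest}")
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(data)
    return data
```

A repeated `push` or a follow-up `update-index` then only pays for blobs it has not seen before.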

An end-to-end example of how to use this in GitHub Actions is here:

**https://github.com/haampie/spack-oci-buildcache-example**


TODO:

- [x] Generate environment modifications in config so PATH is set up
- [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification)
- [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API)
- [x] Add logic to use object storage in an OCI registry in `spack install`.
- [x] Make the user pick the base image for generated OCI images.
- [x] Update buildcache install logic to deal with absolute paths in tarballs
- [x] Merge with `spack buildcache` command
- [x] Merge spack#37441 (included here)
- [x] Merge spack#39077 (included here)
- [x] spack#39187 + spack#39285
- [x] spack#39341
- [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images
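For the index item above: since each image config carries the spec JSON, an index can be assembled by fetching every config and collecting the embedded specs. A toy sketch of that collection step (the `"spack"` key and the index shape are illustrative guesses, not Spack's actual schema):

```python
import json

def build_index(configs: list) -> dict:
    """Collect spec documents embedded in image configs into one index.

    Assumes each config stores its spec under a "spack" key; configs
    without one (e.g. a base image's config) are skipped.
    """
    specs = [c["spack"] for c in configs if "spack" in c]
    return {"database": {"installs": specs}}

# Hypothetical configs fetched via the OCI distribution API.
configs = [
    {"architecture": "amd64", "spack": {"name": "python", "version": "3.11.2"}},
    {"architecture": "amd64", "spack": {"name": "zlib", "version": "1.2.13"}},
    {"architecture": "amd64"},  # no embedded spec: ignored
]
print(json.dumps(build_index(configs)))
```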

NOTE:

1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported.
2. This is currently `https`-only, given that basic auth is used to log in. I _could_ be convinced to allow HTTP, but I'd prefer not to, given that a single `spack buildcache push` command can involve multiple domains (auth server, source of base image, destination registry). Right now, no urllib HTTP handler is added, so redirects and auth servers with plain-HTTP URLs will simply result in a hard failure.

CAVEATS:

1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid JSON, and (b) it would be better to sign the manifest (which references both the config/spec file and the tarball) using more conventional image signing tools.
2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that the result is `docker pull`-able, whereas in `spack ci` we really want to push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache.
3. The `push -j ...` flag only works for the OCI buildcache, not for other mirror types.
@haampie haampie deleted the feature/oci-buildcache branch October 30, 2023 12:20
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Nov 2, 2023
gabrielctn pushed a commit to gabrielctn/spack that referenced this pull request Nov 24, 2023
mtaillefumier pushed a commit to mtaillefumier/spack that referenced this pull request Dec 14, 2023
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Jan 11, 2024
RikkiButler20 pushed a commit to RikkiButler20/spack that referenced this pull request Jan 31, 2024

Labels

binary-packages, commands, core, documentation, fetching, new-command, shell-support, stage, tests, utilities, versions


6 participants