Buildcache tarballs with rootfs structure#39341
Merged
alalazo merged 2 commits intospack:developfrom Oct 3, 2023
Merged
Conversation
f1fadf0 to
0e9d096
Compare
Two changes in this PR:
1. Register absolute paths in tarballs, which makes it easier
to use them as container image layers, or rootfs in general, outside
of Spack. Spack supports this already on develop.
2. Assemble the tarfile entries "by hand", which has a few advantages:
1. Avoid reading `/etc/passwd`, `/etc/groups`, `/etc/nsswitch.conf`
which `tar.add(dir)` does _for each file it adds_
2. Reduce the number of stat calls per file added by a factor two,
compared to `tar.add`, which should help with slow, shared filesystems
where these calls are expensive
4. Create normalized `TarInfo` entries from the start, instead of letting
Python create them and patching them after the fact
5. Don't recurse into subdirs before processing files, to avoid
keeping nested directories opened. (this changes the tar entry
order slightly, it's like sorting by `(not is_dir, name)`.
This change is forward incompatible, so old Spack 0.20 won't be able to
use tarballs generated with 0.21. (Unless spack#37441 is backported to 0.20)
0e9d096 to
37be51b
Compare
Member
Author
|
@kwryankrattiger maybe you wanna review? Seems like nobody is interested :( |
alalazo
reviewed
Oct 2, 2023
Member
alalazo
left a comment
There was a problem hiding this comment.
I went through the logic and LGTM. I have a couple of minor comments. I'll be testing this on some parallel filesystem next.
| Args: | ||
| tar: tarfile object to add files to | ||
| prefix: absolute install prefix of spec""" | ||
| assert os.path.isabs(prefix) and os.path.isdir(prefix) |
Member
There was a problem hiding this comment.
It would be good to have a message here, otherwise this will print:
==> Error:
if triggered.
Member
|
In terms of wall time, on an HPC cluster, develop shows: $ time spack buildcache create ~/tmp/develop libpng
==> Selected 17 specs to push to file:///g/g92/culpo1/tmp/develop
gpg: checking the trustdb
gpg: marginals needed: 3 completes needed: 1 trust model: pgp
gpg: depth: 0 valid: 1 signed: 0 trust: 0-, 0q, 0n, 0m, 0f, 1u
==> [ 1/17] Pushed [email protected]/rrjrgaw
==> [ 2/17] Pushed [email protected]/mjqcazt
==> [ 3/17] Pushed [email protected]/hsuxk52
==> [ 4/17] Pushed [email protected]/upwiaol
==> [ 5/17] Pushed [email protected]/itbpeol
==> [ 6/17] Pushed [email protected]/d542ymy
==> [ 7/17] Pushed ca-certificates-mozilla@2023-05-30/kuxdrfi
==> [ 8/17] Pushed [email protected]/kvputle
==> [ 9/17] Pushed [email protected]/oc4h6k6
==> [10/17] Pushed [email protected]/64ouqrb
==> [11/17] Pushed [email protected]/zh67wnb
==> [12/17] Pushed [email protected]/b5olrx4
==> [13/17] Pushed [email protected]/xcqqjcq
==> [14/17] Pushed [email protected]/jns5mse
==> [15/17] Pushed [email protected]/skzoqbs
==> [16/17] Pushed [email protected]/qyygqi4
==> [17/17] Pushed [email protected]/kduz2h6
real 1m35,619s
user 1m3,254s
sys 0m15,350swhile this PR shows: $ time spack buildcache create ~/tmp/pr libpng
==> Selected 17 specs to push to file:///g/g92/culpo1/tmp/pr
==> [ 1/17] Pushed [email protected]/rrjrgaw
==> [ 2/17] Pushed [email protected]/mjqcazt
==> [ 3/17] Pushed [email protected]/hsuxk52
==> [ 4/17] Pushed [email protected]/upwiaol
==> [ 5/17] Pushed [email protected]/itbpeol
==> [ 6/17] Pushed [email protected]/d542ymy
==> [ 7/17] Pushed ca-certificates-mozilla@2023-05-30/kuxdrfi
==> [ 8/17] Pushed [email protected]/kvputle
==> [ 9/17] Pushed [email protected]/oc4h6k6
==> [10/17] Pushed [email protected]/64ouqrb
==> [11/17] Pushed [email protected]/zh67wnb
==> [12/17] Pushed [email protected]/b5olrx4
==> [13/17] Pushed [email protected]/xcqqjcq
==> [14/17] Pushed [email protected]/jns5mse
==> [15/17] Pushed [email protected]/skzoqbs
==> [16/17] Pushed [email protected]/qyygqi4
==> [17/17] Pushed [email protected]/kduz2h6
real 1m21,050s
user 0m53,942s
sys 0m13,770sso a |
alalazo
approved these changes
Oct 3, 2023
Member
|
Force merged, since cray-sles is down |
AdhocMan
pushed a commit
to AdhocMan/spack
that referenced
this pull request
Oct 10, 2023
Two changes in this PR:
1. Register absolute paths in tarballs, which makes it easier
to use them as container image layers, or rootfs in general, outside
of Spack. Spack supports this already on develop.
2. Assemble the tarfile entries "by hand", which has a few advantages:
1. Avoid reading `/etc/passwd`, `/etc/groups`, `/etc/nsswitch.conf`
which `tar.add(dir)` does _for each file it adds_
2. Reduce the number of stat calls per file added by a factor two,
compared to `tar.add`, which should help with slow, shared filesystems
where these calls are expensive
4. Create normalized `TarInfo` entries from the start, instead of letting
Python create them and patching them after the fact
5. Don't recurse into subdirs before processing files, to avoid
keeping nested directories opened. (this changes the tar entry
order slightly, it's like sorting by `(not is_dir, name)`.
tgamblin
pushed a commit
that referenced
this pull request
Oct 27, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge #37441 (included here) - [x] Merge #39077 (included here) - [x] #39187 + #39285 - [x] #39341 - [x] Not a blocker: #35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
victoria-cherkas
pushed a commit
to victoria-cherkas/spack
that referenced
this pull request
Oct 30, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
RikkiButler20
pushed a commit
to RikkiButler20/spack
that referenced
this pull request
Nov 2, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
gabrielctn
pushed a commit
to gabrielctn/spack
that referenced
this pull request
Nov 24, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
mtaillefumier
pushed a commit
to mtaillefumier/spack
that referenced
this pull request
Dec 14, 2023
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
RikkiButler20
pushed a commit
to RikkiButler20/spack
that referenced
this pull request
Jan 11, 2024
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
RikkiButler20
pushed a commit
to RikkiButler20/spack
that referenced
this pull request
Jan 31, 2024
Credits to @ChristianKniep for advocating the idea of OCI image layers being identical to spack buildcache tarballs. With this you can configure an OCI registry as a buildcache: ```console $ spack mirror add my_registry oci://user/image # Dockerhub $ spack mirror add my_registry oci://ghcr.io/haampie/spack-test # GHCR $ spack mirror set --push --oci-username ... --oci-password ... my_registry # set login credentials ``` which should result in this config: ```yaml mirrors: my_registry: url: oci://ghcr.io/haampie/spack-test push: access_pair: [<username>, <password>] ``` It can be used like any other registry ``` spack buildcache push my_registry [specs...] ``` It will upload the Spack tarballs in parallel, as well as manifest + config files s.t. the binaries are compatible with `docker pull` or `skopeo copy`. In fact, a base image can be added to get a _runnable_ image: ```console $ spack buildcache push --base-image ubuntu:23.04 my_registry python Pushed ... as [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack $ docker run --rm -it [image]:python-3.11.2-65txfcpqbmpawclvtasuog4yzmxwaoia.spack ``` which should really be a game changer for sharing binaries. Further, all content-addressable blobs that are downloaded and verified will be cached in Spack's download cache. This should make repeated `push` commands faster, as well as `push` followed by a separate `update-index` command. An end to end example of how to use this in Github Actions is here: **https://github.com/haampie/spack-oci-buildcache-example** TODO: - [x] Generate environment modifications in config so PATH is set up - [x] Enrich config with Spack's `spec` json (this is allowed in the OCI specification) - [x] When ^ is done, add logic to create an index in say `<image>:index` by fetching all config files (using OCI distribution discovery API) - [x] Add logic to use object storage in an OCI registry in `spack install`. - [x] Make the user pick the base image for generated OCI images. - [x] Update buildcache install logic to deal with absolute paths in tarballs - [x] Merge with `spack buildcache` command - [x] Merge spack#37441 (included here) - [x] Merge spack#39077 (included here) - [x] spack#39187 + spack#39285 - [x] spack#39341 - [x] Not a blocker: spack#35737 fixes correctness run env for the generated container images NOTE: 1. `oci://` is unfortunately taken, so it's being abused in this PR to mean "oci type mirror". `skopeo` uses `docker://` which I'd like to avoid, given that classical docker v1 registries are not supported. 2. this is currently `https`-only, given that basic auth is used to login. I _could_ be convinced to allow http, but I'd prefer not to, given that for a `spack buildcache push` command multiple domains can be involved (auth server, source of base image, destination registry). Right now, no urllib http handler is added, so redirects to https and auth servers with http urls will simply result in a hard failure. CAVEATS: 1. Signing is not implemented in this PR. `gpg --clearsign` is not the nicest solution, since (a) the spec.json is merged into the image config, which must be valid json, and (b) it would be better to sign the manifest (referencing both config/spec file and tarball) using more conventional image signing tools 2. `spack.binary_distribution.push` is not yet implemented for the OCI buildcache, only `spack buildcache push` is. This is because I'd like to always push images + deps to the registry, so that it's `docker pull`-able, whereas in `spack ci` we really wanna push an individual package without its deps to say `pr-xyz`, while its deps reside in some `develop` buildcache. 3. The `push -j ...` flag only works for OCI buildcache, not for others
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two changes in this PR:
to use them as container image layers, or rootfs in general, outside
of Spack where no relocation can be done. Spack supports installing
tarballs of this type already on develop. Tar file size would be
a bit larger, but those repeated prefixes compress well. In practice
I'm getting 0.4% size difference, so negligible.
/etc/passwd,/etc/groups,/etc/nsswitch.confwhich
tar.add(dir)does for each file it addscompared to
tar.add, which should help with slow, shared filesystemswhere these calls are expensive
TarInfoentries from the start, instead of lettingPython create them and patching them after the fact
keeping nested directories opened. This changes the tar entry
order slightly, it's like sorting by
(is_dir, name)instead ofnamePart 1 is necessary for #38358
This change is forward incompatible, so old Spack 0.20 won't be able to
use tarballs generated with 0.21. (Unless #37441 is backported to 0.20)
* the leading
/is stripped, following GNU tar defaults, so technically thepaths are relative to the root
/and not absolute.Creating a tarball from a [email protected] install
Measured using
on an NVMe disk, so probably multiply the seconds column by a factor 100-1000 to translate to HPC shared filesystem performance ;)
Before this PR
After this
This includes (common) startup of Python itself... even then 2.7x fewer stat calls, and
readdominating the syscalls as it should.writeis of course small because of compression.