-
Notifications
You must be signed in to change notification settings - Fork 808
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support writing image manifests and configurations in both oci and docker formats #118
Conversation
62735e1
to
2c6e529
Compare
cmd/buildah/bud.go
Outdated
@@ -130,6 +134,17 @@ func budCmd(c *cli.Context) error { | |||
if c.IsSet("file") || c.IsSet("f") { | |||
dockerfiles = c.StringSlice("file") | |||
} | |||
format := "oci" | |||
if c.IsSet("format") { | |||
format = c.String("format") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit, could call strings.ToLower() here, and make the code below look a little neeter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changing it.
image.go
Outdated
// suitable for specifying as a value of the PreferredManifestType | ||
// member of a CommitOptions structure. It is also the default. | ||
OCIv1ImageManifest = v1.MediaTypeImageManifest | ||
defaultManifestType = OCIv1ImageManifest |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure defaultManifestType adds any value here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not exported, so probably not. Dropping it.
image.go
Outdated
manifestType = mt | ||
break | ||
} | ||
if mt == docker.V2S2MediaTypeManifest { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not combine these together into one statement?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No particularly good reason. Changing it.
b083a71
to
9b8b372
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, it might be worth considering whether to use the conversion mechanisms available in containers/image.
AFAICS both callers of Builder.initConfig
get the data from a types.Image
; so they could call types.Image.UpdatedImage
and use the converted data instead of getting the original data and converting in a separate code instance.
And containerImageSource
is only used as an input to copy.Image
, which already can do format conversion—so it should be much simpler to add an option to copy.Image
’s copy.Options
to forcibly convert to a specified format than to keep maintainiung the existing containerImageSource
which does everything twice (and supporting more formats in the future would mean doing everything even more times?).
OTOH I do appreciate that all of this buildah-specific conversion code has been written by now… so treat it just as a note that it would be possible, by no means blocking.
(Yeah, using types.Image.UpdatedImage
means gathering types.ManifestUpdateInformation
in order to do the conversion the way the original implementation does. buildah could just stub out some of that, perhaps, or some of that information could be made optional… OTOH the most annoying component, LayerDiffIDs
, really is necessary.)
image.go
Outdated
lastLayerDiffID := destHasher.Digest().String() | ||
image.RootFS.DiffIDs = append(image.RootFS.DiffIDs, lastLayerDiffID) | ||
oimage.RootFS.DiffIDs = append(oimage.RootFS.DiffIDs, lastLayerDiffID) | ||
dimage.RootFS.DiffIDs = append(dimage.RootFS.DiffIDs, destHasher.Digest()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DiffIDs
values should be digests of uncompressed data, AFAICS destHasher
digests the possibly compressed stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This has already been fixed to use srcHasher
.)
config.go
Outdated
for i := range manifest.FSLayers { | ||
d := manifest.FSLayers[i].BlobSum | ||
// Prepend this layer, because the original manifest's list has the earliest layer last. | ||
rootFS.DiffIDs = append([]digest.Digest{d}, rootFS.DiffIDs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Note that this is not correct in general: BlobSum is a digest of the possibly compressed version, DifffIDs of the uncompressed version. Is there a reason to expect the v2s1 image to only use uncompressed layers?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not an intentional limitation. I don't see the digest of the uncompressed data in the manifest, though. If we have to decompress or regenerate the layer data to get the right value, I'd be reluctant to do that. We compute the right values when we're exporting the image, so if nobody's going to be reading the blatantly-wrong-if-compression-was-used value, then I'd lean hard in the direction of adding a comment about it here and leaving it otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We compute the right values when we're exporting the image, so if nobody's going to be reading the blatantly-wrong-if-compression-was-used value, then I'd lean hard in the direction of adding a comment about it here and leaving it otherwise.
Works for me. (docker
’s pull path does enforce that the uncompressed data match the digests in DiffIDs
, so if it ever broke, it would be very visible.)
Or perhaps create an empty array, or an array of empty strings, to make extra certain that the incorrect values are never used by anything? Not necessary by any means.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong feeling about it either way. If you do, I'm happy to change it to match.
config.go
Outdated
for i := range manifest.FSLayers { | ||
d := manifest.FSLayers[i].BlobSum | ||
// Prepend this layer, because the original manifest's list has the earliest layer last. | ||
rootFS.DiffIDs = append([]digest.Digest{d}, rootFS.DiffIDs...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original upstream code does not add V1Compatibility[].ThrowAway
layers to v2s2 DiffIDs
and manifest; I’m not sure that matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, so that explains the discrepancy in https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-1.md. Fixing to decode v1 compat data correctly.
Import more manifest and schema variables and types, and rename some of them to incorporate version numbers. Signed-off-by: Nalin Dahyabhai <[email protected]>
Updated parsing of I hadn't taken that close a look at format conversion in UpdatedImage(), so yeah, there's probably some duplication there. I think we need that level of control over the manifest and configuration contents, though. |
LGTM, to the very limited extent I understand the repo. |
LGTM as well |
config.go
Outdated
} | ||
history := []docker.V2S2History{} | ||
for i := range manifest.FSLayers { | ||
// The DiffID list is intended to contain the sums of _uncompressed_ blobs, so if the blob is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm struggling a bit with this comment and probably a lack of docker internals knowledge on my part. Could a blob be compressed and if so, is there a check that we need here before we prepend it? I'm not seeing a mention of compression here: https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-1.md, but don't know if that's the right spot to look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Blobs can be either data (such as a JSON-encoded config blob) or layers, and layers can be uncompressed or compressed. Compression is usually done gzip or bzip2. I expect bzip2 is rare (I've literally never seen it used) because the Go compress/bzip2 implementation can decompress but not compress. Most layers in the wild are compressed with gzip compression.
For Docker v2 schema 2, they're tagged with one of the MIME types listed in https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-2.md. In OCI v1, they're tagged with one of the MIME types listed at https://github.com/opencontainers/image-spec/blob/master/media-types.md. For Docker v2 schema 1, I think we don't have that information, but we can look at the initial segment of the blob and make a guess. Problem is that we don't want to take the time to decompress a possibly-very-large blob to calculate the correct value, so I'm starting to lean toward @mtrmac's suggestion that it's better to leave the DiffID list empty to avoid creating errors if we ever try to use this data.
@@ -108,6 +120,22 @@ type Config struct { | |||
Shell strslice.StrSlice `json:",omitempty"` // Shell for shell-form of RUN, CMD, ENTRYPOINT | |||
} | |||
|
|||
// github.com/docker/distribution/manifest/schema1/config_builder.go |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this location may have changed to github.com/docker/distribution/blob/master/manifest/schema1/config_builder.go. At least I'm finding the file there (added blob/master) and not at this location on the web.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The github web site's URL isn't going to perfectly mirror the tree layout that you get when cloning the repository, and I've been going for the latter here.
@@ -53,17 +68,31 @@ func (i *containerImageRef) NewImage(sc *types.SystemContext) (types.Image, erro | |||
} | |||
|
|||
func (i *containerImageRef) NewImageSource(sc *types.SystemContext, manifestTypes []string) (src types.ImageSource, err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When maintaining two years hence, searching for 'i' will be a bear through out this function. Any chance in your next pass through you could change this to at least 'img' to make it more markable? I'd really love 'containerImage', but willing to settle for at least a few more characters.
LGTM, a couple of possible touch ups for follow ons. |
Since V2S1 image manifests don't include a config blob, when we initialize a builder using an image with a V2S1 manifest, try to dig configuration information out of the manifest's layer history, and build a V2S2 image configuration using the results. Leave the diffID list unpopulated, because we compute the right values when we push the image, and to prevent us from mistakes if we ever try to use them before that in the future. Signed-off-by: Nalin Dahyabhai <[email protected]>
Build configuration and manifest data for both output formats that we support, and select one or the other based on the list of manifest types that would be accepted as a source image. Signed-off-by: Nalin Dahyabhai <[email protected]>
Make it possible to select the output format for Commit() and the imagebuildah package, and wire that through to a --format option in the CLI's "commit" and "bud" commands. Signed-off-by: Nalin Dahyabhai <[email protected]>
Add a test helper for examining image metadata and checking their types, and add tests that use it to verify that after writing either Docker v2 or OCI v1 images, that the manifest and configuration blobs that we stored for them successfully decode as the correct data types. Signed-off-by: Nalin Dahyabhai <[email protected]>
Test setting and importing of configuration settings by saving to both formats, and then checking that the values in both configuration fields has the right data when we use either image type as a source image. Signed-off-by: Nalin Dahyabhai <[email protected]>
Switched to leaving the diffID list empty when reading v2s1 images. |
Since V2S1 image manifests don't include a config blob, when we initialize a builder using an image with a V2S1 manifest, try to dig configuration information out of the manifest's layer history, and build a V2S2 image configuration using the results. Leave the diffID list unpopulated, because we compute the right values when we push the image, and to prevent us from mistakes if we ever try to use them before that in the future. Signed-off-by: Nalin Dahyabhai <[email protected]> Closes: #118 Approved by: rhatdan
Build configuration and manifest data for both output formats that we support, and select one or the other based on the list of manifest types that would be accepted as a source image. Signed-off-by: Nalin Dahyabhai <[email protected]> Closes: #118 Approved by: rhatdan
Make it possible to select the output format for Commit() and the imagebuildah package, and wire that through to a --format option in the CLI's "commit" and "bud" commands. Signed-off-by: Nalin Dahyabhai <[email protected]> Closes: #118 Approved by: rhatdan
Add a test helper for examining image metadata and checking their types, and add tests that use it to verify that after writing either Docker v2 or OCI v1 images, that the manifest and configuration blobs that we stored for them successfully decode as the correct data types. Signed-off-by: Nalin Dahyabhai <[email protected]> Closes: #118 Approved by: rhatdan
Test setting and importing of configuration settings by saving to both formats, and then checking that the values in both configuration fields has the right data when we use either image type as a source image. Signed-off-by: Nalin Dahyabhai <[email protected]> Closes: #118 Approved by: rhatdan
☀️ Test successful - status-redhatci |
Signed-off-by: Daniel J Walsh <[email protected]> Closes: #118 Approved by: mheon
This patch set adds
--format
options to thecommit
andbuild-using-dockerfile
commands (also-f
forcommit
) to allow selecting either docker v2s2 or oci v1 as output formats for the manifest and configuration blobs that we write to images. It also adds initial support for importing settings from docker v2s1 images, though that's not as well tested since we don't produce that format.This also helps us properly test the
config
command and the logic that imports settings from images in the two formats.