Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support writing image manifests and configurations in both oci and docker formats #118

Closed
wants to merge 6 commits into from

Conversation

nalind
Copy link
Member

@nalind nalind commented May 22, 2017

This patch set adds --format options to the commit and build-using-dockerfile commands (also -f for commit) to allow selecting either docker v2s2 or oci v1 as output formats for the manifest and configuration blobs that we write to images. It also adds initial support for importing settings from docker v2s1 images, though that's not as well tested since we don't produce that format.

This also helps us properly test the config command and the logic that imports settings from images in the two formats.

@nalind
Copy link
Member Author

nalind commented May 22, 2017

@nalind nalind force-pushed the formats branch 3 times, most recently from 62735e1 to 2c6e529 Compare May 22, 2017 20:32
@@ -130,6 +134,17 @@ func budCmd(c *cli.Context) error {
if c.IsSet("file") || c.IsSet("f") {
dockerfiles = c.StringSlice("file")
}
format := "oci"
if c.IsSet("format") {
format = c.String("format")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, could call strings.ToLower() here, and make the code below look a little neeter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing it.

image.go Outdated
// suitable for specifying as a value of the PreferredManifestType
// member of a CommitOptions structure. It is also the default.
OCIv1ImageManifest = v1.MediaTypeImageManifest
defaultManifestType = OCIv1ImageManifest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure defaultManifestType adds any value here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not exported, so probably not. Dropping it.

image.go Outdated
manifestType = mt
break
}
if mt == docker.V2S2MediaTypeManifest {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not combine these together into one statement?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No particularly good reason. Changing it.

@rhatdan
Copy link
Member

rhatdan commented May 23, 2017

@mtrmac @runcom PTAL

@nalind nalind force-pushed the formats branch 5 times, most recently from b083a71 to 9b8b372 Compare May 23, 2017 19:26
Copy link
Contributor

@mtrmac mtrmac left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, it might be worth considering whether to use the conversion mechanisms available in containers/image.

AFAICS both callers of Builder.initConfig get the data from a types.Image; so they could call types.Image.UpdatedImage and use the converted data instead of getting the original data and converting in a separate code instance.

And containerImageSource is only used as an input to copy.Image, which already can do format conversion—so it should be much simpler to add an option to copy.Image’s copy.Options to forcibly convert to a specified format than to keep maintainiung the existing containerImageSource which does everything twice (and supporting more formats in the future would mean doing everything even more times?).

OTOH I do appreciate that all of this buildah-specific conversion code has been written by now… so treat it just as a note that it would be possible, by no means blocking.

(Yeah, using types.Image.UpdatedImage means gathering types.ManifestUpdateInformation in order to do the conversion the way the original implementation does. buildah could just stub out some of that, perhaps, or some of that information could be made optional… OTOH the most annoying component, LayerDiffIDs, really is necessary.)

image.go Outdated
lastLayerDiffID := destHasher.Digest().String()
image.RootFS.DiffIDs = append(image.RootFS.DiffIDs, lastLayerDiffID)
oimage.RootFS.DiffIDs = append(oimage.RootFS.DiffIDs, lastLayerDiffID)
dimage.RootFS.DiffIDs = append(dimage.RootFS.DiffIDs, destHasher.Digest())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The DiffIDs values should be digests of uncompressed data, AFAICS destHasher digests the possibly compressed stream.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This has already been fixed to use srcHasher.)

config.go Outdated
for i := range manifest.FSLayers {
d := manifest.FSLayers[i].BlobSum
// Prepend this layer, because the original manifest's list has the earliest layer last.
rootFS.DiffIDs = append([]digest.Digest{d}, rootFS.DiffIDs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Note that this is not correct in general: BlobSum is a digest of the possibly compressed version, DifffIDs of the uncompressed version. Is there a reason to expect the v2s1 image to only use uncompressed layers?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not an intentional limitation. I don't see the digest of the uncompressed data in the manifest, though. If we have to decompress or regenerate the layer data to get the right value, I'd be reluctant to do that. We compute the right values when we're exporting the image, so if nobody's going to be reading the blatantly-wrong-if-compression-was-used value, then I'd lean hard in the direction of adding a comment about it here and leaving it otherwise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We compute the right values when we're exporting the image, so if nobody's going to be reading the blatantly-wrong-if-compression-was-used value, then I'd lean hard in the direction of adding a comment about it here and leaving it otherwise.

Works for me. (docker’s pull path does enforce that the uncompressed data match the digests in DiffIDs, so if it ever broke, it would be very visible.)

Or perhaps create an empty array, or an array of empty strings, to make extra certain that the incorrect values are never used by anything? Not necessary by any means.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong feeling about it either way. If you do, I'm happy to change it to match.

config.go Outdated
for i := range manifest.FSLayers {
d := manifest.FSLayers[i].BlobSum
// Prepend this layer, because the original manifest's list has the earliest layer last.
rootFS.DiffIDs = append([]digest.Digest{d}, rootFS.DiffIDs...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original upstream code does not add V1Compatibility[].ThrowAway layers to v2s2 DiffIDs and manifest; I’m not sure that matters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import more manifest and schema variables and types, and rename some of
them to incorporate version numbers.

Signed-off-by: Nalin Dahyabhai <[email protected]>
@nalind
Copy link
Member Author

nalind commented May 23, 2017

Updated parsing of v1Compatibility fields in schema 1 manifests to use the right type and preserve ThrowAway information.

I hadn't taken that close a look at format conversion in UpdatedImage(), so yeah, there's probably some duplication there. I think we need that level of control over the manifest and configuration contents, though.
The containerImageSource implementation also recomputes the checksums of the compressed layers, which we need to do in order to push the layers anywhere.

@mtrmac
Copy link
Contributor

mtrmac commented May 23, 2017

LGTM, to the very limited extent I understand the repo.

@runcom
Copy link
Member

runcom commented May 24, 2017

LGTM as well

config.go Outdated
}
history := []docker.V2S2History{}
for i := range manifest.FSLayers {
// The DiffID list is intended to contain the sums of _uncompressed_ blobs, so if the blob is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling a bit with this comment and probably a lack of docker internals knowledge on my part. Could a blob be compressed and if so, is there a check that we need here before we prepend it? I'm not seeing a mention of compression here: https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-1.md, but don't know if that's the right spot to look.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blobs can be either data (such as a JSON-encoded config blob) or layers, and layers can be uncompressed or compressed. Compression is usually done gzip or bzip2. I expect bzip2 is rare (I've literally never seen it used) because the Go compress/bzip2 implementation can decompress but not compress. Most layers in the wild are compressed with gzip compression.

For Docker v2 schema 2, they're tagged with one of the MIME types listed in https://github.com/docker/distribution/blob/master/docs/spec/manifest-v2-2.md. In OCI v1, they're tagged with one of the MIME types listed at https://github.com/opencontainers/image-spec/blob/master/media-types.md. For Docker v2 schema 1, I think we don't have that information, but we can look at the initial segment of the blob and make a guess. Problem is that we don't want to take the time to decompress a possibly-very-large blob to calculate the correct value, so I'm starting to lean toward @mtrmac's suggestion that it's better to leave the DiffID list empty to avoid creating errors if we ever try to use this data.

@@ -108,6 +120,22 @@ type Config struct {
Shell strslice.StrSlice `json:",omitempty"` // Shell for shell-form of RUN, CMD, ENTRYPOINT
}

// github.com/docker/distribution/manifest/schema1/config_builder.go
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this location may have changed to github.com/docker/distribution/blob/master/manifest/schema1/config_builder.go. At least I'm finding the file there (added blob/master) and not at this location on the web.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The github web site's URL isn't going to perfectly mirror the tree layout that you get when cloning the repository, and I've been going for the latter here.

@@ -53,17 +68,31 @@ func (i *containerImageRef) NewImage(sc *types.SystemContext) (types.Image, erro
}

func (i *containerImageRef) NewImageSource(sc *types.SystemContext, manifestTypes []string) (src types.ImageSource, err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When maintaining two years hence, searching for 'i' will be a bear through out this function. Any chance in your next pass through you could change this to at least 'img' to make it more markable? I'd really love 'containerImage', but willing to settle for at least a few more characters.

@TomSweeneyRedHat
Copy link
Member

LGTM, a couple of possible touch ups for follow ons.

nalind added 5 commits May 24, 2017 11:41
Since V2S1 image manifests don't include a config blob, when we
initialize a builder using an image with a V2S1 manifest, try to dig
configuration information out of the manifest's layer history, and build
a V2S2 image configuration using the results.

Leave the diffID list unpopulated, because we compute the right values
when we push the image, and to prevent us from mistakes if we ever try
to use them before that in the future.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Build configuration and manifest data for both output formats that we
support, and select one or the other based on the list of manifest types
that would be accepted as a source image.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Make it possible to select the output format for Commit() and the
imagebuildah package, and wire that through to a --format option in the
CLI's "commit" and "bud" commands.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Add a test helper for examining image metadata and checking their types,
and add tests that use it to verify that after writing either Docker v2
or OCI v1 images, that the manifest and configuration blobs that we
stored for them successfully decode as the correct data types.

Signed-off-by: Nalin Dahyabhai <[email protected]>
Test setting and importing of configuration settings by saving to both
formats, and then checking that the values in both configuration fields
has the right data when we use either image type as a source image.

Signed-off-by: Nalin Dahyabhai <[email protected]>
@nalind
Copy link
Member Author

nalind commented May 24, 2017

Switched to leaving the diffID list empty when reading v2s1 images.

@rhatdan
Copy link
Member

rhatdan commented May 24, 2017

@rh-atomic-bot r+ a3f4a14

@rh-atomic-bot
Copy link
Collaborator

⌛ Testing commit a3f4a14 with merge 05a7c97...

rh-atomic-bot pushed a commit that referenced this pull request May 24, 2017
Since V2S1 image manifests don't include a config blob, when we
initialize a builder using an image with a V2S1 manifest, try to dig
configuration information out of the manifest's layer history, and build
a V2S2 image configuration using the results.

Leave the diffID list unpopulated, because we compute the right values
when we push the image, and to prevent us from mistakes if we ever try
to use them before that in the future.

Signed-off-by: Nalin Dahyabhai <[email protected]>

Closes: #118
Approved by: rhatdan
rh-atomic-bot pushed a commit that referenced this pull request May 24, 2017
Build configuration and manifest data for both output formats that we
support, and select one or the other based on the list of manifest types
that would be accepted as a source image.

Signed-off-by: Nalin Dahyabhai <[email protected]>

Closes: #118
Approved by: rhatdan
rh-atomic-bot pushed a commit that referenced this pull request May 24, 2017
Make it possible to select the output format for Commit() and the
imagebuildah package, and wire that through to a --format option in the
CLI's "commit" and "bud" commands.

Signed-off-by: Nalin Dahyabhai <[email protected]>

Closes: #118
Approved by: rhatdan
rh-atomic-bot pushed a commit that referenced this pull request May 24, 2017
Add a test helper for examining image metadata and checking their types,
and add tests that use it to verify that after writing either Docker v2
or OCI v1 images, that the manifest and configuration blobs that we
stored for them successfully decode as the correct data types.

Signed-off-by: Nalin Dahyabhai <[email protected]>

Closes: #118
Approved by: rhatdan
rh-atomic-bot pushed a commit that referenced this pull request May 24, 2017
Test setting and importing of configuration settings by saving to both
formats, and then checking that the values in both configuration fields
has the right data when we use either image type as a source image.

Signed-off-by: Nalin Dahyabhai <[email protected]>

Closes: #118
Approved by: rhatdan
@rh-atomic-bot
Copy link
Collaborator

☀️ Test successful - status-redhatci
Approved by: rhatdan
Pushing 05a7c97 to master...

@nalind nalind deleted the formats branch June 13, 2017 15:14
nalind pushed a commit that referenced this pull request Apr 2, 2018
Signed-off-by: Daniel J Walsh <[email protected]>

Closes: #118
Approved by: mheon
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 31, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants