Skip to content

client: add Import() and Export() for importing/exporting image in OCI format#1013

Merged
crosbymichael merged 1 commit intocontainerd:masterfrom
AkihiroSuda:oci-export
Jul 28, 2017
Merged

client: add Import() and Export() for importing/exporting image in OCI format#1013
crosbymichael merged 1 commit intocontainerd:masterfrom
AkihiroSuda:oci-export

Conversation

@AkihiroSuda
Copy link
Copy Markdown
Member

@AkihiroSuda AkihiroSuda commented Jun 15, 2017

Part of #831

Export as a tar (Note: "-" can be used for stdout):

$ ctr images export /tmp/oci-busybox.tar docker.io/library/busybox:latest
$ tar tvf /tmp/oci-busybox.tar
drwxr-xr-x 0/0               0 1970-01-01 00:00 blobs/
drwxr-xr-x 0/0               0 1970-01-01 00:00 blobs/sha256/
-r--r--r-- 0/0              30 1970-01-01 00:00 oci-layout
-r--r--r-- 0/0             527 1970-01-01 00:00 blobs/sha256/c79345819a6882c31b41bc771d9a94fc52872fa651b36771fbe0c8461d7ee558
-r--r--r-- 0/0            1507 1970-01-01 00:00 blobs/sha256/c75bebcdd211f41b3a460c7bf82970ed6c75acaab9cd4c9a4e125b03ca113798
-r--r--r-- 0/0          699311 1970-01-01 00:00 blobs/sha256/1cae461a1479c5a24dd38bd5f377ce65f531399a7db8c3ece891ac2197173f1d
-rw-r--r-- 0/0             257 1970-01-01 00:00 index.json

Import a tar (Note: "-" can be used for stdin):

$ ctr images import foo/new:latest /tmp/oci-busybox.tar
unpacking sha256:c79345819a6882c31b41bc771d9a94fc52872fa651b36771fbe0c8461d7ee558...done

Note:

Signed-off-by: Akihiro Suda [email protected]

@codecov-io
Copy link
Copy Markdown

codecov-io commented Jun 15, 2017

Codecov Report

Merging #1013 into master will decrease coverage by 1.05%.
The diff coverage is 0%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1013      +/-   ##
==========================================
- Coverage   24.69%   23.64%   -1.06%     
==========================================
  Files          38       40       +2     
  Lines        4317     4509     +192     
==========================================
  Hits         1066     1066              
- Misses       3049     3241     +192     
  Partials      202      202
Impacted Files Coverage Δ
client.go 2.55% <0%> (-0.49%) ⬇️
import.go 0% <0%> (ø)
export.go 0% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1a054c6...b518f11. Read the comment docs.

Comment thread client.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just don't add this until its implemented

Comment thread client.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, don't add options until you are going to implement them

Comment thread client.go Outdated
Copy link
Copy Markdown
Member

@crosbymichael crosbymichael Jun 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of taking a path, would it be better to have export work over a reader?

Export() (io.ReadCloser, error)?

This way, clients have more flexibility for how they handle the exported content, they can push it over a stream or write it to disk at their path. Right now this client is not very flexible when you take a path like this

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are going to use TAR (correct me if misunderstanding) for Export(),
Import() would use TAR as well: Import(ctx context.Context, ref string, tarr io.ReadCloser, opts ...ImportOpt)(Image, error).

However, the implementation for Import() is likely to be inefficient, because TAR cannot be read randomly.

So my suggestion is to use directory by default across both Export() and Import().
We can still add WithOCITarExportation(tarr io.Reader) ExportOpt for tar support.

WDYT?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, directory representation might be useful for Export() as well, because we can export blobs in parallel. (Unlike generic filesystem IO operation, this would be CPU-intensive for digest computation and good for multi-threading, but I didn't do benchmark 😅 )

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the usecase you are solving for because I think you are underestimating how inefficient the on disk format is.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Image distribution using NFS/S3/SSHFS/IPFS/whatever.

tar cf /mnt/nfs-or-whatever/some-image | dist import foo/some-image:some-ref-name --ref-name some-ref-name - would cause I/O for unneeded blobs, unless tar is aware of OCI format and the ref name being used.

Also, dist cannot read index.json without scanning whole the "tape".

@AkihiroSuda AkihiroSuda changed the title client: add Export() for exporting image in OCI format client: add Import() and Export() for importing/exporting image in OCI format Jun 26, 2017
@AkihiroSuda
Copy link
Copy Markdown
Member Author

Implemented both Import/Export for both directory representation and TAR representation.

@crosbymichael
Could you please try?
Usage are added to the description text of this PR.

@crosbymichael
Copy link
Copy Markdown
Member

Will review, thanks!

@crosbymichael
Copy link
Copy Markdown
Member

In your changes you didn't address the feedback. You are still outputting files to disk instead of returning/consuming an io.Reader. Its usually bad design to have packages output files to disk that consumers are using. Sometimes you don't have a disk or are working via a remote API. Returning streams from the client is the best design to allow users to consume the results however they want. You can even just extract it to disk if you want once you have reader.

@AkihiroSuda
Copy link
Copy Markdown
Member Author

@crosbymichael
I added dist export --representation oci+tar according to your feedback.
This oci+tar exportation works as you expect (though it receives an io.Writer rather than returning io.Reader).

I can remove extra dist export --representation oci+directory, but it might be asymmetric to dist import, which I think should still support oci+directory for efficiency #1013 (comment)

WDYT?

@crosbymichael
Copy link
Copy Markdown
Member

That is fine for the cli tools but I meant the method signatures for the *Client should be.

func (c *Client) Import(context.Context, string, io.Reader) error
func (c *Client) Export(context.Context, Descriptor) (io.ReadCloser, error)

This way the client is useful by anyone, local, over a socket, etc and the CLI can save the tar, extract to a directory, etc.

@crosbymichael
Copy link
Copy Markdown
Member

Its extremely hard to mock, test, use generally when packages write to files on disk internally.

@AkihiroSuda AkihiroSuda force-pushed the oci-export branch 4 times, most recently from 18a3168 to e5482bd Compare June 29, 2017 06:57
@AkihiroSuda
Copy link
Copy Markdown
Member Author

OK, changed API as follows:

// Import imports an image from a Tar stream using reader.
// OCI format is assumed by default.
//
// Note that unreferenced blobs are imported to the content store as well.
func (c *Client) Import(ctx context.Context, ref string, reader io.Reader, opts ...ImportOpt) (Image, error)
// ImportDirectory imports an image from a directory using opener.
// OCI format is assumed by default.
//
// Unlike Import(), only blobs referenced by the manifest are imported to the content store.
//
// opener is typically `func(relPath string)(io.ReadCloser, error){ return os.Open(filepath.Join(someDir, relPath)) }`,
// but it does not need to use real filesystem.
func (c *Client) ImportDirectory(ctx context.Context, ref string, opener func(string) (io.ReadCloser, error), opts ...ImportOpt) (Image, error)
// Export exports an image to a Tar stream.
// OCI format is used by default.
// It is up to caller to put "org.opencontainers.image.ref.name" annotation to desc.
func (c *Client) Export(ctx context.Context, desc ocispec.Descriptor, opts ...ExportOpt) (io.ReadCloser, error)
// No `ExportDirectory`

Comment thread oci/dir.go Outdated

// Directory is ImageDriver for directory representation of an OCI image.
// The directory must not exist before calling Init function of this structure.
func Directory(path string) ImageDriver {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just note that this directory ImageDriver is not used at all in containerd, but can be used for 3rd party projects.
(My https://github.com/AkihiroSuda/filegrain is going to vendor this)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we were removing this.

It would be good to put together a proposal for directory based layouts. This use case really doesn't make sense. The OCI layout is meant for an archive.

Comment thread oci/oci.go
//
// img directory must not exist before calling this function.
// BlobWriter writes an OCI blob and returns a digest when closed.
type BlobWriter interface {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks eerily similar to content.Writer. Let's avoid duplication.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content package is internal detail of containerd while oci package is specific to OCI format.
So I'm not sure we should deduplicate, but may I try that in follow-up PR?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

e.g. future content package may use non-filesystem blob store, or just use sharded directory (blobs/sha256/de/deadbeef)
Then it could not be deduplicated

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AkihiroSuda The content package is the premiere approach for doing io on OCI oriented content. Use the interface definition, don't define a new one that is slightly different.

Comment thread client.go Outdated
//
// opener is typically `func(relPath string)(io.ReadCloser, error){ return os.Open(filepath.Join(someDir, relPath)) }`,
// but it does not need to use real filesystem.
func (c *Client) ImportDirectory(ctx context.Context, ref string, opener func(string) (io.ReadCloser, error), opts ...ImportOpt) (Image, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure that this directory layout format is optimal, nor am I sure that handling directories and tarfiles with the same code path is the best for consistent implementations.

Is there any way we could split Import/Export and ImportDirectory into two PRs?

@AkihiroSuda
Copy link
Copy Markdown
Member Author

@stevvooe @crosbymichael

Eliminated ImportDirectory.
Now the PR solely works with tar stream.

I'll open follow-up PR for reviving ImportDirectory.
import-directory.patch.txt

@AkihiroSuda
Copy link
Copy Markdown
Member Author

@stevvooe

Removed oci/dir.go and deduplicated BlobWriter interface. PTAL.

However, I do still think OCI is not limited to be used solely for archival purpose, let me discuss that in follow-up PR:
https://github.com/opencontainers/image-spec/blob/master/image-layout.md

This layout MAY be used in a variety of different transport mechanisms: archive formats (e.g. tar, zip), shared filesystem environments (e.g. nfs), or networked file fetching (e.g. http, ftp, rsync).

@stevvooe
Copy link
Copy Markdown
Member

stevvooe commented Jul 5, 2017

@AkihiroSuda Thanks!

That document is making a leap to say that the layout can be used for "shared filesystem environments (e.g. nfs), or networked file fetching (e.g. http, ftp, rsync)." The layout makes no provisions for concurrency, updates or security in such environments.

Comment thread oci/tar.go Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filepath.ToSlash?

Comment thread oci/oci.go Outdated
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unexpected size?

@AkihiroSuda
Copy link
Copy Markdown
Member Author

@aaronlehmann
Thank you, didn't know that there is filepath.ToSlash.

Updated PR

@stevvooe
Copy link
Copy Markdown
Member

@AkihiroSuda Looks like this needs another rebase.

@AkihiroSuda
Copy link
Copy Markdown
Member Author

Rebased. Now CLI commands are ctr import and ctr export.
PTAL.

@stevvooe
Copy link
Copy Markdown
Member

@AkihiroSuda Should be image import/export?

@AkihiroSuda
Copy link
Copy Markdown
Member Author

done @stevvooe

Comment thread client.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't format be resolved by the incoming file?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could be easily resolved if we can add support for directories.
However, for tar, we can't resolve the format until reading whole the "tape" to find oci-layout.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think directories solves that problem. Would it make sense to pre-unpack the format then resolve the version?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it work with a foreign format other than OCI v1?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the most part, any archival format should be safe for pre-unpack, to some degree. However, what happens when there is a new oci layout format?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Future OCI would need to be specified manually with WithOCIv2ImportFormat(), if it is incompatible to v1

Comment thread client.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker save/load support?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you know Docker will very soonly support save/load in OCI format. moby/moby#33355

Do you want the classic save/load support in containerd?
(Can be another PR?)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want the classic save/load support in containerd?

Yes. OCI save/load support may not be available for a few versions.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but let me do that in a follow-up PR 😅

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

File an issue?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@stevvooe
Copy link
Copy Markdown
Member

LGTM

PTAL @dmcgowan

@AkihiroSuda
Copy link
Copy Markdown
Member Author

rebased, @dmcgowan @crosbymichael @mlaventure PTAL?

…I format

Export as a tar (Note: "-" can be used for stdout):

    $ ctr images export /tmp/oci-busybox.tar docker.io/library/busybox:latest

Import a tar (Note: "-" can be used for stdin):

    $ ctr images import foo/new:latest /tmp/oci-busybox.tar

Note: media types are not converted at the moment: e.g.
  application/vnd.docker.image.rootfs.diff.tar.gzip
  -> application/vnd.oci.image.layer.v1.tar+gzip

Signed-off-by: Akihiro Suda <[email protected]>
@AkihiroSuda
Copy link
Copy Markdown
Member Author

CI green

Copy link
Copy Markdown
Member

@crosbymichael crosbymichael left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@crosbymichael crosbymichael merged commit 183a6ca into containerd:master Jul 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants