Skip to content

docker cp to, from, and between containers#9871

Closed
jlhawn wants to merge 7 commits intomoby:masterfrom
jlhawn:docker_add
Closed

docker cp to, from, and between containers#9871
jlhawn wants to merge 7 commits intomoby:masterfrom
jlhawn:docker_add

Conversation

@jlhawn
Copy link
Contributor

@jlhawn jlhawn commented Jan 2, 2015

please check the first commit with documentation changes for details

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 2, 2015

related to #5846 #4272 and #8362

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/contaainer/container

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 2, 2015

current issues:

  • container paths must always be absolute
    • FIXED: relative paths are joined with the container's working directory
  • copying from volume root exposes vfs ID for dirname
    • FIXED: now correctly replaces with the mount point basename
  • check with issue from Fix matching for volume paths in cp #9789
    • FIXED: cherry-picked @cpuguy83's commit and updated appropriate cases.
  • if copying a directory, how do we determine if we want to copy the directory itself or just the contents of the directory?
    • FIXED: use . to signal directory contents, just like the cp utility.

@bfirsh
Copy link
Contributor

bfirsh commented Jan 2, 2015

Nice!

CLI docs also need updating, which might also be a good opportunity to improve the docker cp section.

I think the help text could explain that you can copy to/from/between containers, but it's quite hard to explain succinctly. Perhaps "Copy files/folders to and from containers, from SRCPATH to the DSTPATH".

@thaJeztah
Copy link
Member

I wonder why the API PUT needs a different endpoint (/containers/{name:.*}/copy-to) than GET (/containers/{name:.*}/copy-from).

Would it make sense to have a container's filesystem read/writable via a RESTful interface, i.e.;

GET/PUT/POST/DELETE /containers/{name:.*}/fs/path/to/file?

(Though that feels a bit like re-inventing existing thinks, such as swift or WebDAV)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Across

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 2, 2015

@bfirsh said:

CLI docs also need updating, which might also be a good opportunity to improve the docker cp section.

I think the help text could explain that you can copy to/from/between containers, but it's quite hard to explain succinctly. Perhaps "Copy files/folders to and from containers, from SRCPATH to the DSTPATH".

I'll add a commit which updates the docs and adds unit tests as soon as I get all of the behavior correct.
I've also updated the help string to be more like the one you suggested.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 2, 2015

@thaJeztah wrote:

I wonder why the API PUT needs a different endpoint (/containers/{name:.}/copy-to) than GET (/containers/{name:.}/copy-from).

Would it make sense to have a container's filesystem read/writable via a RESTful interface, i.e.;

GET/PUT/POST/DELETE /containers/{name:.*}/fs/path/to/file?

(Though that feels a bit like re-inventing existing thinks, such as swift or WebDAV)

There is currently a /containers/{name:.*}/copy endpoint that I don't want to change for backwards compatibility, however this endpoint currently uses the POST method so I could keep using the name copy and just use different methods (GET for copy-from and PUT for copy-to) but the copy-across method is still different though :(

I'm worried that putting the /path/to/file directly in the path would result in ambiguities, apparently the container name can be .* so it would grab the remainder of the path with a greedy match. Are these container names allowed to have path separators in them? and for copy-across this just doesn't seem possible either and we would have to use form data right?

Adding a DELETE method seems like we're getting ahead of ourselves too. Eventually (once your container rootfs has the rm binary) you can just rm anything.

@thaJeztah
Copy link
Member

I'm worried that putting the /path/to/file directly in the path would result in ambiguities, apparently the container name can be .*

Hm, then that problem should already be present when using the current end points (if my container is named foobar/copy?. Haven't checked if it is possible to create a container now that contains a ? as part of the name, but that seems like a very unwanted situation.

Basically, my idea would be to have a /fs/ endpoint, followed by a path and (for now) POST / GET to allow copy in two directions if the need arises, DELETE can be added in future. Not sure if that's technically feasible, but having different end points for "GET" and "POST" seemed a bit odd to me 😄

@LK4D4
Copy link
Contributor

LK4D4 commented Jan 2, 2015

Drone is really angry about this PR

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 2, 2015

Drone is really angry about this PR

yeah, I'm still trying to get the behavior to match the old cp command. There are issues where the container path is not absolute and when it's from a volume.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 3, 2015

@LK4D4 drone loves these tests now!

@jessfraz
Copy link
Contributor

jessfraz commented Jan 3, 2015

Btw this patch is awesome it behaves just like cp, and I love it!
I think we just need a few more integration-tests so the cp behavior
doesn't regress :)

On Friday, January 2, 2015, Josh Hawn <[email protected]
javascript:_e(%7B%7D,'cvml','[email protected]');> wrote:

@LK4D4 https://github.com/LK4D4 drone loves theses tests now!


Reply to this email directly or view it on GitHub
#9871 (comment).

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 3, 2015

I've just added some nice documentation/explanation to the first comment of this PR if anyone would like to take a look.

@timkaler
Copy link

timkaler commented Jan 3, 2015

this is great, thanks! I'm going to revisit using docker for my project now (currently using VMs, and workflow changes required to use containers weren't worth the benefits previously.)

@SvenDowideit
Copy link
Contributor

@jlhawn can you please move the documentation from the top comment, and into the documentation? (as proposals should be done docs PR first :) )

oh, and +1 !

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 5, 2015

@SvenDowideit Those are just docs for this PR. I haven't gotten to updating the actual docs yet. I'm working on a commit with unit tests and integration tests right now, then I'll finish off with a commit for the actual docs in the project ;-)

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 5, 2015

^ oh! and rebase it so that it looks like the docs PR came first ;-)

@SvenDowideit
Copy link
Contributor

@jlhawn mmm, The problem I have, is that if i read detailed PR comments first, then the docs diff, I may not be confused by the docs, because i have the PR knowledge already. once merged, the user doesn't have this extra context.

So I'd feel more comfortable with PR's having less detailed info, and thus everyone reviewing the PR will be reading the docs and tests to find out what the merged code will do.

@jlhawn jlhawn force-pushed the docker_add branch 3 times, most recently from 13e41d8 to ffa9127 Compare January 5, 2015 05:40
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be formatted as a **note** to stand out a bit more ?

@jlhawn jlhawn force-pushed the docker_add branch 5 times, most recently from dda92fc to d9acaf9 Compare January 14, 2015 17:36
Josh Hawn and others added 7 commits January 14, 2015 10:32
Documented changes to API to enable new `docker cp` behavior.

Added documentation on `docker cp` usage and behavior.

Fixed up the docs/man/README.md file a bit.

Docker-DCO-1.1-Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
Adds CopyFrom and CopyTo functions to be used for creating
archives for use with the new `docker cp` behavior.

Adds multiple test cases for the CopyFrom and CopyTo
functions in the pkg/archive package.

Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
Adds new jobs for handling copying to, from, or between containers.
To be used by the API for new endpoints making new `docker cp`
behavior possible.

Uses new pkg/archive and pkg/chrootarchive utilities for archiving
filesystem contents especially for this purpose.

Docker-DCO-1.1-Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
New endpoints have been added for copying to, from, and between
containers. These endpoints call out to jobs registered by the daemon
for handling each of these cases.

Each handler uses a common error handling function for errors from these
jobs before returning an error, if any.

To be used by CLI for changes to `docker cp`.

Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
Utilizes new Remote API endpoints to handle copying to, from, and between
containers in a manner that matches other common Unix utilities like
`cp` and `scp`.

Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
Adds several integration tests for `docker cp` behavior with over a dozen
tests for each of:

  container -> local
  local -> container
  container -> container

Also adds several more integration tests with `docker cp` when volumes are
involved in the paths which are being copied.

Signed-off-by: Josh Hawn <[email protected]> (github: jlhawn)
@RoelVdP
Copy link

RoelVdP commented Jan 15, 2015

@jlhawn Thank you for implementing this. Fantastic.

@tonistiigi
Copy link
Member

For anyone else interested (already showed it to @jlhawn yesterday), I made a proof of concept for an alternative API for this. API is basically simple get(similar to what docker currently does) and idempotent push. The exception cases for cp -a syntax are implemented as a transform stream that runs in the client side. https://github.com/tonistiigi/docker_cp_poc
Binaries for linux/osx: https://www.dropbox.com/sh/r0io7bgfwdi1xi1/AABD94YkJyX4itapc68Ks9Rea?dl=0

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 15, 2015

as @tonistiigi alluded above, I'll be reworking this soon (currently working on a few other things) to make the api changes for less complex logic on the server side. The changes would be:

  • GET /containers/(id)/copy

    archive the resource at the given path whether it's a directory, file, ends in . or ./, whatever.
    It shouldn't change much from how it currently works.

  • POST /containers/(id)/copy

    extract the archive in the payload to the given dstPath which MUST be a directory which exists. No
    crazy archive inspection, entry name replacement, or anything like that.

  • GET /containers/(id)/statpath

    this is a new endpoint which would return info on the given path argument inside the container like
    the type of file it is, permissions, ownership, etc, most of the things in the
    os.FileInfo interface.

    While this endpoint is not strictly required to implement cp -a behavior in the client (you could do what @tonistiigi did in his poc), it makes the process of checking for a path resource's existence and attributes a lot cleaner.

Together, these endpoints should give the client all of the necessary tools to cleanly implement cp -a behavior 👍

@duglin
Copy link
Contributor

duglin commented Jan 15, 2015

If there is just one resource, is there any reason we're not including the path to the resource in the URL itself? Then we may not even need an HTTP body on the GET and we could use HEAD for the query one. Just to be more RESTful.
To me: GET /containers/(id)/path/to/my/file would be a natural thing to do.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 15, 2015

@duglin putting the path in the url was something that was discussed above with @thaJeztah. The issue was that the container (id) in the path could possibly contain / characters too and could cause path matching ambiguity (this needs to be investigated further). There would be no body on the GET, just a path query parameter.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 17, 2015

Status Update! I've made a lot of progress on "Take 3" of this PR ;-) All of the tests are passing with the new implementation. I just need to update the docs. I should have it pushed some time this weekend.

@tonistiigi
Copy link
Member

@jlhawn Great news! Maybe consider making a new PR then and closing this one. This one is already hard to scroll and contains a lot of outdated information that may confuse new people looking at it.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 19, 2015

@tonistiigi @duglin et al, Everything is ready to go, I just need to split up some commits. Then I'll close this one and open up a shiny new 5-digit pull request.

@jlhawn
Copy link
Contributor Author

jlhawn commented Jan 19, 2015

Closed in favor of #10198

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.