Volume Refactoring for plugin support by crosbymichael · Pull Request #13025 · moby/moby

crosbymichael · 2015-05-06T18:30:19Z

There were a few undefined behavior changes made in this PR to make things work better and not be as crazy. The order of precedence for how volumes are applied to a container are as follows.

Binds - I want my volumes here and not have docker mess with them
VolumesFrom - This container can get it's volumes from another container
Volumes - Docker manage these volumes from the image config

Another behavior change is that bind mounts cannot be inherited from volumes-from. I doubt this was ever expected and should have never worked.

The main areas of change are that there is not central volumes repository. The container's hold volume ID/Names of volumes that they consume along with how and where those volumes are mounted(dest,RO-RW). All other state regarding volumes is stored in the volume specific implementation.

For the default implementation we changed it to not require any on disk metadata about what containers are consuming a volume. When the daemon boots the container's are populated with the volume objects that they use and each volume is reference counted. Everything is done in memory and at runtime so we don't need a bunch of json files for this volume implementation.

Another design change is that bind mounts are not volumes. There were too many bool values on the volume struct to determine if it was a bind mounts or something else. We took the approach that a bind mount is the result of how a volume is consumed, therefore, we can treat the two completely separate.

What is missing
I still have to work on the migration code from the old volume format. This will be a little tricky but I should have a commit adding this soon.

cpuguy83 · 2015-05-06T18:57:21Z

So with this, there can be only 1 active volume driver? Is there a plan to extend this once we actually have more than 1?

LK4D4 · 2015-05-06T18:58:22Z

@cpuguy83 Yeah, we think about hook up driver right before volumeDriver call.
And probably record that plugin to container, to use it later.
I thought personally that drivers will be container fields :) So it's little breaking my plans.

LK4D4 · 2015-05-06T20:11:30Z

daemon/daemon.go

Little inconsistence, this variable called as volumeDriver(without s) everywhere else

icecrime · 2015-05-06T20:28:59Z

I'm obviously +1 for design, as the behavior changes come with a huge simplicity and readability boost.

crosbymichael · 2015-05-06T20:35:33Z

feel free to code review and i'll make changes based on feedback in a batch.

progrium · 2015-05-06T21:31:08Z

daemon/create.go

Maybe it should be "Driver" instead of "Plugin"?

How do you all want to call this internally?

Should be specific to the thing itself. It being an external plugin is abstracted away. It could be a local driver or a remote driver. It would use the same interface. So it would probably be named after how the interface is named. In this case, Driver.

I think plugin is okay. Because it is actually a plugin :)

Well do we want to change the interface name and all references to Plugin now?

But it might not be a plugin, right? If the value is local then it's the local driver...

"local plugin"?

It's the name of an object that implements the Driver interface, so whether it is the local driver or a driver from a plugin, it's the name of a the driver ... so Driver makes sense to me. Plugin doesn't just because ... it could be the local driver, right?

I don't understand. I would rather not have two concepts, is it a Driver or a Plugin?? Lets just pick one term and use it everywhere. I don't think we need two separate terms to describe the same logic that these entities provide.

Here's the thing. A plugin provides extensions of various types. It can provide a volume driver, a network driver, but maybe non-drivers, like maybe an event listener. These types are based on the interfaces they implement. A plugin is not just a driver or listener, it could be any. So the concept of plugin is just a thing that provides extensions in the form of implemented interfaces.

In this case, you're not even talking about a plugin, you're just talking about a driver. "What driver does this volume use?" Therefore, we call it Driver.

Imagine if there weren't plugins and just native Go pluggable things. What would you call it? Depends on the interface. Driver, Provider, etc

LK4D4 · 2015-05-06T21:59:41Z

daemon/create.go

What best way to get this value? We need to do this on each daemon.volumeDriver operation. Seems like it requires at least Volume and Container. So should it be like GetVolumeDriver(container, volume)?
Maybe makes sense to embed it to Volume interface?

Both places work fine. What do you think is better? A method on Volume that returns the DriverName() ?

Yeah, that would be perfect.

Now I think that Volumes runtime entity anyway and we can return volume.Driver directly for Volume to not seek it every time. Is it mad enough?

cpuguy83 · 2015-05-06T23:46:09Z

daemon/create.go

Nice, I wanted to do this, but didn't want to fragment container.Volumes and container.VolumesRW.
This is much better.

cpuguy83 · 2015-05-07T00:09:28Z

Overall this design LGTM, some questions on implementation.

crosbymichael · 2015-05-07T00:15:41Z

@cpuguy83 we have to do some migration to get the existing volumes out of vfs and the repository info into the new container field. Maybe we could work together on that?

thaJeztah · 2015-05-07T02:31:06Z

Another behavior change is that bind mounts cannot be inherited from volumes-from. I doubt this was ever expected and should have never worked.

If I understand correctly, this;

docker run -v /persistent/data:/data --name foo foo
docker run --volumes-from foo bar

Will no longer work? That sounds like a BIG change. I know many people are using bind-mounted volumes to either use pre-existing data, or to have control over the data, independent of the container lifecycle.

Personally, we use a number of utility images/containers that attach to running containers using --volumes-from to manipulate data in the running container. No longer being able to do that will be a blocking issue (and require a lot of extra tooling to achieve the same result without this functionality)

cpuguy83 · 2015-05-07T02:38:53Z

Actually yeah, I agree with that.

cpuguy83 · 2015-05-07T09:29:22Z

Sorry, some additional thoughts on this after some sleep:

Bind mounts are really really bad. They should not be 1st class citizens and their own separate thing.
The reason why bind mounts have bools is because we didn't have drivers. The implementation of volume drivers in #12862 also gets rid of these bool arguments. Bind mounted volumes are created by just another driver... if the user supplies a host path, docker knows that's the binds driver... and that's the only special case.

Allowing users to specify host paths to volume drivers also seems really really bad and only serves to promote the concept of binds... I can't think of a particularly good use case for it.

Originally was ok with names on volumes (b/c it's implemented as either having a user specified name or an ID), however I think names are also something that are likely to only be used by a subset of volume drivers... and even then those cases probably don't actually need it.
If instead of giving a name to a volume we pass the driver information about the consuming container, the driver can make it's own decisions on what to do (and the user can even specify a name for it in the container labels or other volume options).
Information I'd pass to the driver when creating (or flagging as being used by an additional container) are the container name, labels, image ID and tag... could be more but these are the pieces I see as the essenetial bits for a driver to make any special decision on what to do (e.g. "do I need to replicate data?", "is this volume for a container that was moved from another host", etc).

The reason for having volume configs on disk is primarily due to dangling volumes.
To use in-memory only information about volumes we'd have to make sure there is no such thing as a dangling volume. #10497

thaJeztah · 2015-05-07T13:11:21Z

Bind mounted volumes are created by just another driver... if the user supplies a host path, docker knows that's the binds driver... and that's the only special case.

+1. Wrt my earlier comment; if some (type of) volumes should be shared via --volumes-from and others not, maybe the VolumeConfig needs to contain more information, e.g. "opts" that can be set by the driver or an "isPrivate" bool, to decide whether or not the volume should be shared.

Originally was ok with names on volumes (b/c it's implemented as either having a user specified name or an ID), however I think names are also something that are likely to only be used by a subset of volume drivers... and even then those cases probably don't actually need it.

Are there still plans to have volumes as first-class citizens? I.e., being able to create/define a volume before it's being consumed by a container? Then I'd use the same approach as containers; create a volume and generate an ID. Optionally allow a name to be set for easier identification (otherwise a random name)

(Wondering if "labels" (metadata) are considered for volumes, but not sure if that's something to discuss in this stage)

crosbymichael · 2015-05-07T17:30:10Z

@thaJeztah that is why bind mounts are not volumes. And volumes-from for bind mounts should have never worked before. If you already know the bind mount source just use that instead of trying an implicit volumes-from.

cpuguy83 · 2015-05-07T18:20:18Z

@crosbymichael the thing is from a user perspective they are still volumes, ie. just a place to put data.
They are specified in the same place in the CLI (docker run -v).
If there was never support to create new dirs on the host as there is today, and they were specified through a separate flag I'd agree.

crosbymichael · 2015-05-07T18:45:41Z

they used to be different flags then they were merged but they are very different and it's not going to work with new plugins and such.

Signed-off-by: Michael Crosby <[email protected]> Volumes refactoring continued Signed-off-by: Arnaud Porterie <[email protected]>

cpuguy83 · 2015-05-08T12:24:28Z

@crosbymichael first pass at a migration is here: crosbymichael#36
However, this will probably break certain scenarios sense we need to move old vfs path to the new local one.
In #12862 volume drivers don't manage their own paths so this is not a problem and we just continue using the old data path.

calavera · 2015-05-08T18:55:28Z

daemon/daemon.go

all the logic to handle bind mounts should be wrapper into a big if hostConfig.Binds() > 0, from 1241 to 1272. Otherwise, volumes mounted with volumesFrom and volume can be overridden with an empty array when there are no bind mounts.

thaJeztah · 2015-05-16T00:27:15Z

@crosbymichael why was this closed? is it continued in #13161?

thaJeztah · 2015-05-21T09:11:16Z

For reference, this PR has been replaced by #13161

GordonTheTurtle added the status/0-triage label May 6, 2015

crosbymichael force-pushed the vfs-v2 branch from 72226ce to ad20d24 Compare May 6, 2015 19:00

tiborvass added status/1-design-review and removed status/0-triage labels May 6, 2015

crosbymichael force-pushed the vfs-v2 branch from ad20d24 to 9509ffa Compare May 6, 2015 20:05

LK4D4 reviewed May 6, 2015
View reviewed changes

daemon/daemon.go Outdated

Copy link

Contributor

LK4D4 May 6, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little inconsistence, this variable called as volumeDriver(without s) everywhere else

crosbymichael force-pushed the vfs-v2 branch from 9509ffa to a28d2aa Compare May 6, 2015 21:28

progrium reviewed May 6, 2015
View reviewed changes

crosbymichael force-pushed the vfs-v2 branch 2 times, most recently from 8fb1234 to c557882 Compare May 6, 2015 21:49

LK4D4 reviewed May 6, 2015
View reviewed changes

crosbymichael force-pushed the vfs-v2 branch from c557882 to 08e469b Compare May 6, 2015 23:18

cpuguy83 reviewed May 6, 2015
View reviewed changes

daemon/create.go

Copy link

Member

cpuguy83 May 6, 2015

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I wanted to do this, but didn't want to fragment container.Volumes and container.VolumesRW.
This is much better.

Implement docker local volume pluginImplement docker local volume

9721b11

Signed-off-by: Michael Crosby <[email protected]> Volumes refactoring continued Signed-off-by: Arnaud Porterie <[email protected]>

crosbymichael force-pushed the vfs-v2 branch from 08e469b to 9721b11 Compare May 7, 2015 23:34

calavera mentioned this pull request May 8, 2015

closes 11646 to fix volume path validations in server side #11878

Closed

thaJeztah mentioned this pull request May 8, 2015

Use labels instead of names to identify containers docker/compose#1356

Merged

calavera reviewed May 8, 2015
View reviewed changes

calavera mentioned this pull request May 12, 2015

Proposal: Volume refactor and external volume plugins #13161

Merged

3 tasks

jessfraz added this to the 1.7.0 milestone May 12, 2015

crosbymichael closed this May 15, 2015

Conversation

crosbymichael commented May 6, 2015

Uh oh!

cpuguy83 commented May 6, 2015

Uh oh!

LK4D4 commented May 6, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

icecrime commented May 6, 2015

Uh oh!

crosbymichael commented May 6, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

cpuguy83 commented May 7, 2015

Uh oh!

crosbymichael commented May 7, 2015

Uh oh!

thaJeztah commented May 7, 2015

Uh oh!

cpuguy83 commented May 7, 2015

Uh oh!

cpuguy83 commented May 7, 2015

Uh oh!

thaJeztah commented May 7, 2015

Uh oh!

crosbymichael commented May 7, 2015

Uh oh!

cpuguy83 commented May 7, 2015

Uh oh!

crosbymichael commented May 7, 2015

Uh oh!

cpuguy83 commented May 8, 2015

Uh oh!

Choose a reason for hiding this comment

Uh oh!

thaJeztah commented May 16, 2015

Uh oh!

thaJeztah commented May 21, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

10 participants