Skip to content

Volume Refactoring for plugin support#13025

Closed
crosbymichael wants to merge 1 commit intomoby:masterfrom
crosbymichael:vfs-v2
Closed

Volume Refactoring for plugin support#13025
crosbymichael wants to merge 1 commit intomoby:masterfrom
crosbymichael:vfs-v2

Conversation

@crosbymichael
Copy link
Contributor

There were a few undefined behavior changes made in this PR to make things work better and not be as crazy. The order of precedence for how volumes are applied to a container are as follows.

  1. Binds - I want my volumes here and not have docker mess with them
  2. VolumesFrom - This container can get it's volumes from another container
  3. Volumes - Docker manage these volumes from the image config

Another behavior change is that bind mounts cannot be inherited from volumes-from. I doubt this was ever expected and should have never worked.

The main areas of change are that there is not central volumes repository. The container's hold volume ID/Names of volumes that they consume along with how and where those volumes are mounted(dest,RO-RW). All other state regarding volumes is stored in the volume specific implementation.

For the default implementation we changed it to not require any on disk metadata about what containers are consuming a volume. When the daemon boots the container's are populated with the volume objects that they use and each volume is reference counted. Everything is done in memory and at runtime so we don't need a bunch of json files for this volume implementation.

Another design change is that bind mounts are not volumes. There were too many bool values on the volume struct to determine if it was a bind mounts or something else. We took the approach that a bind mount is the result of how a volume is consumed, therefore, we can treat the two completely separate.

What is missing
I still have to work on the migration code from the old volume format. This will be a little tricky but I should have a commit adding this soon.

@cpuguy83
Copy link
Member

cpuguy83 commented May 6, 2015

So with this, there can be only 1 active volume driver? Is there a plan to extend this once we actually have more than 1?

@LK4D4
Copy link
Contributor

LK4D4 commented May 6, 2015

@cpuguy83 Yeah, we think about hook up driver right before volumeDriver call.
And probably record that plugin to container, to use it later.
I thought personally that drivers will be container fields :) So it's little breaking my plans.

daemon/daemon.go Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Little inconsistence, this variable called as volumeDriver(without s) everywhere else

@icecrime
Copy link
Contributor

icecrime commented May 6, 2015

I'm obviously +1 for design, as the behavior changes come with a huge simplicity and readability boost.

@crosbymichael
Copy link
Contributor Author

feel free to code review and i'll make changes based on feedback in a batch.

daemon/create.go Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe it should be "Driver" instead of "Plugin"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you all want to call this internally?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be specific to the thing itself. It being an external plugin is abstracted away. It could be a local driver or a remote driver. It would use the same interface. So it would probably be named after how the interface is named. In this case, Driver.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think plugin is okay. Because it is actually a plugin :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well do we want to change the interface name and all references to Plugin now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But it might not be a plugin, right? If the value is local then it's the local driver...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"local plugin"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the name of an object that implements the Driver interface, so whether it is the local driver or a driver from a plugin, it's the name of a the driver ... so Driver makes sense to me. Plugin doesn't just because ... it could be the local driver, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand. I would rather not have two concepts, is it a Driver or a Plugin?? Lets just pick one term and use it everywhere. I don't think we need two separate terms to describe the same logic that these entities provide.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's the thing. A plugin provides extensions of various types. It can provide a volume driver, a network driver, but maybe non-drivers, like maybe an event listener. These types are based on the interfaces they implement. A plugin is not just a driver or listener, it could be any. So the concept of plugin is just a thing that provides extensions in the form of implemented interfaces.

In this case, you're not even talking about a plugin, you're just talking about a driver. "What driver does this volume use?" Therefore, we call it Driver.

Imagine if there weren't plugins and just native Go pluggable things. What would you call it? Depends on the interface. Driver, Provider, etc

@crosbymichael crosbymichael force-pushed the vfs-v2 branch 2 times, most recently from 8fb1234 to c557882 Compare May 6, 2015 21:49
daemon/create.go Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What best way to get this value? We need to do this on each daemon.volumeDriver operation. Seems like it requires at least Volume and Container. So should it be like GetVolumeDriver(container, volume)?
Maybe makes sense to embed it to Volume interface?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both places work fine. What do you think is better? A method on Volume that returns the DriverName() ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that would be perfect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I think that Volumes runtime entity anyway and we can return volume.Driver directly for Volume to not seek it every time. Is it mad enough?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, I wanted to do this, but didn't want to fragment container.Volumes and container.VolumesRW.
This is much better.

@cpuguy83
Copy link
Member

cpuguy83 commented May 7, 2015

Overall this design LGTM, some questions on implementation.

@crosbymichael
Copy link
Contributor Author

@cpuguy83 we have to do some migration to get the existing volumes out of vfs and the repository info into the new container field. Maybe we could work together on that?

@thaJeztah
Copy link
Member

Another behavior change is that bind mounts cannot be inherited from volumes-from. I doubt this was ever expected and should have never worked.

If I understand correctly, this;

docker run -v /persistent/data:/data --name foo foo
docker run --volumes-from foo bar

Will no longer work? That sounds like a BIG change. I know many people are using bind-mounted volumes to either use pre-existing data, or to have control over the data, independent of the container lifecycle.

Personally, we use a number of utility images/containers that attach to running containers using --volumes-from to manipulate data in the running container. No longer being able to do that will be a blocking issue (and require a lot of extra tooling to achieve the same result without this functionality)

@cpuguy83
Copy link
Member

cpuguy83 commented May 7, 2015

Actually yeah, I agree with that.

@cpuguy83
Copy link
Member

cpuguy83 commented May 7, 2015

Sorry, some additional thoughts on this after some sleep:

Bind mounts are really really bad. They should not be 1st class citizens and their own separate thing.
The reason why bind mounts have bools is because we didn't have drivers. The implementation of volume drivers in #12862 also gets rid of these bool arguments. Bind mounted volumes are created by just another driver... if the user supplies a host path, docker knows that's the binds driver... and that's the only special case.

Allowing users to specify host paths to volume drivers also seems really really bad and only serves to promote the concept of binds... I can't think of a particularly good use case for it.

Originally was ok with names on volumes (b/c it's implemented as either having a user specified name or an ID), however I think names are also something that are likely to only be used by a subset of volume drivers... and even then those cases probably don't actually need it.
If instead of giving a name to a volume we pass the driver information about the consuming container, the driver can make it's own decisions on what to do (and the user can even specify a name for it in the container labels or other volume options).
Information I'd pass to the driver when creating (or flagging as being used by an additional container) are the container name, labels, image ID and tag... could be more but these are the pieces I see as the essenetial bits for a driver to make any special decision on what to do (e.g. "do I need to replicate data?", "is this volume for a container that was moved from another host", etc).

The reason for having volume configs on disk is primarily due to dangling volumes.
To use in-memory only information about volumes we'd have to make sure there is no such thing as a dangling volume. #10497

@thaJeztah
Copy link
Member

Bind mounted volumes are created by just another driver... if the user supplies a host path, docker knows that's the binds driver... and that's the only special case.

+1. Wrt my earlier comment; if some (type of) volumes should be shared via --volumes-from and others not, maybe the VolumeConfig needs to contain more information, e.g. "opts" that can be set by the driver or an "isPrivate" bool, to decide whether or not the volume should be shared.

Originally was ok with names on volumes (b/c it's implemented as either having a user specified name or an ID), however I think names are also something that are likely to only be used by a subset of volume drivers... and even then those cases probably don't actually need it.

Are there still plans to have volumes as first-class citizens? I.e., being able to create/define a volume before it's being consumed by a container? Then I'd use the same approach as containers; create a volume and generate an ID. Optionally allow a name to be set for easier identification (otherwise a random name)

(Wondering if "labels" (metadata) are considered for volumes, but not sure if that's something to discuss in this stage)

@crosbymichael
Copy link
Contributor Author

@thaJeztah that is why bind mounts are not volumes. And volumes-from for bind mounts should have never worked before. If you already know the bind mount source just use that instead of trying an implicit volumes-from.

@cpuguy83
Copy link
Member

cpuguy83 commented May 7, 2015

@crosbymichael the thing is from a user perspective they are still volumes, ie. just a place to put data.
They are specified in the same place in the CLI (docker run -v).
If there was never support to create new dirs on the host as there is today, and they were specified through a separate flag I'd agree.

@crosbymichael
Copy link
Contributor Author

they used to be different flags then they were merged but they are very different and it's not going to work with new plugins and such.

Signed-off-by: Michael Crosby <[email protected]>

Volumes refactoring continued

Signed-off-by: Arnaud Porterie <[email protected]>
@cpuguy83
Copy link
Member

cpuguy83 commented May 8, 2015

@crosbymichael first pass at a migration is here: crosbymichael#36
However, this will probably break certain scenarios sense we need to move old vfs path to the new local one.
In #12862 volume drivers don't manage their own paths so this is not a problem and we just continue using the old data path.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

all the logic to handle bind mounts should be wrapper into a big if hostConfig.Binds() > 0, from 1241 to 1272. Otherwise, volumes mounted with volumesFrom and volume can be overridden with an empty array when there are no bind mounts.

@thaJeztah
Copy link
Member

@crosbymichael why was this closed? is it continued in #13161?

@thaJeztah
Copy link
Member

For reference, this PR has been replaced by #13161

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants