devicemapper: remove container rootfs mountPath after umount #34573
Conversation
Force-pushed from 41b98d1 to a7f8155.
Force-pushed from 9934083 to a875ebf.
We should have separate suites for testing the graph drivers, so we can run the suite for any supported driver.
integration/util/kernel.go
Outdated
// -1 if lval < rval
//  0 if lval = rval
//  1 if lval > rval
func MustKernelCmp(rval kernel.VersionInfo) int {
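For reference, a minimal completion of this helper might look like the following, assuming GetKernelVersion and CompareKernelVersion from github.com/docker/docker/pkg/parsers/kernel (a sketch, not necessarily the PR's actual body):

// MustKernelCmp compares the host kernel version against rval, panicking if
// the host version cannot be determined (sketch; not the PR's exact code).
func MustKernelCmp(rval kernel.VersionInfo) int {
	lval, err := kernel.GetKernelVersion()
	if err != nil {
		panic(err)
	}
	return kernel.CompareKernelVersion(*lval, rval)
}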
Could this be inlined into run_test.go? Do you expect to use it outside of container tests?
integration/container/run_test.go
Outdated
// tested occurred). We run it anyway just to be safe.
driver, _ := daemonStorageDriver(context.Background(), client)
if driver != "devicemapper" {
	t.Logf("Active driver (%s) is not devicemapper, continuing anyway.", driver)
I think this should t.Skip() if it's not devicemapper.
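Concretely, the suggestion would look something like this (a hypothetical sketch, reusing daemonStorageDriver from the excerpt above):

driver, _ := daemonStorageDriver(context.Background(), client)
if driver != "devicemapper" {
	t.Skipf("active driver (%s) is not devicemapper", driver)
}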
I think it would be a good measure to make sure the bug doesn't happen on other storage drivers too. I should probably just remove this check altogether.
I don't think this is the right way to test storage drivers. These "integration" tests are very slow, so we need to be conservative about the tests we add here (see TESTING).
We should be able to test storage drivers without the daemon running, which makes it much easier and faster to test them. It looks like each graphdriver has a test suite which uses graphtest. We should test this case in that suite.
Why is it not possible to just check that the mount point doesn't exist anymore from that test suite?
I would need to check again, but it's not that it's impossible; it's that I'd have to effectively start re-implementing parts of runc inside the test framework to make sure we're testing the right interaction between mount namespaces and libdm.
Just checking that the mountpoint has gone isn't good enough: you need to check that the mount has disappeared in other namespaces (that's where the bug comes from).
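To illustrate, verifying that a mount has disappeared in another namespace means inspecting another process's view of the mount table, e.g. its mountinfo. A rough, self-contained sketch (the helper name and approach are mine, not code from this PR):

package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"strings"
)

// mountVisibleToPid reports whether mountpoint is still mounted in the mount
// namespace of pid, by scanning /proc/<pid>/mountinfo (the fifth field of
// each record is the mount point).
func mountVisibleToPid(pid int, mountpoint string) (bool, error) {
	data, err := ioutil.ReadFile(fmt.Sprintf("/proc/%d/mountinfo", pid))
	if err != nil {
		return false, err
	}
	for _, line := range strings.Split(string(data), "\n") {
		fields := strings.Fields(line)
		if len(fields) >= 5 && fields[4] == mountpoint {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Usage example against our own namespace.
	visible, err := mountVisibleToPid(os.Getpid(), "/proc")
	if err != nil {
		panic(err)
	}
	fmt.Println("/proc visible in own namespace:", visible)
}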
integration/container/run_test.go
Outdated
// torvalds/linux@8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe was merged).
oldKernel := util.MustKernelCmp(kernel.VersionInfo{Kernel: 3, Major: 18, Minor: 0}) < 0
if oldKernel {
	t.Logf("NOTE: This test may fail on pre-3.18 kernels.")
Again, probably t.Skip()
Some distributions may have backported the fix, so the kernel version alone isn't enough to know whether the test will work (but we don't hard fail on older kernels).
You can use kernel.CheckKernelVersion from github.com/docker/docker/pkg/parsers/kernel here. For example:
import "github.com/docker/docker/pkg/parsers/kernel"
...
oldKernel := !kernel.CheckKernelVersion(3, 18, 0)
...
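Spelled out in the test, that might look like the following (a sketch; the test name is hypothetical, and per the discussion above it logs rather than hard-fails, since distributions may have backported the fix):

import (
	"testing"

	"github.com/docker/docker/pkg/parsers/kernel"
)

func TestRunMountRemoval(t *testing.T) {
	// CheckKernelVersion returns true when the host kernel is >= 3.18.0.
	oldKernel := !kernel.CheckKernelVersion(3, 18, 0)
	if oldKernel {
		t.Log("NOTE: this test may fail on pre-3.18 kernels (unless the fix was backported)")
	}
	// ...
}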
@dnephin So where does that leave this PR's tests? Should I not include tests at all? (These tests actually currently fail silently because the error handling in the new [...] doesn't exist.) [As an aside, I'm not sure it was a good idea to deprecate [...]]
I don't understand this. What do you mean by error handling doesn't exist?
It's a work in progress. If you see something significant that is missing, please let me know.
You can change tests; you just can't add new ones.
The ideal place for this test would be in [...]
if err := system.EnsureRemoveAll(mountPath); err != nil {
	logrus.Debugf("devmapper: error doing a removeall on unmounted device %s: %v", mountPath, err)
}
Why use EnsureRemoveAll? This function is trying to do so much that I don't even understand what it is doing. After a lazy unmount, why don't we simply do an os.Remove()?
So the idea is to use directory removal to yank the mount points which might have leaked into other mount namespaces, and this will only work on newer kernels. I think this should work reasonably well. I'm a little concerned about some use case being broken somewhere; there have been quite a few changes w.r.t. reference counting. Especially make sure that the case of the docker daemon being shut down and started works while containers were running the whole time.
Yeah, os.Remove is probably a better choice.
Especially make sure that the case of the docker daemon being shut down and started works while containers were running the whole time.
Can you explain what you mean here? Are you referring to the "live restore" stuff?
Yes, I was referring to live restore functionality.
@@ -2416,6 +2417,15 @@ func (devices *DeviceSet) UnmountDevice(hash, mountPath string) error {
	}
	logrus.Debug("devmapper: Unmount done")

	// Remove the mountpoint here. This is necessary to avoid cases where a
It might be good to mention that removing the directory will remove the mount point on this directory from all mount namespaces.
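In other words, the comment could spell out something like the following. This is a condensed sketch of the idea (not the exact diff), assuming golang.org/x/sys/unix, os, and logrus imports:

// Unmount the device lazily so in-flight references don't block us.
if err := unix.Unmount(mountPath, unix.MNT_DETACH); err != nil {
	return err
}
logrus.Debug("devmapper: Unmount done")

// Remove the mountpoint. On kernels >= 3.18 (torvalds/linux@8ed936b), removing
// the directory also detaches any copy of this mount that leaked into other
// mount namespaces, so libdm can deactivate the device afterwards.
if err := os.Remove(mountPath); err != nil {
	logrus.Debugf("devmapper: error removing mountpoint %s: %v", mountPath, err)
}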
@cyphar Will we require a similar change in the Remove() path as well? That is, make sure the device is unmounted, then remove the mnt directory, and then try device deactivation and deletion?
cc @rhatdan
@rhvgoyal Yes, probably. I'll need to figure out the best place to put this change (I'd prefer not to have to duplicate it in both paths).
Force-pushed from a875ebf to ba9ee05.
Actually @rhvgoyal, I don't think we need this in the Remove path. The Remove path is only called from the driver after it's been [...]
Force-pushed from ef75b39 to 330461c.
libdm currently has a fairly substantial DoS bug that makes certain operations fail on a libdm device if the device has active references through mountpoints. This is a significant problem with the advent of mount namespaces and MS_PRIVATE, and can cause certain --volume mounts to cause libdm to no longer be able to remove containers:

% docker run -d --name testA busybox top
% docker run -d --name testB -v /var/lib/docker:/docker busybox top
% docker rm -f testA
[fails on libdm with dm_task_run errors.]

This also solves the problem of unprivileged users being able to DoS docker by using unprivileged mount namespaces to preserve mounts that Docker has dropped.

SUSE-Bug: https://bugzilla.suse.com/show_bug.cgi?id=1045628
SUSE-Backport: moby#34573
Signed-off-by: Aleksa Sarai <[email protected]>
nudge
Maybe a unit test that uses unshare(1), and can skip if it's not available. Not exactly ideal, but ...
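Something along these lines could gate it (a hypothetical helper, not code from this PR; assumes "os/exec" and "testing" imports):

// requireUnshare skips the test when unshare(1) cannot create a new mount
// namespace here (binary missing or insufficient privileges).
func requireUnshare(t *testing.T) {
	if err := exec.Command("unshare", "--mount", "true").Run(); err != nil {
		t.Skipf("unshare(1) unavailable or not permitted: %v", err)
	}
}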
The problem with such a test is that [...] [Again, this would not be a problem if I could just write a normal integration test... 😞]
Why would it break other unit tests?
Force-pushed from 262017f to 2ee2f84.
In order to avoid reverting our fix for mount leakage in devicemapper, add a test which checks that devicemapper's Get() and Put() cycle can survive having a command running in an rprivate mount propagation setup in-between. While this is quite rudimentary, it should be sufficient. We have to skip this test for pre-3.18 kernels.

Signed-off-by: Aleksa Sarai <[email protected]>
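The shape of that test is roughly as follows. This is a simplified sketch: the name and the exact graphdriver interface signatures are assumptions rather than the PR's verbatim code, and it assumes imports of os, os/exec, testing, and the graphdriver and kernel packages. Note that unshare applies --propagation private recursively, which reproduces the rprivate leak scenario:

func testGetPutWithRprivateChild(t *testing.T, driver graphdriver.Driver) {
	if !kernel.CheckKernelVersion(3, 18, 0) {
		t.Skip("mountpoint removal does not propagate on pre-3.18 kernels")
	}
	if err := driver.Create("test", "", nil); err != nil {
		t.Fatal(err)
	}
	dir, err := driver.Get("test", "")
	if err != nil {
		t.Fatal(err)
	}
	// Hold a copy of the mount tree in a private mount namespace while the
	// device is mounted, simulating a leaked mount.
	cmd := exec.Command("unshare", "--mount", "--propagation", "private", "sleep", "10")
	if err := cmd.Start(); err != nil {
		t.Fatal(err)
	}
	defer cmd.Process.Kill()
	// Put must still succeed, and the mountpoint directory must be gone.
	if err := driver.Put("test"); err != nil {
		t.Fatal(err)
	}
	if _, err := os.Stat(dir); !os.IsNotExist(err) {
		t.Fatalf("mountpoint %s still present after Put: %v", dir, err)
	}
}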
Force-pushed from 2ee2f84 to 1af8ea6.
Restarted PowerPC; I was told there were some connection issues with Jenkins and the PowerPC nodes.
LGTM
Thanks @cyphar!
/cc @vdemeester
LGTM 🐮
Thanks for writing the unit test
@dnephin Sorry, I never answered your question. I was worried about using [...]
libdm currently has a fairly substantial DoS bug that makes certain operations fail on a libdm device if the device has active references through mountpoints. This is a significant problem with the advent of mount namespaces and MS_PRIVATE, and can cause certain --volume mounts to cause libdm to no longer be able to remove containers:

% docker run -d --name testA busybox top
% docker run -d --name testB -v /var/lib/docker:/docker busybox top
% docker rm -f testA
[fails on libdm with dm_task_run errors.]

This also solves the problem of unprivileged users being able to DoS docker by using unprivileged mount namespaces to preserve mounts that Docker has dropped.

Walter by Alexander Lemke.

Closes #34542

Signed-off-by: Aleksa Sarai <[email protected]>
/cc @rhvgoyal This is an alternative to #34542
/cc @vrothberg