Skip rootfs unmount when no mounts are provided#3148
Skip rootfs unmount when no mounts are provided#3148crosbymichael merged 3 commits intocontainerd:masterfrom
Conversation
676e0f9 to
ff7e0d4
Compare
Codecov Report
@@ Coverage Diff @@
## master #3148 +/- ##
=======================================
Coverage 45.16% 45.16%
=======================================
Files 111 111
Lines 11962 11962
=======================================
Hits 5403 5403
Misses 5727 5727
Partials 832 832
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #3148 +/- ##
==========================================
+ Coverage 45.14% 45.22% +0.08%
==========================================
Files 111 111
Lines 11971 11994 +23
==========================================
+ Hits 5404 5424 +20
- Misses 5733 5736 +3
Partials 834 834
Continue to review full report at Codecov.
|
|
/cc @AkihiroSuda |
There was a problem hiding this comment.
rootfs should be initialized as an empty string and set to filepath.Join(r.Bundle, "rootfs") only when len(mounts) > 0?
|
LGTM except a nit v2 runtime needs this fix as well? (Can be a separate PR)
We should have integration test suites for both "root in userns" and "non-root outside userns" mode so as to clarify what are expected to work and what are not. |
|
nit: |
|
hi @georgethebeatle , The current implementation ignore the and it seems that we cannot mount without sudo and maybe we can add some user in |
Co-authored-by: Julia Nedialkova <[email protected]> Signed-off-by: Georgi Sabev <[email protected]>
ff7e0d4 to
13d9ce9
Compare
|
@fuweid If you pass rootfs mounts using the |
|
@AkihiroSuda we have addressed your comments and v2 is now supported.
Do you think that adding those tests is in the scope of this PR or is this a suggestion for future work? |
* Rootfs dir is created during container creation not during bundle creation * Add support for v2 * UnmountAll is a no-op when the path to unmount (i.e. the rootfs dir) does not exist or is invalid Co-authored-by: Danail Branekov <[email protected]> Signed-off-by: Georgi Sabev <[email protected]>
13d9ce9 to
c0f0b21
Compare
|
@georgethebeatle you can try oci.WithRootFSPath which shim will not mount it , because it is provided by client. But the shim still tries to unmount it after cleanup. That is the problem. If we can know that it is not mount point, we can skip it. |
|
Test can be future work |
|
@fuweid If you provide any mounts via As far as I know we cannot check if the rootfs directory is a mount point as a non-root user, because it will involve making a |
|
@AkihiroSuda @fuweid Do you think the PR is mergeable as it is or do we need to do some more work? |
| // useful for undoing a stack of mounts on the same mount point. | ||
| func UnmountAll(mount string, flags int) error { | ||
| if _, err := os.Stat(mount); err != nil { | ||
| return nil |
There was a problem hiding this comment.
return err? or needs os.IsNotExist check?
There was a problem hiding this comment.
We thought the same, but when you do os.Stat("") it returns an EINVAL error as you can see. We want to skip the unmount in 2 cases - if it is an invalid path ("") or if it does not exist on disk. That's why we do not do the os.IsNotExist check. Our thinking was that If you cannot stat the directory it is doubtful if you will be able to unmount it anyway.
If you prefer we can have 2 checks instead - a check for an empty string and an os.IsNotExist check. What do you think?
There was a problem hiding this comment.
we can check the rootfs in the shim, not in mount package. mount package don't need to carry this logic.
Co-authored-by: Danail Branekov <[email protected]> Signed-off-by: Georgi Sabev <[email protected]>
|
@georgethebeatle sorry for the late reply.
If I understand correctly, the
The I use the following command to run it and it will not unmount the provided rootfs. cc @AkihiroSuda |
|
@fuweid Your suggestion to check whether a path is a mountpoint prior unmounting is racy in the rootful case. In a concurrent scenario where two deletes are running in parallel it may happen that |
|
@fuweid We applied the change you suggested adding some debug logging: diff --git a/mount/mount_linux.go b/mount/mount_linux.go
index b5a16148..86f4e87b 100644
--- a/mount/mount_linux.go
+++ b/mount/mount_linux.go
@@ -18,6 +18,7 @@ package mount
import (
"fmt"
+ "io/ioutil"
"os"
"path"
"strings"
@@ -112,6 +113,16 @@ func unmount(target string, flags int) error {
// are no mounts remaining (EINVAL is returned by mount), which is
// useful for undoing a stack of mounts on the same mount point.
func UnmountAll(mount string, flags int) error {
+ isMount, err := isMountpoint(mount)
+ if err != nil {
+ return err
+ }
+
+ if !isMount {
+ fmt.Printf("Skipping unmount of: %s\n", mount)
+ return nil
+ }
+
for {
if err := unmount(mount, flags); err != nil {
// EINVAL is returned if the target is not a
@@ -127,6 +138,32 @@ func UnmountAll(mount string, flags int) error {
}
}
+func isMountpoint(path string) (bool, error) {
+ bs, err := ioutil.ReadFile("/proc/mounts")
+ if err != nil {
+ return false, err
+ }
+
+ lines := strings.Split(string(bs), "\n")
+ for _, line := range lines {
+ if line == "" {
+ continue
+ }
+
+ fields := strings.Fields(line)
+ if len(fields) != 6 {
+ continue
+ }
+
+ if fields[1] == path {
+ return true, nil
+ }
+ }
+
+ fmt.Printf("\n Mount table is \n%s\n\n", string(bs))
+ return false, nil
+}
+
// parseMountOptions takes fstab style mount options and parses them for
// use with a standard mount() syscall
func parseMountOptions(options []string) (int, string) {
Then we ran containerd's integration tests for the v2 runtime with the following command: It failed on the 80th run with the following error: Looking at our debug logs it seems like We did the same experiment using the docker Any thoughts? cc @AkihiroSuda |
|
It's been a while since we last updated this PR. Do you have any thoughts or suggestions? How should we proceed? |
| rootfs := "" | ||
| if len(mounts) > 0 { | ||
| rootfs = filepath.Join(r.Bundle, "rootfs") | ||
| if err := os.Mkdir(rootfs, 0711); err != nil { |
There was a problem hiding this comment.
We just moved that code here from
containerd/runtime/v1/linux/bundle.go
Lines 47 to 49 in 835e6d0
0711 and if it is safe to change it.
|
@danail-branekov @georgethebeatle sorry for the late reply.
@danail-branekov @georgethebeatle I was thinking we can check the mount table in the shim side. Anyway, |
Not quite. The fact that the mount point check is racy can have 2 effects:
|
|
Yeah. My point is that we can check mount table in the runtime(shim) side, not in the unmount function. The Empty rootfs is easy to make it happen, I agree. We can do skip-unmount-checker in shim side, not in the unmount side, because we cannot let all the cover things in mount package. it is just my two cents. Thanks |
|
@fuweid I feel like we are kind of on the same page but do not understand whether you want us to do some more work. Should we do something more to get this merged and what do you suggest we do? |
|
@fuweid About your proposition that we do a mountpoint check in
What do you think? |
|
Hi @georgethebeatle sorry for the late reply.
I have checked your code. diff --git a/runtime/v2/bundle.go b/runtime/v2/bundle.go
index 7139410..7897a1c 100644
--- a/runtime/v2/bundle.go
+++ b/runtime/v2/bundle.go
@@ -89,10 +89,6 @@ func NewBundle(ctx context.Context, root, state, id string, spec []byte) (b *Bun
}
}
paths = append(paths, work)
- // create rootfs dir
- if err := os.Mkdir(filepath.Join(b.Path, "rootfs"), 0711); err != nil {
- return nil, err
- }
// symlink workdir
if err := os.Symlink(work, filepath.Join(b.Path, "work")); err != nil {
return nil, errAnd tried to run it with v2 shim binary in my local [no idea why the CI is still pass 😂 ] I think we should not just skip the BTW, Do you consider to use v2 shim in your production? It seems that we can handle the unmount thing in custom shim binary. |
|
Hi @fuweid On a local environment with the PR changes we were able to successfully execute the following command: We were able to get the error you are talking about by applying the diff you gave above to master (which is only a small part of the PR), but could you explain exactly what you did? Apart from that we experimented with what you suggested: we moved the UnmountAll call to the exitHandler of the shim (in v1) diff --git a/runtime/v1/linux/runtime.go b/runtime/v1/linux/runtime.go
index 545e2167..c2f889d3 100644
--- a/runtime/v1/linux/runtime.go
+++ b/runtime/v1/linux/runtime.go
@@ -190,12 +190,17 @@ func (r *Runtime) Create(ctx context.Context, id string, opts runtime.CreateOpts
cgroup = v.(*runctypes.CreateOptions).ShimCgroup
}
exitHandler := func() {
+ log.G(ctx).WithField("id", id).Info("short cirquiting")
+ if err2 := mount.UnmountAll(filepath.Join(bundle.path, "rootfs"), 0); err2 != nil {
+ log.G(ctx).WithError(err2).Warn("Failed to cleanup rootfs mount")
+ }
log.G(ctx).WithField("id", id).Info("shim reaped")
t, err := r.tasks.Get(ctx, id)Unfortunately the tests started failing with the same error as mentioned above: #3148 (comment): This time the race would happen every time we run the tests, rather than once in 80 runs, so it seems to have gotten worse. We are still reluctant to introduce a racey check. Do you have any further thoughts? cc @AkihiroSuda what do you think? |
|
Hi @georgethebeatle I just applied the change in master and run a container. step 1: clean up step2: make and restart step3: pull image and run it with runtime v2 API @georgethebeatle I think we can give up the mount check idea 😂 Focus on your original idea. I miss the part about create rootfs in shim side. you are right, I think we should create rootfs in shim side. If there is provided rootfs, we should not create the folder. It seems that check empty/non-exist folder in mount.Unmount is the way to reduce duplicate check in code base. LGTM. Sorry for taking it so long. But I would like to invite @dmcgowan to check the mount part. :) |
|
LGTM |
|
I have question regarding this PR. It was merged to master in between |
|
@georgethebeatle the master branch is used to work on containerd 1.3.0; there's release branches for the 1.0, 1.1.x and 1.2.x releases; to include something in one of those releases, fixes have to be back ported / cherry-picked (e.g. see https://github.com/containerd/containerd/pulls?q=is%3Apr+release+in%3Atitle+is%3Aclosed) |
We (Cloud Foundry Garden) are trying to run containerd as a non-root user for extra security. We have our own component for creating container images so we are not using a snapshotter. While this mostly works we hit the following problem:
As of today this works because containerd is usually run as uid 0 in its respective user namespace, but honestly it feels a bit asymmetric and is breaking our approach to rootless containers which is to run the containerd daemon as a non-root user.
We have no issues with task creation because we are not using a snapshotter, so as mentioned above no mount calls are performed by the shim. However, when we attempt a destroy we hit the unconditional unmount call and as you can imagine this results in an
operation not permittederror.This PR suggests a fix which makes sure that the shim will not do any unmount calls when there is nothing to unmount (no rootfs mounts were passed in to the request, because no snapshotter was used).
Any comments and suggestions are welcome.
Co-authored-by: Julia Nedialkova [email protected]
Signed-off-by: Julia Nedialkova [email protected]