Skip to content

containerd-shim: Do not remount root MS_SLAVE#1141

Merged
estesp merged 4 commits intocontainerd:masterfrom
ijc:rootfsPropagation
Jul 21, 2017
Merged

containerd-shim: Do not remount root MS_SLAVE#1141
estesp merged 4 commits intocontainerd:masterfrom
ijc:rootfsPropagation

Conversation

@ijc
Copy link
Copy Markdown
Contributor

@ijc ijc commented Jul 7, 2017

Mounting as MS_SLAVE here breaks use cases which want to use
rootPropagation=shared in order to expose mounts to the host (and other
containers binding the same subtree).

Note that runc will also setup root as required by rootPropagation, defaulting
to MS_PRIVATE.

Fixes #1132.

Signed-off-by: Ian Campbell [email protected]

@codecov-io
Copy link
Copy Markdown

codecov-io commented Jul 7, 2017

Codecov Report

Merging #1141 into master will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@          Coverage Diff           @@
##           master   #1141   +/-   ##
======================================
  Coverage    28.1%   28.1%           
======================================
  Files          28      28           
  Lines        2832    2832           
======================================
  Hits          796     796           
  Misses       1887    1887           
  Partials      149     149

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0b3e572...d42cb88. Read the comment docs.

@crosbymichael
Copy link
Copy Markdown
Member

Because the shim mounts the rootfs, we also need to make sure it is unmounted before the shim exits with this change.

@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 10, 2017

Thanks, I'll look into that. Perhaps that need also explains the EBUSY I was seeing when I tried the option of removing the mount altogether in #1132 (comment).

@ijc ijc force-pushed the rootfsPropagation branch from a13cb2f to f2fac1e Compare July 11, 2017 13:36
@ijc ijc changed the title container-shim: Setup root as MS_SHARED instead of MS_SLAVE containerd-shim: Do not remount root MS_SLAVE Jul 11, 2017
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 11, 2017

Updated with a version which removes the mount completely and also cleans up the rootfs mount. PTAL.

(transcribed from #1132 (comment) which I confusingly put in the wrong place)

@ijc ijc force-pushed the rootfsPropagation branch 2 times, most recently from 1d46787 to 1f93aac Compare July 11, 2017 14:36
@crosbymichael
Copy link
Copy Markdown
Member

LGTM

Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable; just a few thoughts on the cleanup function

Comment thread linux/shim/init.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

relatively insignificant, but maybe it's simpler just to do nrRootMounts = len(r.Rootfs) here at the top and not worry about counting in the loop?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was deliberate in order to only undo the work we succeeded with. Although that is somewhat undermined by my not actually calling cleanupMounts in the error path for the Mount call itself!

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the more i look at this why don't you just loop until you receive a mount error that the path is not mounted instead of keeping this count around?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can just loop until you hit EINVAL

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TL;DR, yes, I should loop until EINVAL.

I was thinking it was possible that the caller might have also mounted some stuff on the rootfs before calling create? It'd be weird for containerd to undo that. But thinking some more the rootfs here is entirely internal to containerd(-shim) and the user doesn't really formally know where it would be, which I suppose is why were have the rootfs slice in the first place...

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EINVAL covers a multitude of sins according to umount(2), including "bad flags" in addition to "not a mount point" (I'm not too worried about "called with MNT_EXPIRE and either MNT_DETACH or MNT_FORCE").

I think it's still the right way to go though.

Comment thread linux/shim/init.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about making this a defered function with a success variable declared as false before the defer definition. then the unmount can be gated by if !success {, and only set success to true at the one success path at the bottom before returning the pid?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WFM

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, wouldn't success be defined as success := err == nil. If so, can jut check if the err is nil at the top of the closure.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The err captured by the closure may not be the same err as gets assigned to further down the function, since err can be shadowed (e.g. if err := foo(); err != nil { return err } shadows the previous err, I think)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, my assumption is the err == nil defer style would require a careful undoing of err scoping in all the if blocks with a function-scoped var err error at the top, right?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even then I think you can shadow without the compiler warning about it if the new declaration happens in a nested scope, so there would be a danger of forgetting and doing an err := somewhere.

Comment thread mount/mount.go Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On error cases, won't this ultimately give a partially unwound stack of mounts anyway?
Should it just return on the first error?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured there was no harm in trying to do all of them, but yes it is likely not to do much in practice after the first failure and will certainly not unmount the full stack if one of them fails.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is still best to try.

ijc pushed a commit to ijc/linuxkit that referenced this pull request Jul 12, 2017
We expect this (or something very similar) to be merged soon, it fixes linuxkit#2131 so
moving ahead now.

The new alpine mirror is linuxkit/alpine:ce99c286eec27d0e722fc0e6ac2a3d2b1abc874e

Signed-off-by: Ian Campbell <[email protected]>
@ijc ijc force-pushed the rootfsPropagation branch from 1f93aac to 6ade6f4 Compare July 12, 2017 16:15
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 12, 2017

I made the change to use defer and a success flag.

@ijc ijc force-pushed the rootfsPropagation branch from 6ade6f4 to a8c2686 Compare July 13, 2017 08:09
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 13, 2017

Rebased onto alpha0.

@ijc ijc force-pushed the rootfsPropagation branch from 508eb0a to 3455ffc Compare July 13, 2017 09:15
ijc pushed a commit to ijc/linuxkit that referenced this pull request Jul 13, 2017
We expect this (or something very similar) to be merged soon, it fixes linuxkit#2131 so
moving ahead now.

The new alpine mirror is linuxkit/alpine:6832775a7e861ee2d7842e157688ece52d007142

Signed-off-by: Ian Campbell <[email protected]>
Copy link
Copy Markdown
Member

@estesp estesp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One minor nit: it seems idiomatic in Go to not assign zero values; so var success bool is usually more accepted than success := false

Regardless, LGTM

Any other maintainers want to weigh in? I believe @ijc has responded to all feedback

@ijc ijc force-pushed the rootfsPropagation branch from 3455ffc to b37bb5e Compare July 19, 2017 13:58
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 19, 2017

Thanks @estesp I fixed the zero val assign.

Mounting as MS_SLAVE here breaks use cases which want to use
rootPropagation=shared in order to expose mounts to the host (and other
containers binding the same subtree), mounting as e.g. MS_SHARED is pointless
in this context so just remove.

Having done this we also need to arrange to manually clean up the mounts on
delete, so do so.

Note that runc will also setup root as required by rootPropagation, defaulting
to MS_PRIVATE.

Fixes containerd#1132.

Signed-off-by: Ian Campbell <[email protected]>
Ian Campbell added 3 commits July 20, 2017 10:50
This avoids someone adding a new error path and forgetting to call the cleanup
function.

We prefer to use an explicit flag to gate the clean rather than relying on `err
!= nil` so we don't have to rely on people never accidentally shadowing the
`err` as seen by the closure.

Signed-off-by: Ian Campbell <[email protected]>
This is simpler than trying to count how many successful mounts we made.

Signed-off-by: Ian Campbell <[email protected]>
@ijc ijc force-pushed the rootfsPropagation branch from b37bb5e to d42cb88 Compare July 20, 2017 10:09
@ijc
Copy link
Copy Markdown
Contributor Author

ijc commented Jul 20, 2017

Rebased onto alpha1.

@stevvooe
Copy link
Copy Markdown
Member

LGTM

This looks okay. The only area that I'm worried about is unmount mounts that weren't mounted by the shim.

@estesp estesp merged commit a2df6d1 into containerd:master Jul 21, 2017
@ijc ijc deleted the rootfsPropagation branch July 24, 2017 08:32
arm64b added a commit to arm64b/linuxkit-arm64 that referenced this pull request Jul 26, 2017
Since 'containerd/containerd#1141' has been
merged into containerd git repo, so we don't need the below repo
'https://github.com/ijc/containerd.git' as the containerd git tree in
that pkg. Moreover, with the 'ijc' containerd git tree, we'll have
below error message when build the pkg/containerd:

fatal: reference is not a tree: d42cb88ba2b08d2ca6c8c017d629b394bf1dd08c
The command '/bin/sh -c git checkout $CONTAINERD_COMMIT' returned a
non-zero code: 128.

Except that above issue, we need to update the "github.com/Sirupsen/logrus"
pkg imported, because the latest code has change the 'Sirupsen' to 'sirupsen',
else the build process will complain that that pkg can't be found.

We also need to update the caller of NewTask(.) adapting to the latest
implementation of NewTask() to avoid the type conflict issue.

Signed-off-by: Dennis Chen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants