Use mount.Target to specify subdirectory of rootfs mount#7840
Use mount.Target to specify subdirectory of rootfs mount#7840dmcgowan merged 1 commit intocontainerd:mainfrom
Conversation
|
Hi @hinshun. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
895ddf4 to
02a8f2c
Compare
dmcgowan
left a comment
There was a problem hiding this comment.
I would still like to understand the snapshotter use case better for doing this. I think there are other use cases around mount that should be considered alongside this, especially the chaining mounts (previous mount target has source for later mount) and custom mounts (mounts done through plugin or something other than a mount syscall).
I think this is probably standalone enough to consider on its own, assuming we can make the Target non-ambiguous in the interfaces.
| } | ||
|
|
||
| // Make the deepest mount be first | ||
| sort.Slice(targets, func(i, j int) bool { |
There was a problem hiding this comment.
Probably best to use SliceStable here
|
|
||
| // Make the deepest mount be first | ||
| sort.Slice(targets, func(i, j int) bool { | ||
| return len(targets[i]) > len(targets[j]) |
There was a problem hiding this comment.
Not sure you could safely assume this is the unmount order, perhaps using a prefix check rather than len check.
There was a problem hiding this comment.
There are some places which uses len check as well:
https://github.com/containerd/containerd/blob/v1.7.0-beta.1/mount/temp_unix.go#L48-L51
https://github.com/containerd/containerd/blob/v1.7.0-beta.1/pkg/cri/server/helpers_linux.go#L170-L173
Giving this a second thought... if there is an unmount order that is invalid, that would mean a parent mount was unmounted prior to a submount. By virtue of being a parent mount, its mountpoint is a strict prefix of the mountpoint of the submount, so a len check will be sufficient in this case. If it isn't a strict prefix, then the unmount order shouldn't matter.
| // function for FreeBSD, so instead we execute mount(8) and trust it to do | ||
| // the right thing | ||
| return m.mountWithHelper(target) | ||
| return m.mountWithHelper(filepath.Join(target, m.Target)) |
There was a problem hiding this comment.
Would it always be an error then to have a non-empty target in a mount list of 1? If this is the first mount, the directory won't exist.
There was a problem hiding this comment.
Yes it would always error.
Currently in my snapshotter, I prepare a lowerdir in an overlay mount with all the mountpoints already created before chaining a bind mount to a non-empty target. Do you think it should be the runtime's responsibility to ensure the existence of these mountpoints?
I am writing a remote snapshotter for nix to prepare rootfs of nix-based images directly from the local nix store rather than downloading blobs from the registry. The nix store is a readonly filesystem with no collision between installed packages. For example, all the paths necessary to execute We already upload all this data to S3 in nix's archive format, but then we'll have to repackage them into layers tarballs and upload to a registry to use them in containers. Putting multiple paths in a layer greatly duplicates the data, and separating paths into individual layers run into the 125 layer limit. In the current nix ecosystem, there are layering algorithms that bucket paths into layers based on "popularity", but it is only a heuristic. In the new nix snapshotter, we have a base overlay snapshotter that will handle compatibility with regular layers and provide a read write filesystem. A new mediatype is introduced that should be remotely unpacked, we then chain a series of bind mounts using remote snapshot annotations from the layer: Nix store paths are not relocatable so they need to be mounted in the same path in the container. Since the closure is completely self-contained, that's all that is necessary for it to function within the container. Since overlayfs doesn't propagate submounts from lowerdirs, this seems to be only option. With the nix snapshotter, we'll save a large amount of costs from transfer and storage duplicating our nix data in the OCI formats. |
02a8f2c to
c27a137
Compare
|
@dmcgowan I've changed |
e093e30 to
673aaaf
Compare
|
Overall this looks reasonable; one question: I see no validation of |
|
@estesp root joining the target seems like the right thing to do. If the target itself is under a directory which needs to be traversed inside the container, then it should not be traversed without root join. |
|
Try |
| return ErrNotImplementOnUnix | ||
| // UnmountRecursive unmounts the target and all mounts underneath, starting | ||
| // with the deepest mount first. | ||
| func UnmountRecursive(target string, flags int) error { |
There was a problem hiding this comment.
Should we use https://github.com/moby/sys/blob/c161267cd2212985f81a50a029001f82c91bca7f/mount/mount_unix.go#L41-L44 for this? (and/or see if things are missing in that implementation)
There was a problem hiding this comment.
Looks like we'll need to refactor UnmountAll to moby/sys/mount if we want to use that. It doesn't do the repeat unmount till EINVAL.
There was a problem hiding this comment.
Do we know the reason for the repeat unmount? (I've seen some code where I was wondering: "was this just a big hammer to fix a problem that wasn't fully understood")?
There was a problem hiding this comment.
Looks like it came from this discussion from PR:
#1141 (comment)
There was a problem hiding this comment.
Thanks for digging that up; yes, that "feels" like a "let's just use a bigger hammer" discussion, and could be worth looking into.
Main motivation for my comment was that I've seen various "subtle" issues with mount/unmount (and more notably; parsing mountinfo). Code like this can easily start spreading around and diverge, and I think it's a worthy investment to have a solid, well-tested, canonical implementation for such things (if possible).
Let me also ping @kolyshkin (who may have some thoughts as well, as he did a lot of digging into subtleties).
I agree with @dmcgowan that that should not be a blocker for this PR though; I think it's fine if we acknowledge this may need to be verified / looked into, and do that separately.
8973d15 to
5b6e5f5
Compare
|
Is there anything else blocking this PR? I made changes to use |
|
I think its good to go now. If we want to address using a another package or changing the unmount behavior, that should be done separately to this PR. |
| } | ||
|
|
||
| // UnmountRecursive is not implemented on this platform | ||
| func UnmountRecursive(mount string, flags int) error { |
There was a problem hiding this comment.
Looks like UnmountRecursive has a duplicate declaration for openbsd (defined both in _unix.go and _unsupported.go)
There was a problem hiding this comment.
Thank you, I fixed this and squashed my commits into one.
5b6e5f5 to
089da2e
Compare
| import ( | ||
| "fmt" | ||
|
|
||
| "github.com/containerd/continuity/fs" |
There was a problem hiding this comment.
I'll probably blame myself if I don't comment; was looking if this would be an issue to add as new dependency for this package (and if another approach would work). Looks like continuity was already a dependency (but for testing only). I didn't see a good alternative immediately though (resolving the path before calling the mount package didn't seem like an immediate option, so "probably" fine.
- Add Target to mount.Mount. - Add UnmountMounts to unmount a list of mounts in reverse order. - Add UnmountRecursive to unmount deepest mount first for a given target, using moby/sys/mountinfo. Signed-off-by: Edgar Lee <[email protected]>
089da2e to
34d5878
Compare
|
Thanks for the review. I double checked my work against the CONTRIBUTING guide and validated commit 34d5878 with my snapshotter. Anything else to do? |
Fixes #7839
Signed-off-by: Edgar Lee [email protected]