Skip to content

Use mount.Target to specify subdirectory of rootfs mount#7840

Merged
dmcgowan merged 1 commit intocontainerd:mainfrom
hinshun:feature/mount-subdirectory
Jan 31, 2023
Merged

Use mount.Target to specify subdirectory of rootfs mount#7840
dmcgowan merged 1 commit intocontainerd:mainfrom
hinshun:feature/mount-subdirectory

Conversation

@hinshun
Copy link
Copy Markdown
Contributor

@hinshun hinshun commented Dec 19, 2022

Fixes #7839


  • Add Target to mount.Mount
  • Add UnmountAllRecursive to unmount deepest mount first for a list of mount.Mount.
  • Add UnmountRecursive to unmount deepest mount first for a given target, using /proc/mountinfo.

Signed-off-by: Edgar Lee [email protected]

@k8s-ci-robot
Copy link
Copy Markdown

Hi @hinshun. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hinshun hinshun force-pushed the feature/mount-subdirectory branch from 895ddf4 to 02a8f2c Compare December 19, 2022 06:59
@dmcgowan dmcgowan added the status/needs-discussion Needs discussion and decision from maintainers label Dec 20, 2022
Copy link
Copy Markdown
Member

@dmcgowan dmcgowan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would still like to understand the snapshotter use case better for doing this. I think there are other use cases around mount that should be considered alongside this, especially the chaining mounts (previous mount target has source for later mount) and custom mounts (mounts done through plugin or something other than a mount syscall).

I think this is probably standalone enough to consider on its own, assuming we can make the Target non-ambiguous in the interfaces.

Comment thread mount/mount.go Outdated
}

// Make the deepest mount be first
sort.Slice(targets, func(i, j int) bool {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to use SliceStable here

Comment thread mount/mount.go Outdated

// Make the deepest mount be first
sort.Slice(targets, func(i, j int) bool {
return len(targets[i]) > len(targets[j])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure you could safely assume this is the unmount order, perhaps using a prefix check rather than len check.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some places which uses len check as well:
https://github.com/containerd/containerd/blob/v1.7.0-beta.1/mount/temp_unix.go#L48-L51
https://github.com/containerd/containerd/blob/v1.7.0-beta.1/pkg/cri/server/helpers_linux.go#L170-L173

Giving this a second thought... if there is an unmount order that is invalid, that would mean a parent mount was unmounted prior to a submount. By virtue of being a parent mount, its mountpoint is a strict prefix of the mountpoint of the submount, so a len check will be sufficient in this case. If it isn't a strict prefix, then the unmount order shouldn't matter.

Comment thread mount/mount_freebsd.go Outdated
// function for FreeBSD, so instead we execute mount(8) and trust it to do
// the right thing
return m.mountWithHelper(target)
return m.mountWithHelper(filepath.Join(target, m.Target))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it always be an error then to have a non-empty target in a mount list of 1? If this is the first mount, the directory won't exist.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it would always error.

Currently in my snapshotter, I prepare a lowerdir in an overlay mount with all the mountpoints already created before chaining a bind mount to a non-empty target. Do you think it should be the runtime's responsibility to ensure the existence of these mountpoints?

@hinshun
Copy link
Copy Markdown
Contributor Author

hinshun commented Dec 23, 2022

I would still like to understand the snapshotter use case better for doing this.

I am writing a remote snapshotter for nix to prepare rootfs of nix-based images directly from the local nix store rather than downloading blobs from the registry. The nix store is a readonly filesystem with no collision between installed packages. For example, all the paths necessary to execute bin/hello which outputs Hello World! are the following paths:

/nix/store/34xlpp3j3vy7ksn09zh44f1c04w77khf-libunistring-1.0
/nix/store/5mh5019jigj0k14rdnjam1xwk5avn1id-libidn2-2.3.2
/nix/store/4nlgxhb09sdr51nc9hdm8az5b08vzkgx-glibc-2.35-163
/nix/store/g2m8kfw7kpgpph05v2fxcx4d5an09hl3-hello-2.12.1

We already upload all this data to S3 in nix's archive format, but then we'll have to repackage them into layers tarballs and upload to a registry to use them in containers. Putting multiple paths in a layer greatly duplicates the data, and separating paths into individual layers run into the 125 layer limit. In the current nix ecosystem, there are layering algorithms that bucket paths into layers based on "popularity", but it is only a heuristic.

In the new nix snapshotter, we have a base overlay snapshotter that will handle compatibility with regular layers and provide a read write filesystem. A new mediatype is introduced that should be remotely unpacked, we then chain a series of bind mounts using remote snapshot annotations from the layer:

/nix/store/34xlpp3j3vy7ksn09zh44f1c04w77khf-libunistring-1.0 mount to /<rootfs>/nix/store/34xlpp3j3vy7ksn09zh44f1c04w77khf-libunistring-1.0
...

Nix store paths are not relocatable so they need to be mounted in the same path in the container. Since the closure is completely self-contained, that's all that is necessary for it to function within the container.

Since overlayfs doesn't propagate submounts from lowerdirs, this seems to be only option. With the nix snapshotter, we'll save a large amount of costs from transfer and storage duplicating our nix data in the OCI formats.

@hinshun hinshun force-pushed the feature/mount-subdirectory branch from 02a8f2c to c27a137 Compare January 17, 2023 16:10
@hinshun
Copy link
Copy Markdown
Contributor Author

hinshun commented Jan 17, 2023

@dmcgowan I've changed UnmountAll to simply unmount in reverse order of the list of mounts provided, and added more documentation in godocs and in docs/snapshotters/README.md.

@dmcgowan dmcgowan self-assigned this Jan 17, 2023
@dmcgowan dmcgowan removed the status/needs-discussion Needs discussion and decision from maintainers label Jan 17, 2023
@hinshun hinshun force-pushed the feature/mount-subdirectory branch 3 times, most recently from e093e30 to 673aaaf Compare January 17, 2023 19:44
@dmcgowan dmcgowan added this to the 1.7 milestone Jan 19, 2023
@estesp
Copy link
Copy Markdown
Member

estesp commented Jan 19, 2023

Overall this looks reasonable; one question: I see no validation of m.Target all the places it is now used; do we need to check any bounds/safety conditions (symlinks, relative dir paths, rooted paths?)

@dmcgowan
Copy link
Copy Markdown
Member

@estesp root joining the target seems like the right thing to do. If the target itself is under a directory which needs to be traversed inside the container, then it should not be traversed without root join.

@dmcgowan dmcgowan self-requested a review January 20, 2023 04:40
@dmcgowan
Copy link
Copy Markdown
Member

Try fs.RootPath

Comment thread mount/mount_unix.go
return ErrNotImplementOnUnix
// UnmountRecursive unmounts the target and all mounts underneath, starting
// with the deepest mount first.
func UnmountRecursive(target string, flags int) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like we'll need to refactor UnmountAll to moby/sys/mount if we want to use that. It doesn't do the repeat unmount till EINVAL.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know the reason for the repeat unmount? (I've seen some code where I was wondering: "was this just a big hammer to fix a problem that wasn't fully understood")?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like it came from this discussion from PR:
#1141 (comment)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for digging that up; yes, that "feels" like a "let's just use a bigger hammer" discussion, and could be worth looking into.

Main motivation for my comment was that I've seen various "subtle" issues with mount/unmount (and more notably; parsing mountinfo). Code like this can easily start spreading around and diverge, and I think it's a worthy investment to have a solid, well-tested, canonical implementation for such things (if possible).

Let me also ping @kolyshkin (who may have some thoughts as well, as he did a lot of digging into subtleties).

I agree with @dmcgowan that that should not be a blocker for this PR though; I think it's fine if we acknowledge this may need to be verified / looked into, and do that separately.

@hinshun hinshun force-pushed the feature/mount-subdirectory branch 6 times, most recently from 8973d15 to 5b6e5f5 Compare January 23, 2023 18:36
@hinshun
Copy link
Copy Markdown
Contributor Author

hinshun commented Jan 25, 2023

Is there anything else blocking this PR? I made changes to use fs.RootPath where appropriate.

@dmcgowan
Copy link
Copy Markdown
Member

I think its good to go now. If we want to address using a another package or changing the unmount behavior, that should be done separately to this PR.

}

// UnmountRecursive is not implemented on this platform
func UnmountRecursive(mount string, flags int) error {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like UnmountRecursive has a duplicate declaration for openbsd (defined both in _unix.go and _unsupported.go)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I fixed this and squashed my commits into one.

@hinshun hinshun force-pushed the feature/mount-subdirectory branch from 5b6e5f5 to 089da2e Compare January 26, 2023 14:30
Comment thread mount/mount.go
import (
"fmt"

"github.com/containerd/continuity/fs"
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll probably blame myself if I don't comment; was looking if this would be an issue to add as new dependency for this package (and if another approach would work). Looks like continuity was already a dependency (but for testing only). I didn't see a good alternative immediately though (resolving the path before calling the mount package didn't seem like an immediate option, so "probably" fine.

Copy link
Copy Markdown
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

- Add Target to mount.Mount.
- Add UnmountMounts to unmount a list of mounts in reverse order.
- Add UnmountRecursive to unmount deepest mount first for a given target, using
moby/sys/mountinfo.

Signed-off-by: Edgar Lee <[email protected]>
@hinshun hinshun force-pushed the feature/mount-subdirectory branch from 089da2e to 34d5878 Compare January 27, 2023 01:52
@hinshun
Copy link
Copy Markdown
Contributor Author

hinshun commented Jan 30, 2023

Thanks for the review. I double checked my work against the CONTRIBUTING guide and validated commit 34d5878 with my snapshotter. Anything else to do?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Allow snapshotters to define submounts under target mountpoint

5 participants