-
Notifications
You must be signed in to change notification settings - Fork 3.8k
[Reverted in #8838] Add support for bind-mounts on Darwin (a.k.a. "make native snapshotter work") #8789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi @slonopotamus. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
7582beb to
329fb10
Compare
| ) | ||
|
|
||
| // fuseSuperMagic is defined in statfs(2) | ||
| const fuseSuperMagic = 0x65735546 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code can possibly be enabled on FreeBSD, but it needs testing, so I'd prefer to leave it Linux-specific as it was for now.
|
Some small things to start, but I need to give the actual approach some thought later today. cc @mxpv for another macOS user 😆 |
329fb10 to
ee83648
Compare
Well, my whole idea of this PR is that we just do what other platforms do - we bind-mount (although in a less traditional way, via FUSE bindfs). |
f4ad8a1 to
3a6f893
Compare
5d80864 to
4f4c2ce
Compare
|
I believe that Linux sandbox CI failure is unrelated to this PR. |
|
Sorry for the delay here. My main concern is everything kind of feels.. hacky? for our options on Darwin which is just due to the state of things: there's no kernel support for any of the usual niceties which I'm sure you know at this point 😪. Though, I see the value in "deploy/manage thing like a container even if all the container bits aren't fully there" as there's HostProcess containers on Windows. It feels off that we need to have some third party fuse kext (or the usermode version when/if that issue is fixed) to get this to work, but I'm not completely opposed to giving this a shot as we can experiment while we work towards 2.0. My minds wandering on if there's some way to achieve what we want here just as a remote snapshotter or similar 🤔. I do think if we did get this in we'll have to add some docs stating for Darwin if you want to use the native snapshotter what dependencies you'll now need |
|
There was some discussion about mount plugins at #7055 (comment), and #8347 probably falls into the same area-of-interest. |
I'd rephase this "everything feels an MVP". I do not claim that what I did here is the macOS snapshotter. It is just 35 lines of code that give us a snapshotter and allow to move on with experimenting on runtime. As far as I understand, this is the whole idea of I also want to note that attempts to bring in a more full-blown solution are already stuck for two years. Also, I have to plans to merge rund into containerd. What I'm trying to patch on containerd side is bare minimum that would make things work.
Btw, all those were fixed. |
|
Hacky wasn't a shot at this work, more at the current state of what we have to work with on Mac. It's hard to make a call on any of the solutions people have been trialing given that, but this seems the most container-like. Like I said I think now is a decent time to experiment given the next release is up for some breaking changes, and if we find something new and shiny we can pivot or if not and it doesn't go as planned we can go back to the drawing board.
Sweet! |
There's another path of Hypervisor framework runtime that would produce containers kinda similar to Hyper-V isolation Windows. By no means I'm trying to close that door.
I'm no tying my work to any of containerd releases. rund is deliberately an out-of-tree shim and it may never become of sufficient quality that would be worth PRing into containerd. I also work on it in my spare time, so no specific calendar milestones and work often gets interrupted by higher-priority tasks. So, I say it again: I treat this PR as a smallest amount of changes to containerd that produces a functional snapshotter on macOS (currently there are none). |
Yea I know this is focused on Mac native containers and not linux on darwin, although I think there's obviously merit for that
By experiment for the next release I mean anything added to containerd (e.g. this change or any future that you may end up needing to trial MacOS bits) not the shim you're working on |
Nono, you can also run macOS inside Hypervisor. See Tart for example. For some reason they decided to make a separate command-line tool instead shaping their tool as a shim, but at least we can use Tart to see what is possible using Hypervisor in principle. |
|
Also I'm planning to block off some time to try rund next week 😆 |
It's a bit too early :D It's missing a process subreaper yet, so whatever involving more that one process leaks stuff and tends to hang during shutdown. |
Oop, I shouldn't have assumed which platform. My mind always goes to Linux for virt |
|
The idea (in my head albeit right now) would be to have a plugin that you can optionally compile in that would handle these. We'd have a standardized cmdline interface for the binaries (must have both mount and unmount commands, support {insert flags} etc.) and you'd communicate with it via a new grpc API probably. So instead of mount.MountAll, you'd call client.MountAll or something similar. If the plugin doesn't have a mapping for that mount type it'll just use the builtin versions (e.g just call mount.Mount internally). |
|
@dcantah Feel free to ping me when there's something to play with. I'll try to use that stuff to my macOS needs. For now... Looks like |
Revert "Add support for bind-mounts on Darwin (a.k.a. "make native snapshotter work")" (#8789)
|
As I understand, #5935 should be closed with the same reasoning. |
#8789 (comment) The PR was merged too quickly without establishing consensus, so we wanted to revert this until we can find a proper interface to shell out a third party mount helper without depending on it by default. |
|
@AkihiroSuda thanks, makes sense. |
You're missing the point. Yes, current containerd code allows snapshotter to download and unpack images. No, current containerd code is not allowing to Just Write a runtime because certain codepaths inside containerd itself expect to be able to mount/unmount. |
|
@dcantah Any updates on the mount plugin idea? It would be great to see a common solution to this. |
|
@bergwolf I've been working on it in my spare time since about a week and a half ago. Curious, do you have a use case? The only one I could think of so far is this Darwin case so you could "overload" what a bind mount does and proxy it to an external binary. |
|
If there's appetite for this outside of that case I can put some more priority into the work |
|
What about this?
Originally posted by @AkihiroSuda in moby/buildkit#4059 (comment) |
|
@dcantah Yes, we do have a use case. I'm working on Nydus and right now we use a binary mount helper to drop certain options returned by the snapshotter. We've opened #6746 to extened the Mount struct but clearly that is not an acceptable solution. So if an external mount plugin is preferred, we'd appreciate to support it so that we can drop the binary mount helper. |
|
@AkihiroSuda Still, we'd have to maintain and ship an external binary. Is it possible to optionally register a GRPC endpoint (as @dcantah proposed) instead? |
Awesome okay, I'll prioritize drafting up my initial ideas into an issue then and then let me know if whatever the model is would work for your use case! |
I guess we need both, for those who consume containerd as a library. |
Thanks for prioritizing this work:)
|
|
@jiangliu My idea so far is your mount helper could do whatever it wants. You'd receive all of the args and options that the mount syscall would receive and you're free to interpret or not honor any of them, much in the same way your helper does right now with ignoring the extra data. We'd modify the bits in containerd that call mount (runc shim for example) to call into the mount plugin instead. If you configured a mount type to point to your binary in containerds config file mount plugin section, then containerd upon receiving the request will invoke your binary and pass over everything it needs to perform the mount (this is undecided so far, e.g. just pass args on the cmdline or have the binary read a json blob from stdin, either might work). We'd have a section in containerds config where you could setup mappings from mount type to binary: where anything that doesn't have a registered "override" would just have containerd call |
| // Mount to the provided target. | ||
| func (m *Mount) mount(target string) error { | ||
| // See https://github.com/slonopotamus/containerd-darwin-mount-helper for reference implementation | ||
| const commandName = "containerd-darwin-mount-helper" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering if we can reach consensus on reviving this PR if this is just an environment variable like CONTAINERD_DARWIN_MOUNT_HELPER or CONTAINERD_DARWIN_MOUNT_HELPER_EXPERIMENTAL that takes the path of the binary that is compatible with https://github.com/slonopotamus/containerd-darwin-mount-helper .
cc @containerd/committers
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CONTAINERD_DARWIN_MOUNT_HELPER_EXPERIMENTAL sounds fairly reasonable to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not a big fan of the generic helper interface which would be Darwin specific and not a well defined plugin interface. The idea around a large mount plugin interface doesn't really need to apply here either though, as the idea behind that interface before was always to go from custom mount types to system mount types. Even though "bindfs" is not a default util, I think we could easily just call out to "bindfs" and "hdiutil" in the Darwin mount code to cover a large range of the normal "system" mounts that would be expected there. Since "bindfs" is not a default, we could make that swappable or disabled via environment variables, but still utilizing the bindfs args. For apfs or other block type mounts, hdiutil can be relied on without needing to have custom binaries which just wrap hdiutil.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@slonopotamus @mxpv WDYT? ↑
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoops, I somehow missed this.
- Both containerd and buildkit expect across numerous places in codebase that they can just
umount()things. bindfs working through either macFUSE or fuse-t supports that. - Containerd expects that bind-mounts exist and work. Even
nativesnapshotter depends on bind mounts: https://github.com/containerd/containerd/blob/v2.0.0-rc.4/plugins/snapshots/native/native_default.go#L21, that's why darwin-containers@416899f hack was applied (it can be reverted if we have proper bind-mounts). - I agree with @dmcgowan that introducing a plugin API is an overkill. And it is very likely to be a weird plugin API: do we expect that a single helper is supposed to support all kinds of mounts? If there are multiple helpers, codebase needs to track which helper was used for mounting to use the same helper for unmount. This even can be across processes (containerd cleans up things in case shim dies).
- Yes, binds is kind of a hack. Nobody says currently this is a production-scale code. If there emerges a better solution in the future, I will raise both of my hands for using it.
- We're talking about a diff with negative line delta :)
So, I'd like to split concerns.
First one is the need of bind-mounts that is not going to go anywhere. If you have other solutions besides bindfs, let's discuss them. if there are none, I don't see what's the benefit of adding wrapper layer. And API that is architected based on a single usecase usually fails to support the second when it emerges.
Second one is ineffectiveness of native snapshotter. This is a separate issue that can be solved independently (either via sparse-bundle approach suggested in #5935 or some FUSE solution or new FSKit API). Even if native snapshotter is slow, it is functionally correct and allows experimenting. My changes do not aim to solve this issue, I am only concentrated on bind-mounts that will be needed regardless of what snapshotter is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AkihiroSuda Let's suppose someone created a bindfs-like FOSS filesystem based on FSKit. Would dependency on such thing be considered acceptable for containerd as a bind mounts implementation? According to FSKit documentation, it would be used via mount -t MyAwesomeBind /src /dst. Plain unmount would also work for this fs. Given that Darwin doesn't have builtin bind fs type, such implementation could even just use that name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe it should be rather a built-in feature of containerd
Well, the issue is that FSKit API is currently only available for Swift and Objective-C. Quick Google search suggests it is possible to call Objective-C via cgo, though that might be not a pleasant experience.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just call out to hdiutil for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we just call out to
hdiutilfor now?
No, because we don't know what args to pass to hdiutil yet (that imaginary FSKit-based bind-mount filesystem is nonexistent currently).
…implementation After these changes, in order to add Darwin bind-mount implementation, one only needs: * Adjust HasBindMounts definition in mount.go * Provide implementation in mount_darwin.go As a bonus, Linux FUSE-related code was moved to a separate file and possibly could be reused on FreeBSD, though this needs testing. There was no consensus on adding dependency on bindfs, that seems to be the only working solution for bind-mounts on Darwin as of today, in containerd#8789, that's why the actual implementation is not added in current PR.
…implementation After these changes, in order to add Darwin bind-mount implementation, one only needs: * Adjust HasBindMounts definition in mount.go * Provide implementation in mount_darwin.go As a bonus, Linux FUSE-related code was moved to a separate file and possibly could be reused on FreeBSD, though this needs testing. There was no consensus on adding dependency on bindfs, that seems to be the only working solution for bind-mounts on Darwin as of today, in containerd#8789, that's why the actual implementation is not added in current PR. Signed-off-by: Marat Radchenko <[email protected]>
…implementation After these changes, in order to add Darwin bind-mount implementation, one only needs: * Adjust HasBindMounts definition in mount.go * Provide implementation in mount_darwin.go There was no consensus on adding dependency on bindfs, that seems to be the only working solution for bind-mounts on Darwin as of today, in containerd#8789, that's why the actual implementation is not added in current PR. As a bonus, Linux FUSE-related code was moved to a separate file and possibly could be reused on FreeBSD, though this needs testing. Signed-off-by: Marat Radchenko <[email protected]>
…implementation After these changes, in order to add Darwin bind-mount implementation, one only needs: * Adjust HasBindMounts definition in mount.go * Provide implementation in mount_darwin.go There was no consensus on adding dependency on bindfs, that seems to be the only working solution for bind-mounts on Darwin as of today, in containerd#8789, that's why the actual implementation is not added in current PR. As a bonus, Linux FUSE-related code was moved to a separate file and possibly could be reused on FreeBSD, though this needs testing. Signed-off-by: Marat Radchenko <[email protected]>
This is somewhat an alternative to #5935. I went a completely different route and leveraged FUSE bindfs that provides an equivalent of
mount --bind.This PR mostly consists of tossing functions around, so they better align with platforms. macOS-specific stuff happens in
mount/mount_darwin.go.As you can see there, I'm somewhat struggling with Go process API. If there are easier ways to do what I am trying to achieve (see comments), I'm all ears and really willing to simplify the code.This code is currently being used in rund - a very experimental and WIP containerd shim that allows running macOS inside "containers", lightly isolated from host machine via
chrootand (in future)sandbox-exec.Results of containerd tests:
UPD: actually, with this PR, macOS passes the whole
sudo make test test-root