network: Move OCI hooks and network namespace creation out of virtcontainers#623
network: Move OCI hooks and network namespace creation out of virtcontainers#623bergwolf merged 5 commits intokata-containers:masterfrom
Conversation
|
@egernst @WeiZhang555 @bergwolf @amshinde @caoruidong PTAL FYI, this PR does not supersede #600, but shares part of the commits. The approach here is less aggressive as we're not dropping the network coldplug, but we let the possibility for the network not to be created if we don't want to (frakti case with the factory). We need to discuss if we want eventually to move to a fully hopluggable model, but this is beyond this PR scope. |
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
|
This is going to conflict with #617, so please could folk take a look at that one :) |
cli/create.go
Outdated
| } | ||
|
|
||
| // Execute prestart OCI hooks | ||
| if err := enterNetNS(sandboxConfig.NetworkConfig.NetNSPath, func() error { |
There was a problem hiding this comment.
This is a bit of a mind-bender :) I think it would be clearer if you assigned to err and then checked err on a separate line.
| "github.com/kata-containers/runtime/virtcontainers/device/api" | ||
| ) | ||
|
|
||
| type noopResourceStorage struct{} |
There was a problem hiding this comment.
Nice. Since you've added this though, how about splitting filesystem.go into:
resource_storage.go(type resourceStorage interfaceonly).resource_storage_filesystem.go
And renaming filesystem_test.go to resource_storage_filesystem_test.go ?
There was a problem hiding this comment.
Yes, this makes sense, let me take care of this.
There was a problem hiding this comment.
addressed in other PR (already merged)
cli/hook.go
Outdated
| return nil | ||
| } | ||
|
|
||
| func postStartHooks(spec oci.CompatOCISpec, cid, bundlePath string) error { |
There was a problem hiding this comment.
This function and postStopHooks() are almost identical to preStartHooks(), so how about having them all call a generic function. Something like:
func runHooks(cid, bundlePath string, hooks []Hook, name string) error { }
func preStartHooks(spec oci.CompatOCISpec, cid, bundlePath string) error {
if spec.Hooks == nil {
return nil
}
return runHooks(cid, bundlePath, spec.Hooks.Prestart, "pre-start")
}
cli/start.go
Outdated
|
|
||
| if err := enterNetNS(s.GetNetNs(), func() error { | ||
| return postStartHooks(ociSpec, sandboxID, status.Annotations[vcAnnot.BundlePathKey]) | ||
| }); err != nil { |
There was a problem hiding this comment.
Same comment as above - this is compact, but rather difficult to understand ;)
virtcontainers/default_network.go
Outdated
| type defNetwork struct { | ||
| } | ||
|
|
||
| func defNetworkLogger() *logrus.Entry { |
There was a problem hiding this comment.
Why not?:
func (n *defNetwork) logger() *logrus.Entry
| return endpoint, nil | ||
| } | ||
|
|
||
| func generateInterfacesAndRoutes(networkNS NetworkNamespace) ([]*grpc.Interface, []*grpc.Route, error) { |
There was a problem hiding this comment.
I'm pretty surprised the linters don't complain about the length/complexity of this function ;)
Codecov Report
@@ Coverage Diff @@
## master #623 +/- ##
=========================================
Coverage ? 64.92%
=========================================
Files ? 85
Lines ? 9589
Branches ? 0
=========================================
Hits ? 6226
Misses ? 2717
Partials ? 646 |
|
PSS Measurement: Memory inside container: |
|
Build failed (third-party-check pipeline) integration testing with
|
| // CNMNetworkModel is the CNM network. | ||
| CNMNetworkModel NetworkModel = "CNM" | ||
| // DefaultNetworkModel is the default network. | ||
| DefaultNetworkModel NetworkModel = "default" |
There was a problem hiding this comment.
Having a defaultNetworkModel and just a Noop model seems a bit awkward now? Really we're either having a network or not at this point, from what I can see? Maybe just a better name than 'default?'
There was a problem hiding this comment.
Yeah tell me which name I could use, but I agree "default" is definitely not great...
A bit more background here, initially I wanted to remove the interface since it was not really needed as we now have only one real implementation, but I realized that our unit testing would be way more complicated/impossible? if we decide to remove the noop implementation.
There was a problem hiding this comment.
@egernst I think naming the network to Default is a bit awkward, lets rework this in a separate PR if required.
|
This is a big PR - nice work. I feel like some of the commits would be better off in a seperate PR? In particular: And probably: |
|
Most of changes are code movement, generally good |
bergwolf
left a comment
There was a problem hiding this comment.
While I see this is mostly moving codes from virtcontainers to cli, there are a couple of issues that we should address IMHO.
cli/create.go
Outdated
|
|
||
| // Execute prestart OCI hooks | ||
| if err := enterNetNS(sandboxConfig.NetworkConfig.NetNSPath, func() error { | ||
| return preStartHooks(ociSpec, containerID, bundlePath) |
There was a problem hiding this comment.
This assumes that only the very first container of a sandbox has prestart hooks. However, OCI hooks are part of the OCI spec that every container can include. So IMO we should call these hooks for every container instead of just on the first container of a sandbox.
There was a problem hiding this comment.
This makes sense, even if I don't expect any hook for containers following the sandbox container.
cli/delete.go
Outdated
|
|
||
| switch containerType { | ||
| case vc.PodSandbox: | ||
| if err := postStopHooks(ociSpec, sandboxID, status.Annotations[vcAnnot.BundlePathKey]); err != nil { |
There was a problem hiding this comment.
Same here. It only runs the post stop hooks of the last container of a sandbox.
Also post stop hooks are supposed to run after a container is stopped. However, here we might have called them before the container is stopped.
There was a problem hiding this comment.
Oh yes good point, I have to move this slightly further in the code.
|
|
||
| var stdout, stderr bytes.Buffer | ||
| cmd := &exec.Cmd{ | ||
| Path: hook.Path, |
There was a problem hiding this comment.
I think we must validate the hook commands (e.g., with a configurable command binary whitelist) before really executing any of them. For example, imagine a public container cloud provider that runs kata containers inside, and then here comes a user who creates a container with rm -rf / in the OCI spec hook. BOOM!
It looks like runc does not handle it either. But kata is supposed to be more secure than runc. :)
There was a problem hiding this comment.
I am trying not to introduce too much new code, simply doing a clean move from virtcontainers to the CLI. I think your proposal should go through a follow up PR.
| return endpoint, nil | ||
| } | ||
|
|
||
| func generateInterfacesAndRoutes(networkNS NetworkNamespace) ([]*grpc.Interface, []*grpc.Route, error) { |
There was a problem hiding this comment.
This code is very Kata agent specific since it deals with grpc and not applicable for other agents. So I think it should be it in kata agent.
There was a problem hiding this comment.
I understand, but if you look at the Kata API, we're using both grpc.Interface and grpc.Route structures as the generic structures to add interfaces or routes.
I think moving this function out of kata_agent.go is the first step, and the second logical step would be to move those very generic structure definitions out of kata-containers/agent/protocols/grpc. It'd might be better to have them in a separate package so that they don't get associated with the kata-agent.
|
|
||
| var stdout, stderr bytes.Buffer | ||
| cmd := &exec.Cmd{ | ||
| Path: hook.Path, |
|
Rebased! I'll address the comments now. |
Since we removed the CNI implementation and that we agreed the network should only be handled in a single way from virtcontainers, this patch logically replace the "CNM" naming with "Default". Signed-off-by: Sebastien Boeuf <[email protected]>
This commit moves the network namespace creation out of virtcontainers in order to anticipate the move of the OCI hooks to the CLI through a follow up commit. Signed-off-by: Sebastien Boeuf <[email protected]>
As we want to call the OCI hook from the CLI, we need a way for the CLI to figure out what is the network namespace used by the sandbox. This is needed particularly because virtcontainers creates the netns if none was provided. Signed-off-by: Sebastien Boeuf <[email protected]>
03d731a to
39cfbba
Compare
|
@jodh-intel all your comments addressed. The rebase on top of the opentracing was trickier than I thought it'd be :) |
The CLI being the implementation of the OCI specification, and the hooks being OCI specific, it makes sense to move the handling of any OCI hooks to the CLI level. This changes allows the Kata API to become OCI agnostic. Fixes kata-containers#599 Signed-off-by: Sebastien Boeuf <[email protected]>
If the sandbox has been initialized with a factory, this means the caller should be in charge of adding any network to the VM, and virtcontainers library cannot make any assumptions about adding the default underlying network. Signed-off-by: Sebastien Boeuf <[email protected]>
|
@bergwolf I have addressed your comments, PTAL |
|
Build failed (third-party-check pipeline) integration testing with
|
|
PSS Measurement: Memory inside container: |
bergwolf
left a comment
There was a problem hiding this comment.
Looks good overall. Just have one comment about the place to call prestart hooks.
|
|
||
| // Run pre-start OCI hooks. | ||
| err = enterNetNS(sandboxConfig.NetworkConfig.NetNSPath, func() error { | ||
| return preStartHooks(ctx, ociSpec, containerID, bundlePath) |
There was a problem hiding this comment.
While it can work, I wonder why we are not calling it in start.go? After all, they are preStart hooks not preCreate hooks.
There was a problem hiding this comment.
@bergwolf This is what the OCI spec calls them. We have simply chosen to call it in create as we need the network hook to be called. @sboeuf and I had a discussion about moving to a hotplug model, but lets address that in a separate PR.
I also believe we need to evaluate if there are certain hooks that need to be run inside the VM.
There was a problem hiding this comment.
I think it's worth pointing out that both the current code, and this PR are somewhat abusing the standard:
|
@egernst Can you take a look at this? I would like to get this merged if it looks good to you. |
|
change is fine with me, I'll want to get @bergwolf 's ACK. I'd prefer a less awkward naming scheme (defaultNetwork when we only have one network is a bit odd), but don't want to delay this too much longer when I'm not providing a better alternative... |
|
|
||
| // Run pre-start OCI hooks. | ||
| err = enterNetNS(sandboxConfig.NetworkConfig.NetNSPath, func() error { | ||
| return preStartHooks(ctx, ociSpec, containerID, bundlePath) |
|
Well, it looks like |
|
Thanks for merging this ;) |
Routes with proto kernel are automatically added by the kernel when an interface is added with ip assigned as a subnet address. With github.com/kata-containers#1936, these routes will no longer be sent from the runtime. Depends-on: github.com/kata-containers#1936 Fixes: kata-containers#623 Signed-off-by: Archana Shinde <[email protected]>
Before we decide to move to a fully hotpluggable network model, we want to go step by step by making sure we can still support the network being coldplugged in case of
kata-runtime, but also letting the possibility to consumer of the API (such asfrakti) to hotplug the network interfaces, without having virtcontainers doing some prework there.This PR introduces a few rework and fixes, but the most important part is about the creation of the network namespace and the execution of the OCI hooks being handled directly from the CLI. This allows the Kata API to be OCI agnostic.
Fixes #599