What is the problem you're trying to solve
We'd like to support launching hypervisor-isolated Windows containers through the CRI entry point to light up this scenario for K8s. Support for launching Hyper-V containers is already present in containerd itself via the WithWindowsHyperV client option, as well as the ctr testing tool's --isolation flag; however, nothing in the CRI plugin makes use of this functionality at the moment.
Describe the solution you'd like
There are a few spots that would need to change to add "full" support, but for the 1.7 timeframe the minimal amount of work needed to launch and manage these containers is not a great deal.
Initial Support (1.7 timeframe)
Filling in the HyperV runtime spec field
The Windows containerd shim exposes a SandboxIsolation enum that can be used to tell the shim what kind of container/pod to launch. This field, in combination with new runtime class definitions in containerd, is how we can differentiate between process and hypervisor isolation for Windows. Below is an example pod spec and a runtime class definition in containerd's config file:
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wcow-test
spec:
  replicas: 2
  selector:
    matchLabels:
      app: wcow
  template:
    metadata:
      labels:
        app: wcow
    spec:
      runtimeClassName: runhcs-wcow-hypervisor # <----------------
      containers:
      - name: servercore
        image: mcr.microsoft.com/windows/servercore:1809
        ports:
        - containerPort: 80
          protocol: TCP
```
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runhcs-wcow-hypervisor]
    base_runtime_spec = ""
    cni_conf_dir = ""
    cni_max_conf_num = 0
    container_annotations = []
    pod_annotations = []
    privileged_without_host_devices = false
    runtime_engine = ""
    runtime_path = ""
    runtime_root = ""
    runtime_type = "io.containerd.runhcs.v1"
    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runhcs-wcow-hypervisor.options]
      Debug = true
      DebugType = 2
      SandboxImage = "mcr.microsoft.com/windows/servercore:1809"
      SandboxPlatform = "windows/amd64"
      SandboxIsolation = 1 # <-------------------
```
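For the pod spec's runtimeClassName to resolve, Kubernetes also needs a RuntimeClass object whose handler matches the runtime name in the containerd config. A minimal sketch (the metadata.name is arbitrary; the handler must match the key in the runtimes table):

```yaml
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: runhcs-wcow-hypervisor
handler: runhcs-wcow-hypervisor
```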
We could also expand the default CRI config that containerd uses for Windows when one is not supplied in the config file. We would have to keep updating this to include new runtimes any time a new OS release/container image pair is made available.
```go
// DefaultConfig returns default configurations of CRI plugin.
func DefaultConfig() PluginConfig {
	//
	// New additions
	//
	ws2019Opts := options.Options{
		SandboxImage:     "mcr.microsoft.com/windows/nanoserver:1809",
		SandboxPlatform:  "windows/amd64",
		SandboxIsolation: options.Options_HYPERVISOR,
	}
	ws2022Opts := options.Options{
		SandboxImage:     "mcr.microsoft.com/windows/nanoserver:ltsc2022",
		SandboxPlatform:  "windows/amd64",
		SandboxIsolation: options.Options_HYPERVISOR,
	}
	//
	// End of new additions
	//
	return PluginConfig{
		CniConfig: CniConfig{
			NetworkPluginBinDir:       filepath.Join(os.Getenv("ProgramFiles"), "containerd", "cni", "bin"),
			NetworkPluginConfDir:      filepath.Join(os.Getenv("ProgramFiles"), "containerd", "cni", "conf"),
			NetworkPluginMaxConfNum:   1,
			NetworkPluginConfTemplate: "",
		},
		ContainerdConfig: ContainerdConfig{
			Snapshotter:        containerd.DefaultSnapshotter,
			DefaultRuntimeName: "runhcs-wcow-process",
			NoPivot:            false,
			Runtimes: map[string]Runtime{
				"runhcs-wcow-process": {
					Type:                 "io.containerd.runhcs.v1",
					ContainerAnnotations: []string{"io.microsoft.container.*"},
				},
				//
				// New additions
				//
				"runhcs-wcow-hypervisor-1809": {
					Type:                 "io.containerd.runhcs.v1",
					PodAnnotations:       []string{"io.microsoft.virtualmachine.*"},
					ContainerAnnotations: []string{"io.microsoft.container.*"},
					Options:              ws2019Opts,
				},
				"runhcs-wcow-hypervisor-17763": {
					Type:                 "io.containerd.runhcs.v1",
					PodAnnotations:       []string{"io.microsoft.virtualmachine.*"},
					ContainerAnnotations: []string{"io.microsoft.container.*"},
					Options:              ws2019Opts,
				},
				"runhcs-wcow-hypervisor-20348": {
					Type:                 "io.containerd.runhcs.v1",
					PodAnnotations:       []string{"io.microsoft.virtualmachine.*"},
					ContainerAnnotations: []string{"io.microsoft.container.*"},
					Options:              ws2022Opts,
				},
				"runhcs-wcow-hypervisor-21H2": {
					Type:                 "io.containerd.runhcs.v1",
					PodAnnotations:       []string{"io.microsoft.virtualmachine.*"},
					ContainerAnnotations: []string{"io.microsoft.container.*"},
					Options:              ws2022Opts,
				},
				//
				// End of new additions
				//
			},
		},
		// … Omitted other fields …
	}
}
```
Resource Limits For the VM
One way the Windows shim supports setting resource limits (memory, vCPU count) for the lightweight VM is via annotations. The virtual machine based annotations all begin with io.microsoft.virtualmachine.*, which ties into the section above: the PodAnnotations and ContainerAnnotations fields shown there are what allow these annotations through.
An example pod spec asking for the VM hosting the containers in the pod to boot with 4GB of memory and 4 vCPUs is below:
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: wcow-test
  labels:
    app: wcow
  annotations:
    io.microsoft.virtualmachine.computetopology.memory.sizeinmb: "4096"
    io.microsoft.virtualmachine.computetopology.processor.count: "4"
spec:
  runtimeClassName: runhcs-wcow-hypervisor # <----------------
  containers:
  - name: servercore
    image: mcr.microsoft.com/windows/servercore:1809
    ports:
    - containerPort: 80
      protocol: TCP
```
Another way resource limits could be set would be the vm_processor_count and vm_memory_size_in_mb fields present in the Windows shim specific options, although those values would be fixed for the duration of a deployment unless containerd was restarted or the value was overridden by specifying an annotation.
This could be extended further by having the runtime class specify the resource limits in the name. For example, runhcs-wcow-hypervisor-20348-1vp2gb:
```toml
[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runhcs-wcow-hypervisor-20348-1vp2gb.options]
  Debug = true
  DebugType = 2
  SandboxPlatform = "windows/amd64"
  SandboxIsolation = 1
  VmProcessorCount = 1
  VmMemorySizeInMb = 2048
```
Testing
This is tricky, as GitHub Actions runners don't support nested virtualization. We'll likely need to do something similar to the approach the Windows periodic tests use and allocate Azure VMs to do our bidding (https://github.com/containerd/containerd/blob/main/.github/workflows/windows-periodic.yml). This might be the most work.
"Full Support"
Pulling images that don't match the host's build
One of the pros of Hyper-V containers is that you're not constrained to the Windows host's build number for image choice (a ws2019 host no longer has to use only a 1809/ws2019 image). However, the Windows platform matching code is finicky and tough to get right, and the main selling point for these containers is really security. I'd be alright punting on the platform package changes until we know the right approach, and just getting in the work to be able to launch these containers in general.
Resource Limits Looking Forward
There are platform limitations to supporting vCPU hot-add, but ideally K8s would tally up the total resource limits by adding up the container resource limits in the pod and send them in some field for Windows. If that comes to fruition, we'll need to do something with this data. Writing this down mainly for future reference.
Additional context
Thanks for reading the wall of text :)
Tracking
1.7
Future