Skip to content

ctr: implement CDI support#6204

Closed
bart0sh wants to merge 1 commit intocontainerd:mainfrom
bart0sh:PR002-ctr-support-CDI
Closed

ctr: implement CDI support#6204
bart0sh wants to merge 1 commit intocontainerd:mainfrom
bart0sh:PR002-ctr-support-CDI

Conversation

@bart0sh
Copy link
Copy Markdown
Contributor

@bart0sh bart0sh commented Nov 4, 2021

Implement support of the Container Device Interface (CDI) in ctr.

CDI goal is to update device-related information (devices, mounts, hooks, environment variables etc) in the OCI spec from the JSON files provided by device vendors. That would allow workloads to access properly configured devices.

The plan is to have CDI support in at least containerd, cri-o and podman. It's already implemented in podman: containers/podman#10081

Here is a typical usage scenario:

  • vendor provides a spec update in the CDI JSON format, e.g:
$ cat /etc/cdi/test.json 
{
  "cdiVersion": "0.2.0",
  "kind": "vendor.com/device",
  "devices": [
    {
      "name": "myDevice",
      "containerEdits": {
        "deviceNodes": [
          {"path": "/dev/device1", "type": "b", "major": 7, "minor": 0, "fileMode": 432, "permissions": "rw", "uid": 1000, "gid": 1000}
        ]
      }
    }
  ],
  "containerEdits": {
    "env": [
      "FOO=VALID_SPEC",
      "BAR=BARVALUE1"
    ],
    "deviceNodes": [
      {"path": "/dev/device1", "type": "b", "major": 7, "minor": 0, "fileMode": 432, "permissions": "rw", "uid": 1000, "gid": 1000}
    ],
    "mounts": [
       {"hostPath": "/bin/vendorBin", "containerPath": "/bin/vendorBin", "options": ["bind"]}
    ],
    "hooks": [
       {"hookName": "startContainer", "path": "/bin/vendorHook", "args": ["/bin/vendorHook", "param1", "param2"]}
    ]
  }
}
  • OCI spec gets updated by the ctr/containerd/other runtime and passed to the OCI runtime
  • workload can access the device inside a container

@k8s-ci-robot
Copy link
Copy Markdown

Hi @bart0sh. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 4, 2021

Build succeeded.

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 4, 2021

I can't reproduce any of above CI failures locally. Can anybody point me out how to fix those?

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 4, 2021

/cc @mikebrow @kad @klihub

@k8s-ci-robot k8s-ci-robot requested a review from mikebrow November 4, 2021 13:33
@k8s-ci-robot
Copy link
Copy Markdown

@bart0sh: GitHub didn't allow me to request PR reviews from the following users: kad, klihub.

Note that only containerd members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @mikebrow @kad @klihub

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@AkihiroSuda
Copy link
Copy Markdown
Member

Please explain the goal and usage

@AkihiroSuda AkihiroSuda added the status/more-info-needed Awaiting contributor information label Nov 4, 2021
@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 5, 2021

@AkihiroSuda done, please review.

@AkihiroSuda AkihiroSuda added status/needs-discussion Needs discussion and decision from maintainers and removed status/more-info-needed Awaiting contributor information labels Nov 5, 2021
@AkihiroSuda
Copy link
Copy Markdown
Member

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 5, 2021

@AkihiroSuda We can, but it's much longer term plan, I'd say.

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 5, 2021

@AkihiroSuda More details on this.
This code https://github.com/klihub/nri/blob/5f41c0fbfab8c026f94db5e3f3dc91ce9ff509bf/v2alpha1/plugins/cdi-resolver/cdi-resolver.go#L48-L60 is not merged yet and it would take some time to get it merged.

Even if it's merged it would cover only CRI use case. Covering other types of clients (ctr, Docker, etc) would require even more work and there is no guarantee when/if it will be done.

Comment thread oci/spec_opts_linux.go Outdated
},
cli.StringSliceFlag{
Name: "cdi",
Usage: "CDI device in the vendor.com/device=<device name> format to add to the container",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: The usage string seems to imply that the accepted CDI devices names are restricted to the format mentioned. However, I don't see this being checked in the code and AFAIK the current CDI implementation does not implement such a restriction either.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be done by CDI in my opinion. I'll create an issue about it.

@klihub
Copy link
Copy Markdown
Member

klihub commented Nov 8, 2021

I can't reproduce any of above CI failures locally. Can anybody point me out how to fix those?

Isn't at least one of those failures related to cross-builds (darwin) not finding the newly added WithCDIdevices() function because it is put in spec_opts_linux.go ?

@bart0sh bart0sh force-pushed the PR002-ctr-support-CDI branch from 1f41629 to 8a77bcc Compare November 9, 2021 11:18
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 9, 2021

Build succeeded.

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Nov 9, 2021

Isn't at least one of those failures related to cross-builds (darwin) not finding the newly added WithCDIdevices() function because it is put in spec_opts_linux.go ?

Moved it to the spec_opts_unix.go, but it didn't help for some reason.

@klihub
Copy link
Copy Markdown
Member

klihub commented Nov 9, 2021

Isn't at least one of those failures related to cross-builds (darwin) not finding the newly added WithCDIdevices() function because it is put in spec_opts_linux.go ?

Moved it to the spec_opts_unix.go, but it didn't help for some reason.

Yes, but now it fails on linux because spec_opts_unix.go is tagged // +build !linux,!windows. I guess you need to have a real implementation in both spec_opts_{linux,unix}.go, and then probably have a stub that just errors out for spec_opts_windows.go.

@AkihiroSuda
Copy link
Copy Markdown
Member

@AkihiroSuda More details on this. This code https://github.com/klihub/nri/blob/5f41c0fbfab8c026f94db5e3f3dc91ce9ff509bf/v2alpha1/plugins/cdi-resolver/cdi-resolver.go#L48-L60 is not merged yet and it would take some time to get it merged.

The cdi-resolver PR seems marked as "ready for review" recently. containerd/nri#16

Even if it's merged it would cover only CRI use case. Covering other types of clients (ctr, Docker, etc) would require even more work and there is no guarantee when/if it will be done.

We can easily extend ctr (and/or nerdctl) to support NRI.
I'm not sure when Docker/Moby can support NRI, but that question applies to CDI too :)

@bart0sh bart0sh force-pushed the PR002-ctr-support-CDI branch from 8a77bcc to a504135 Compare November 11, 2021 10:46
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 11, 2021

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.

@bart0sh bart0sh force-pushed the PR002-ctr-support-CDI branch from a504135 to 5e80a68 Compare November 11, 2021 10:49
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 11, 2021

Build succeeded.

@bart0sh bart0sh force-pushed the PR002-ctr-support-CDI branch from 5e80a68 to 8864a9c Compare November 11, 2021 13:22
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 11, 2021

Build succeeded.

Signed-off-by: Ed Bartosh <[email protected]>
@bart0sh bart0sh force-pushed the PR002-ctr-support-CDI branch from 8864a9c to 9855017 Compare November 11, 2021 15:57
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Nov 11, 2021

Build succeeded.

@adrianchiris
Copy link
Copy Markdown

I didnt find any related issue in containerd to support CDI just this PR to comment so posting here.

Just wanted to add a +1 for adding support for CDI in containerd.

we have a use-case to inject RDMA [1] resources to containers to allow applications to use RDMA.
using CDI we could provide additional mounts to expose those resources (linux character devices)

[1] https://community.mellanox.com/s/article/What-is-RDMA

@kzys
Copy link
Copy Markdown
Member

kzys commented Mar 11, 2022

Both #6654 and this PR adds CDI into containerd's dependencies. We probably should review one of them first. Maybe #6654?

@bart0sh
Copy link
Copy Markdown
Contributor Author

bart0sh commented Mar 20, 2022

@kzys yes, #6654 is more important. I'm closing this one to avoid confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

needs-ok-to-test status/needs-discussion Needs discussion and decision from maintainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants