wsl: report a single "all" device to kubelet #1671
Force-pushed from b55dfe1 to 43b3086
In order to prepare for the WSL changes, we remove the tegra resource manager and pull the basic function implementations into the base type. This means that the base type is essentially a resource manager that does not support health checking and always uses distributed allocation.

Signed-off-by: Evan Lezar <[email protected]>
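The refactor described in this commit can be sketched in Go using struct embedding: the base type carries the defaults (no health checking, distributed allocation), and specialised managers embed it and override only what they need. Type and method names below are illustrative assumptions, not the plugin's actual API.

```go
package main

import "fmt"

// baseResourceManager is a hypothetical sketch of the base type described
// above: a resource manager that does not support health checking and
// always uses distributed allocation.
type baseResourceManager struct {
	resource string
}

// CheckHealth is a no-op: the base type does not support health checking.
func (m *baseResourceManager) CheckHealth() error { return nil }

// AllocationPolicy always reports distributed allocation in the base type.
func (m *baseResourceManager) AllocationPolicy() string { return "distributed" }

// nvmlResourceManager embeds the base type, inheriting its defaults; a
// concrete manager would override methods where its behaviour differs.
type nvmlResourceManager struct {
	baseResourceManager
}

func main() {
	m := nvmlResourceManager{baseResourceManager{resource: "nvidia.com/gpu"}}
	fmt.Println(m.AllocationPolicy())
}
```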
On WSL, all GPUs are accessed through /dev/dxg. Replace the per-GPU wslDevice (which reported one device per physical GPU with individual UUIDs) with a stateless wslAllGPUsDevice that always returns UUID "all" and path "/dev/dxg". This causes the device map to collapse to a single entry per resource, so kubelet sees exactly one GPU device on WSL. When allocated, this flows naturally through all strategy paths (envvar, CDI, volume mounts) to set NVIDIA_VISIBLE_DEVICES=all, which is what nvidia-container-runtime on WSL expects.

Co-Authored-By: Claude Sonnet 4.6 <[email protected]>
Signed-off-by: Evan Lezar <[email protected]>
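The stateless device described above can be sketched as follows. The types and method names (`wslAllGPUsDevice`, `GetUUID`, `GetPaths`) are assumptions for illustration, not the plugin's exact code; the point is that every accessor returns the same constants, so the device map collapses to one entry.

```go
package main

import "fmt"

// Device is a simplified stand-in for a plugin device entry.
type Device struct {
	UUID  string
	Paths []string
}

// wslAllGPUsDevice is a hypothetical sketch of the stateless device: on
// WSL every GPU is reached through the single /dev/dxg node, so per-GPU
// identity is meaningless and the UUID is always the literal "all".
type wslAllGPUsDevice struct{}

func (d wslAllGPUsDevice) GetUUID() string    { return "all" }
func (d wslAllGPUsDevice) GetPaths() []string { return []string{"/dev/dxg"} }

func main() {
	d := wslAllGPUsDevice{}
	// Building the device map from this device yields exactly one entry,
	// so kubelet sees a single GPU device on WSL.
	devices := map[string]Device{
		d.GetUUID(): {UUID: d.GetUUID(), Paths: d.GetPaths()},
	}
	fmt.Println(len(devices), devices["all"].Paths[0])
}
```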
Force-pushed from 43b3086 to 1bb3658
Use ghcr.io/nvidia/k8s-device-plugin:1bb36583 which includes upstream fixes for WSL2 CDI spec compatibility (cdiVersion and device naming), removing the need for any local spec transformation. See NVIDIA/k8s-device-plugin#1671. TODO: revert to chart-default image once a released version includes these fixes.

Signed-off-by: Evan Lezar <[email protected]>
rahulait left a comment:
Overall, LGTM. I don't have much experience with this, so would like someone else to approve as well.
Does this mean that a multi-GPU WSL2 node will only report having one GPU device?
Yes, that's what this means. Note that the driver does not (or at least did not) support device-level isolation on WSL, meaning that even if there were multiple devices, any container would have access to all of them.
I was able to confirm that this works: the node includes the expected CDI spec, and I am able to run a GPU pod.
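The spec itself is not reproduced in the comment above. For illustration, a CDI spec generated by the NVIDIA Container Toolkit on WSL looks roughly like the following; the exact cdiVersion and container edits (library mounts for dxcore, etc.) vary by toolkit version, so treat the values here as assumptions:

```yaml
cdiVersion: "0.5.0"
kind: nvidia.com/gpu
devices:
  - name: all
    containerEdits:
      deviceNodes:
        - path: /dev/dxg
```

The key property is the single device named `all`, which is what the plugin's reported device ID must match.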
Use ghcr.io/nvidia/k8s-device-plugin:93042e1f which includes upstream fixes for WSL2 CDI spec compatibility (cdiVersion and device naming), removing the need for any local spec transformation. See NVIDIA/k8s-device-plugin#1671. TODO: revert to chart-default image once a released version includes these fixes. Signed-off-by: Evan Lezar <[email protected]>
/cherry-pick release-0.19
🤖 Backport PR created for release-0.19
v0.19.1 includes WSL2 CDI spec compatibility fixes. See NVIDIA/k8s-device-plugin#1671.

Signed-off-by: Evan Lezar <[email protected]>
On WSL, there is no isolation across different GPUs on a system. This is because they are all accessed through the same `/dev/dxg` device. This is reflected in the CDI spec generated by the NVIDIA Container Toolkit, which always generates a single `all` device. This is incompatible with the device plugin when using a CDI-based device list strategy, since the device name reported by the plugin will include the device UUID or index.
The change in this PR ensures that the device plugin always reports a single device whose UUID and index are `all`, so that it is compatible with the generated CDI spec.
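Once the plugin reports the single `all` device, the envvar strategy reduces to joining the allocated device IDs, which yields the `NVIDIA_VISIBLE_DEVICES=all` value that nvidia-container-runtime on WSL expects. A minimal sketch, not the plugin's actual allocation code:

```go
package main

import (
	"fmt"
	"strings"
)

// visibleDevicesEnv joins the allocated device IDs into the environment
// variable consumed by nvidia-container-runtime. With the single "all"
// device on WSL, this always produces NVIDIA_VISIBLE_DEVICES=all.
func visibleDevicesEnv(ids []string) string {
	return "NVIDIA_VISIBLE_DEVICES=" + strings.Join(ids, ",")
}

func main() {
	// On WSL the only allocatable device ID is "all".
	fmt.Println(visibleDevicesEnv([]string{"all"}))
}
```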