Description
nvidia-container-runtime doesn't work in rootless mode, because cgroups are not supported in rootless mode (yet).
$ docker -H unix:///run/user/1001/docker.sock run -it --rm --runtime=nvidia nvidia/cuda
...
docker: Error response from daemon: OCI runtime create failed: container_linux.go:344: starting container process caused "process_linux.go:424: container init caused \"process_linux.go:407: running prestart hook 1 caused \\\"error running hook: exit status 1, stdout: , stderr: exec command: [/usr/bin/nvidia-container-cli --load-kmods configure --ldconfig=@/sbin/ldconfig --device=all --compute --utility --require=cuda>=10.0 brand=tesla,driver>=384,driver<385 --pid=548 /home/suda/.local/share/docker/vfs/dir/122c480857379482f5317caaca55bbf5e43f84991accfa9f3ea586ed63f3fabd]\\\\nnvidia-container-cli: mount error: open failed: /sys/fs/cgroup/devices/user.slice/devices.allow: permission denied\\\\n\\\"\"": unknown.
ERRO[0142] error waiting for container: context canceled
I'm wondering whether we can use bind mounts instead of the device cgroup, but I haven't looked into it any deeper yet.
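For anyone trying to confirm they are hitting the same failure, here is a minimal diagnostic sketch. It only checks whether the current (unprivileged) user can write to the cgroup control file named in the error above. The helper name check_cgroup_writable is hypothetical and is not part of Docker or nvidia-container-cli, and the path is simply the one from my error message, so it may differ on other hosts.

```shell
# Hypothetical helper (not part of Docker or nvidia-container-cli):
# report whether the current user can write to a given cgroup control file.
check_cgroup_writable() {
    file="$1"
    if [ -w "$file" ]; then
        echo "writable: $file"
    else
        echo "not writable: $file"
    fi
}

# Path copied from the error message above; adjust for your own host.
check_cgroup_writable /sys/fs/cgroup/devices/user.slice/devices.allow
```

If this reports "not writable" for the user running the rootless daemon, that would be consistent with the "permission denied" that the prestart hook hits on devices.allow.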
Steps to reproduce the issue:
See above
Describe the results you received:
Failed as above
Describe the results you expected:
Should work
Additional information you deem important (e.g. issue happens only occasionally):
Output of docker version:
Client: Docker Engine - Community
Version: 0.0.0-dev
API version: 1.40
Go version: go1.11.5
Git commit:
Built: Wed Feb 6 02:26:40 2019
OS/Arch: linux/amd64
Experimental: false
Server: Docker Engine - Community
Engine:
Version: 0.0.0-dev
API version: 1.40 (minimum version 1.12)
Go version: go1.11.5
Git commit: 273aef0a90
Built: Wed Feb 6 02:26:57 2019
OS/Arch: linux/amd64
Experimental: true
containerd:
Version: v1.2.2
GitCommit: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc:
Version: 1.0.0-rc6+dev
GitCommit: 96ec2177ae841256168fcf76954f7177af9446eb
docker-init:
Version: 0.18.0
GitCommit: fec3683
$ dpkg-query -s nvidia-container-runtime
Package: nvidia-container-runtime
Status: install ok installed
Priority: optional
Section: utils
Installed-Size: 7461
Maintainer: NVIDIA CORPORATION <[email protected]>
Architecture: amd64
Version: 2.0.0+docker18.09.2-1
Depends: libc6 (>= 2.14), libseccomp2 (>= 2.3.0), nvidia-container-runtime-hook (<< 2.0.0)
Description: NVIDIA container runtime
Provides a modified version of runc allowing users to run GPU enabled
containers.
Homepage: https://github.com/NVIDIA/nvidia-container-runtime/wiki
Output of docker info:
Client:
Debug Mode: false
Server:
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 1
Server Version: 0.0.0-dev
Storage Driver: vfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
Swarm: inactive
Runtimes: nvidia runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 9754871865f7fe2f4e74d43e2fc7ccd237edcbce
runc version: 96ec2177ae841256168fcf76954f7177af9446eb
init version: fec3683
Security Options:
seccomp
Profile: default
rootless
Kernel Version: 4.9.0-8-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 7.309GiB
Name: suda-gpu
ID: 7O5L:MUNF:22VV:X3BE:CBR2:3NGG:LEES:LS75:AVCF:WQLH:CPA2:YAK2
Docker Root Dir: /home/suda/.local/share/docker
Debug Mode: false
Registry: https://index.docker.io/v1/
Labels:
Experimental: true
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Product License: Community Engine
WARNING: No swap limit support
WARNING: bridge-nf-call-iptables is disabled
WARNING: bridge-nf-call-ip6tables is disabled
Additional environment details (AWS, VirtualBox, physical, etc.):