4 changes: 4 additions & 0 deletions README.md
@@ -307,6 +307,9 @@ Metadata flags:
Shared memory flags:
- :whale: `--shm-size`: Size of `/dev/shm`

GPU flags:
- :whale: `--gpus`: GPU devices to add to the container (`all` to pass all GPUs). See [`./docs/gpu.md`](./docs/gpu.md) for details.

Other `docker run` flags are planned but not yet implemented.
<details>
<summary>Click here to show all the `docker run` flags (Docker 20.10)</summary>
@@ -899,6 +902,7 @@ Others:
# Additional documents
- [`./docs/compose.md`](./docs/compose.md): Compose
- [`./docs/dir.md`](./docs/dir.md): Directory layout (`/var/lib/nerdctl`)
- [`./docs/gpu.md`](./docs/gpu.md): Using GPUs inside containers
- [`./docs/registry.md`](./docs/registry.md): Registry authentication (`~/.docker/config.json`)
- [`./docs/rootless.md`](./docs/rootless.md): Rootless mode
- [`./docs/stargz.md`](./docs/stargz.md): Lazy-pulling using Stargz Snapshotter
61 changes: 61 additions & 0 deletions docs/gpu.md
@@ -0,0 +1,61 @@
# Using GPUs inside containers

nerdctl provides docker-compatible NVIDIA GPU support.

## Prerequisites

- NVIDIA Drivers
- Same requirement as when you use GPUs on Docker. For details, please refer to [the doc by NVIDIA](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#pre-requisites).
- `nvidia-container-cli`
- containerd relies on this CLI for setting up GPUs inside containers. You can install it via the [`libnvidia-container` package](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html#libnvidia-container).
Member:
Did you try rootless (on cgroup v1)?

I guess it needs setting `no-cgroups = true`:
moby/moby#38729 (comment)

Member Author:
It doesn't work as of now.

We might need to patch `github.com/containerd/containerd/contrib/nvidia` to allow passing the `--no-cgroups` option to `nvidia-container-cli`.

containerd doesn't use `nvidia-container-runtime` (instead, it executes `nvidia-container-cli` directly), so we cannot use `/etc/nvidia-container-runtime/config.toml` for nerdctl.

Member Author:

A very hacky workaround for this is to wrap `nvidia-container-cli` in a script that forcefully specifies `--no-cgroups`:

```bash
mkdir -p /opt/nvidia/bin
mv /usr/bin/nvidia-container-cli /opt/nvidia/bin/
cat <<'EOF' > /usr/bin/nvidia-container-cli
#!/bin/bash
# Insert --no-cgroups before the final positional argument
exec /opt/nvidia/bin/nvidia-container-cli "${@:1:($#-1)}" --no-cgroups "${@:$#}"
EOF
chmod +x /usr/bin/nvidia-container-cli
```

Member:

Opened containerd/containerd#5603 for discussion

Member Author (@ktock, Jun 15, 2021):

@AkihiroSuda containerd/containerd#5604 is merged.
Updated this PR to use `--no-cgroups`, and it now works in rootless environments as well (without any additional configuration in `/etc/nvidia-container-runtime/config.toml`, etc.).

A `replace` directive is needed in `go.mod` to forcefully point to the latest commit of containerd.


## Options for `nerdctl run --gpus`

`nerdctl run --gpus` is compatible with [`docker run --gpus`](https://docs.docker.com/engine/reference/commandline/run/#access-an-nvidia-gpu).

You can specify the number of GPUs to use via the `--gpus` option.
The following example exposes all available GPUs.

```
nerdctl run -it --rm --gpus all nvidia/cuda:9.0-base nvidia-smi
```

You can also pass detailed configuration to the `--gpus` option as a list of comma-separated key-value pairs. The following options are available:

- `count`: number of GPUs to use. `all` exposes all available GPUs.
- `device`: IDs of GPUs to use. Either GPU UUIDs or indexes can be specified.
- `capabilities`: [Driver capabilities](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#driver-capabilities). If unset, `utility` is used.
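
These key-value pairs are CSV-encoded, which is why a value that itself contains commas (such as a multi-device list) must be wrapped in double quotes. The following is a simplified, hypothetical parser sketch (not nerdctl's actual implementation) illustrating the format:

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// parseGPUSpec splits a --gpus value such as
// `capabilities=utility,device=GPU-3a23c669` into key/value pairs.
// CSV parsing transparently handles quoted fields like "device=0,1".
func parseGPUSpec(spec string) (map[string]string, error) {
	fields, err := csv.NewReader(strings.NewReader(spec)).Read()
	if err != nil {
		return nil, err
	}
	kv := make(map[string]string)
	for _, f := range fields {
		parts := strings.SplitN(f, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("invalid field %q", f)
		}
		kv[parts[0]] = parts[1]
	}
	return kv, nil
}

func main() {
	kv, err := parseGPUSpec(`"device=0,1",capabilities=utility`)
	if err != nil {
		panic(err)
	}
	fmt.Println(kv["device"], kv["capabilities"]) // 0,1 utility
}
```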

The following example exposes a specific GPU to the container.

```
nerdctl run -it --rm --gpus capabilities=utility,device=GPU-3a23c669-1f69-c64e-cf85-44e9b07e7a2a nvidia/cuda:9.0-base nvidia-smi
```

## Fields for `nerdctl compose`

`nerdctl compose` also supports GPUs, following the [compose-spec](https://github.com/compose-spec/compose-spec/blob/master/deploy.md#devices).

You can use GPUs with Compose by specifying one or more of the following `capabilities` in `services.demo.deploy.resources.reservations.devices`:

- `gpu`
- `nvidia`
- all allowed capabilities for `nerdctl run --gpus`

Available fields are the same as for `nerdctl run --gpus`.

The following exposes all available GPUs to the container.

```
version: "3.8"
services:
  demo:
    image: nvidia/cuda:9.0-base
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
          - capabilities: ["utility"]
            count: all
```
7 changes: 5 additions & 2 deletions go.mod
@@ -18,6 +18,7 @@ require (
github.com/containernetworking/plugins v0.9.1
github.com/cyphar/filepath-securejoin v0.2.2
github.com/docker/cli v20.10.7+incompatible
github.com/docker/distribution v2.7.1+incompatible // indirect
github.com/docker/docker v20.10.7+incompatible
github.com/docker/go-connections v0.4.0
github.com/docker/go-units v0.4.0
@@ -27,13 +28,15 @@ require (
github.com/morikuni/aec v1.0.0 // indirect
github.com/opencontainers/go-digest v1.0.0
github.com/opencontainers/image-spec v1.0.1
github.com/opencontainers/runtime-spec v1.0.3-0.20210303205135-43e4633e40c1
github.com/opencontainers/runtime-spec v1.0.3-0.20210326190908-1c3f411f0417
github.com/pkg/errors v0.9.1
github.com/rootless-containers/rootlesskit v0.14.2
github.com/sirupsen/logrus v1.8.1
github.com/urfave/cli/v2 v2.3.0
golang.org/x/sync v0.0.0-20210220032951-036812b2e83c
golang.org/x/sys v0.0.0-20210420072515-93ed5bcd2bfe
golang.org/x/sys v0.0.0-20210426230700-d19ff857e887
golang.org/x/term v0.0.0-20210406210042-72f3dc4e9b72
gotest.tools/v3 v3.0.3
)

replace github.com/containerd/containerd => github.com/containerd/containerd v1.5.1-0.20210614183500-0a3a77bc4453
Member:

Why replace?

Member Author:

Without the `replace` directive, `go mod tidy` wants to point to v1.5.2.

Member:

😞

207 changes: 18 additions & 189 deletions go.sum


84 changes: 84 additions & 0 deletions pkg/composer/serviceparser/serviceparser.go
@@ -17,12 +17,15 @@
package serviceparser

import (
"bytes"
"encoding/csv"
"fmt"
"path/filepath"
"strings"

"github.com/compose-spec/compose-go/types"
compose "github.com/compose-spec/compose-go/types"
"github.com/containerd/containerd/contrib/nvidia"
"github.com/containerd/containerd/identifiers"
"github.com/containerd/nerdctl/pkg/reflectutil"
"github.com/pkg/errors"
@@ -104,6 +107,7 @@ func warnUnknownFields(svc compose.ServiceConfig) {
}
if unknown := reflectutil.UnknownNonEmptyFields(svc.Deploy.Resources,
"Limits",
"Reservations",
); len(unknown) > 0 {
logrus.Warnf("Ignoring: service %s: deploy.resources: %+v", svc.Name, unknown)
}
@@ -115,6 +119,24 @@ func warnUnknownFields(svc compose.ServiceConfig) {
logrus.Warnf("Ignoring: service %s: deploy.resources.resources: %+v", svc.Name, unknown)
}
}
if svc.Deploy.Resources.Reservations != nil {
if unknown := reflectutil.UnknownNonEmptyFields(svc.Deploy.Resources.Reservations,
"Devices",
); len(unknown) > 0 {
logrus.Warnf("Ignoring: service %s: deploy.resources.resources.reservations: %+v", svc.Name, unknown)
}
for i, dev := range svc.Deploy.Resources.Reservations.Devices {
if unknown := reflectutil.UnknownNonEmptyFields(dev,
"Capabilities",
"Driver",
"Count",
"IDs",
); len(unknown) > 0 {
logrus.Warnf("Ignoring: service %s: deploy.resources.resources.reservations.devices[%d]: %+v",
svc.Name, i, unknown)
}
}
}
}

// unknown fields of Build is checked in parseBuild().
@@ -193,6 +215,60 @@ func getMemLimit(svc compose.ServiceConfig) (types.UnitBytes, error) {
return limit, nil
}

func getGPUs(svc compose.ServiceConfig) (reqs []string, _ error) {
// "gpu" and "nvidia" are also allowed capabilities (but not used as nvidia driver capabilities)
// https://github.com/moby/moby/blob/v20.10.7/daemon/nvidia_linux.go#L37
capset := map[string]struct{}{"gpu": {}, "nvidia": {}}
for _, c := range nvidia.AllCaps() {
capset[string(c)] = struct{}{}
}
if svc.Deploy != nil && svc.Deploy.Resources.Reservations != nil {
for _, dev := range svc.Deploy.Resources.Reservations.Devices {
if len(dev.Capabilities) == 0 {
// "capabilities" is required.
// https://github.com/compose-spec/compose-spec/blob/74b933db994109616580eab8f47bf2ba226e0faa/deploy.md#devices
return nil, fmt.Errorf("service %s: specifying \"capabilities\" is required for resource reservations", svc.Name)
}

var requiresGPU bool
for _, c := range dev.Capabilities {
if _, ok := capset[c]; ok {
requiresGPU = true
}
}
if !requiresGPU {
continue
}

var e []string
if len(dev.Capabilities) > 0 {
e = append(e, fmt.Sprintf("capabilities=%s", strings.Join(dev.Capabilities, ",")))
}
if dev.Driver != "" {
e = append(e, fmt.Sprintf("driver=%s", dev.Driver))
}
if len(dev.IDs) > 0 {
e = append(e, fmt.Sprintf("device=%s", strings.Join(dev.IDs, ",")))
}
if dev.Count != 0 {
e = append(e, fmt.Sprintf("count=%d", dev.Count))
}

Member (@fahedouch, Jun 14, 2021):

`count` and `device_ids` are mutually exclusive, so only one of the two fields should be defined at a time. Is this enforced somewhere?

Member Author:

buf := new(bytes.Buffer)
w := csv.NewWriter(buf)
if err := w.Write(e); err != nil {
return nil, err
}
w.Flush()
o := buf.Bytes()
if len(o) > 0 {
reqs = append(reqs, string(o[:len(o)-1])) // remove the trailing newline added by csv.Writer
}
}
}
return reqs, nil
}
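
The `csv.Writer` in `getGPUs` is what produces the quoting visible in the generated `--gpus` flags: any field that itself contains a comma (such as a multi-valued `capabilities` list) is wrapped in double quotes, and the writer appends a trailing newline that has to be stripped. A self-contained sketch of that behavior:

```go
package main

import (
	"bytes"
	"encoding/csv"
	"fmt"
)

// joinCSV writes one CSV record and returns it without the trailing
// newline, mirroring how getGPUs assembles a single --gpus argument.
// Callers are expected to pass at least one field.
func joinCSV(fields []string) (string, error) {
	buf := new(bytes.Buffer)
	w := csv.NewWriter(buf)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush()
	out := buf.String()
	return out[:len(out)-1], nil // strip the '\n' appended by csv.Writer
}

func main() {
	s, _ := joinCSV([]string{"capabilities=gpu,utility,compute", "driver=nvidia", "count=2"})
	fmt.Println(s) // "capabilities=gpu,utility,compute",driver=nvidia,count=2
}
```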

// getRestart returns `nerdctl run --restart` flag string ("no" or "always")
//
// restart: {"no" (default), "always", "on-failure", "unless-stopped"} (https://github.com/compose-spec/compose-spec/blob/167f207d0a8967df87c5ed757dbb1a2bb6025a1e/spec.md#restart)
@@ -400,6 +476,14 @@ func newContainer(project *compose.Project, parsed *Service, i int) (*Container,
c.RunArgs = append(c.RunArgs, fmt.Sprintf("-m=%d", memLimit))
}

if gpuReqs, err := getGPUs(svc); err != nil {
return nil, err
} else if len(gpuReqs) > 0 {
for _, gpus := range gpuReqs {
c.RunArgs = append(c.RunArgs, fmt.Sprintf("--gpus=%s", gpus))
}
}

for k, v := range svc.Labels {
if v == "" {
c.RunArgs = append(c.RunArgs, fmt.Sprintf("-l=%s", k))
16 changes: 16 additions & 0 deletions pkg/composer/serviceparser/serviceparser_test.go
@@ -198,11 +198,24 @@ services:
    image: nginx:alpine
    deploy:
      restart_policy: {}
      resources:
        reservations:
          devices:
          - capabilities: ["gpu", "utility", "compute"]
            driver: nvidia
            count: 2
          - capabilities: ["nvidia"]
            device_ids: ["dummy", "dummy2"]
  baz: # restart=no
    image: nginx:alpine
    deploy:
      restart_policy:
        condition: none
      resources:
        reservations:
          devices:
          - capabilities: ["utility"]
            count: all
`
comp := testutil.NewComposeDir(t, dockerComposeYAML)
defer comp.CleanUp()
@@ -237,6 +250,8 @@
assert.Assert(t, len(bar.Containers) == 1)
for _, c := range bar.Containers {
assert.Assert(t, in(c.RunArgs, "--restart=always"))
assert.Assert(t, in(c.RunArgs, `--gpus="capabilities=gpu,utility,compute",driver=nvidia,count=2`))
assert.Assert(t, in(c.RunArgs, `--gpus=capabilities=nvidia,"device=dummy,dummy2"`))
}

bazSvc, err := project.GetService("baz")
@@ -249,6 +264,7 @@
assert.Assert(t, len(baz.Containers) == 1)
for _, c := range baz.Containers {
assert.Assert(t, in(c.RunArgs, "--restart=no"))
assert.Assert(t, in(c.RunArgs, `--gpus=capabilities=utility,count=-1`))
}
}

10 changes: 10 additions & 0 deletions run.go
@@ -184,6 +184,10 @@ var runCommand = &cli.Command{
Name: "sysctl",
Usage: "Sysctl options",
},
&cli.StringSliceFlag{
Name: "gpus",
Usage: "GPU devices to add to the container ('all' to pass all GPUs)",
},
// volume flags
&cli.StringSliceFlag{
Name: "volume",
@@ -499,6 +503,12 @@ func runAction(clicontext *cli.Context) error {

opts = append(opts, WithSysctls(strutil.ConvertKVStringsToMap(clicontext.StringSlice("sysctl"))))

gpuOpt, err := parseGPUOpts(clicontext.StringSlice("gpus"))
if err != nil {
return err
}
opts = append(opts, gpuOpt...)

var s specs.Spec
spec := containerd.WithSpec(&s, opts...)
cOpts = append(cOpts, spec)