runc/crun, cgroups and CRIU

I am currently looking at a problem concerning CRIU and OCI containers. My understanding so far is the following:

I am creating a checkpoint without setting `manage_cgroups`. This means we should end up with `opts.manage_cgroups = CG_MODE_DEFAULT`, which is defined as `#define CG_MODE_DEFAULT (CG_MODE_SOFT)`.

When creating a checkpoint, CRIU still records the cgroup information of the process in the container.

My understanding is that this should not be necessary, as the runtime (crun at least) will move the process into the new cgroup it created once the restore is done. I think this is the only right approach: in the case of OCI containers, CRIU should not touch the cgroup settings. When the container is restored, it ends up in a cgroup newly created by the container runtime (crun/runc).

Even after changing the default to `#define CG_MODE_DEFAULT (CG_MODE_IGNORE)`, I still get a `cgroup.img`, and `core-1.img` still references cgroups via `"cg_set": 2,`.
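One way to check which images still carry cgroup references is to decode them with `crit`, which ships with CRIU. This is only a minimal sketch: the helper name `cg_set_ids` is made up for illustration, and it assumes the decoded JSON spells the field as `"cg_set": N`, as quoted above.

```shell
# Hypothetical helper: read decoded image JSON on stdin and print every
# cg_set reference it contains (e.g. "cg_set": 2), one per line.
# Assumes the field is written as "cg_set": <number>.
cg_set_ids() {
    grep -o '"cg_set": *[0-9][0-9]*'
}
```

Run from the directory holding the checkpoint images, this could be used as `crit show core-1.img | cg_set_ids`. `crit show cgroup.img` prints the decoded contents of the cgroup image itself.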

The restore fails with:
```
(00.003375)      1: cg: Move into 2
(00.003391)      1: cg: setting cgns prefix to /machine.slice/libpod-dd47c09e12569883f67d88a5da89cbd2e1c450b2f3803087ee72e3a062a05186.scope/container
(00.003415)      1: Error (criu/cgroup.c:1092): cg: Can't move 1 into unifie//machine.slice/libpod-dd47c09e12569883f67d88a5da89cbd2e1c450b2f3803087ee72e3a062a05186.scope/container/cgroup.procs (-1/-1): Bad file descriptor
(00.003427)      1: Error (criu/cgroup.c:1148): cg: couldn't set cgns prefix unifie//machine.slice/libpod-dd47c09e12569883f67d88a5da89cbd2e1c450b2f3803087ee72e3a062a05186.scope/container/cgroup.procs: Bad file descriptor
(00.003431)      1: Error (criu/cgroup.c:1171): cg: failed preparing cgns
```

So there is still a bug somewhere in the code, because the path `unifie//machine.slice` does not look correct.

Using CRIU's `manage_cgroups` mode results in `CG_MODE_SOFT` and the restore works, but it does strange things. First of all, I see this in the logs:
```
(00.001357) cg: Preparing cgroups yard (cgroups restore mode 0x4)
(00.001593) cg: Opening .criu.cgyard.cifCa8 as cg yard
(00.001613) cg:         Making controller dir .criu.cgyard.cifCa8/unifie ()
(00.001707) cg: Determined cgroup dir unifie/machine.slice/libpod-30325b748276c463e9f5e8db0f98662915f7372f7585287dcae81c8cd4d75636.scope/container already exist
(00.001713) cg: Skip restoring properties on cgroup dir unifie/machine.slice/libpod-30325b748276c463e9f5e8db0f98662915f7372f7585287dcae81c8cd4d75636.scope/container
```
Again, the paths look wrong, and CRIU is still referencing the old cgroup path although the restored container has a different ID and the container runtime created a cgroup for that new ID.
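The `0x4` in the first log line can be mapped back to a mode name. The sketch below does that with a hypothetical helper `cg_mode_name`. The numeric values are an assumption based on my reading of `criu/include/cr_options.h` and should be checked against the tree in use; only `0x4` for `CG_MODE_SOFT` is corroborated by the log above.

```shell
# Hypothetical helper: translate a "cgroups restore mode" value from the
# CRIU log into its CG_MODE_* name.
# The values are assumed from criu/include/cr_options.h, not verified here.
cg_mode_name() {
    case $(( $1 )) in
        0)  echo CG_MODE_IGNORE ;;
        1)  echo CG_MODE_NONE ;;
        2)  echo CG_MODE_PROPS ;;
        4)  echo CG_MODE_SOFT ;;
        8)  echo CG_MODE_FULL ;;
        16) echo CG_MODE_STRICT ;;
        *)  echo unknown ;;
    esac
}
```

For the log above, `cg_mode_name 0x4` yields `CG_MODE_SOFT`, which matches the mode I expected.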

To reproduce:
```
podman run -d quay.io/adrianreber/counter
podman container checkpoint --latest --export /tmp/dump.tar -R -k
podman container restore -i /tmp/dump.tar -n new -k
```

The restore log of the container `new` will show the messages from above. The log can be found with `podman inspect -l --format "{{.State.RestoreLog}}"`.
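To narrow the restore log down to the cgroup-related lines, something like the following may help. It is a minimal sketch: the helper name `cg_log_lines` is hypothetical, and it assumes CRIU tags cgroup messages with `cg: ` as in the excerpts above.

```shell
# Hypothetical helper: read a CRIU log on stdin and print only the
# cgroup-related lines.
# Assumes cgroup messages carry the literal "cg: " tag.
cg_log_lines() {
    grep -F 'cg: '
}
```

Assuming the inspect command prints the path of the log file, this could be used as `cg_log_lines < "$(podman inspect -l --format '{{.State.RestoreLog}}')"`.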

So this is both a bug report that CRIU's cgroup handling is not correct and a question: should CRIU just completely ignore the cgroup settings when used in combination with crun/runc? crun/runc will create a new cgroup for the new container and move the processes into it. Currently it does not seem possible to tell CRIU to completely ignore cgroups, even with `CG_MODE_IGNORE`.

@mihalicyn @avagin any ideas, suggestions or comments?
