Description
I am working on runc, and we're experiencing occasional failures during the lazy migration test (opencontainers/runc#2924, opencontainers/runc#2760).
The nature of the lazy migration test requires the original container to stay until we finish restoring
the new one, meaning we have two cgroups (and systemd units, in case systemd is used).
I suspect the failures are caused by a race between runc+criu restoring a container into
the same cgroup the original container was in, and systemd deciding to remove that cgroup.
I tried two ways of restoring into a different cgroup:
- Using the swrk analog of --cgroup-root, described at https://criu.org/CGroups#Restoring_into_different_CGroups. Setting different values on restore apparently has no effect.
- Putting the criu swrk into a proper cgroup on restore, and using the swrk analog of --manage-cgroup-mode ignore, described at https://criu.org/CGroups#CGroups_restoring_strategy. As I understand it, this should skip cgroup handling entirely (i.e. do nothing related to cgroups).
I forgot the details, but from what I remember neither method works (at least on cgroup v2).
I will do more tests and add further details tomorrow.
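
For the follow-up tests, one way to check which cgroup a restored process actually ends up in is to read /proc/<pid>/cgroup for both the original and the restored init process and compare the paths. The sketch below is only a diagnostic helper, not part of runc or criu; the function names are hypothetical, and it assumes the standard "hierarchy-ID:controller-list:path" line format of that file.

```python
from pathlib import Path


def parse_cgroup_file(text):
    """Parse the content of /proc/<pid>/cgroup into {controllers: path}.

    Hypothetical helper for illustration only.
    """
    result = {}
    for line in text.splitlines():
        if not line:
            continue
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        _hier_id, controllers, path = line.split(":", 2)
        # On cgroup v2 the controller list is empty; store it as "unified".
        result[controllers or "unified"] = path
    return result


def cgroups_of(pid):
    """Return the cgroup membership of a live process (Linux only)."""
    return parse_cgroup_file(Path(f"/proc/{pid}/cgroup").read_text())


def same_cgroup(pid_a, pid_b):
    """True if both processes report identical cgroup paths."""
    return cgroups_of(pid_a) == cgroups_of(pid_b)
```

Calling same_cgroup() with the init PIDs of the original and the restored container right after restore would show whether the restore really went into a different cgroup or fell back to the original one.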