
Proposal: Restore running container network settings for containerd integration to support hot upgrade #975

@coolljt0725

Docker engine PR moby/moby#20662 tries to integrate containerd for container supervision, which is great. It makes it possible to upgrade the daemon without shutting down all running containers: the docker daemon going down no longer affects running containers, and simply restarting the daemon restores all previously running containers. This also requires libnetwork to restore the container network settings (endpoints, sandboxes, networks, port mappings). Currently, daemon startup cleans up the network state (networks, endpoints, sandboxes), so the new daemon is not aware of the ports, IP addresses, and sandboxes of the old running containers, and those IPs and ports can still be allocated to new containers.
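
To make the allocation problem concrete, here is a minimal sketch of the idea, not libnetwork's actual code. The `Allocator` class, its `restore` method, and the shape of the `running` records are all hypothetical; the sketch only assumes that the daemon can read back the IP and published ports of containers that survived the restart.

```python
import ipaddress


class Allocator:
    """Toy stand-in for a daemon's IP and host-port bookkeeping (hypothetical)."""

    def __init__(self, subnet):
        self.subnet = ipaddress.ip_network(subnet)
        self.used_ips = set()
        self.used_ports = set()

    def restore(self, running):
        """Re-register addresses and ports held by containers that are still running."""
        for container in running:
            self.used_ips.add(ipaddress.ip_address(container["ip"]))
            self.used_ports.update(container["ports"])

    def allocate_ip(self):
        hosts = self.subnet.hosts()
        next(hosts)  # skip the first host address, assumed here to be the bridge gateway
        for ip in hosts:
            if ip not in self.used_ips:
                self.used_ips.add(ip)
                return str(ip)
        raise RuntimeError("subnet exhausted")

    def allocate_port(self, port):
        if port in self.used_ports:
            raise ValueError(f"port {port} is already allocated")
        self.used_ports.add(port)


# Hypothetical persisted state for a container that kept running across the restart.
running = [{"id": "fbc3c1025f63", "ip": "172.17.0.2", "ports": [80]}]

fresh = Allocator("172.17.0.0/16")     # restart that starts from a clean state
restored = Allocator("172.17.0.0/16")  # restart that restores state first
restored.restore(running)

fresh_ip = fresh.allocate_ip()         # collides with the running container
restored_ip = restored.allocate_ip()   # skips the address already in use
```

Without `restore`, the fresh allocator hands out the address the running container already holds and accepts a second binding on port 80; with it, both are treated as taken.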

I have made some progress on supporting this (see https://github.com/coolljt0725/libnetwork/tree/restore_network). Here is an example, using a docker binary built from the branch https://github.com/coolljt0725/docker/tree/containerd-integration-network, which is based on PR moby/moby#20662:

1. Run an nginx container with port 80 published:
$ docker run -d -ti -p 80:80 nginx
fbc3c1025f63c5429c7feae208b4794672d2c44ab5e0b638e0abfcc1d03c7451
[lei@centos-188 docker]$ docker inspect -f {{.NetworkSettings.Networks.bridge.IPAddress}} fbc3c1025f63c5429c7feae208b4794672d2c44ab5e0b638e0abfcc1d03c7451
172.17.0.2

I can access the nginx server from Chrome.
2. Kill the docker daemon and restart it:
$ sudo kill -9 $(cat /var/run/docker.pid)
3. After the restart, the container is still running:

$ docker ps
CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                         NAMES
fbc3c1025f63        nginx               "nginx -g 'daemon off"   5 minutes ago       Up 5 minutes        0.0.0.0:80->80/tcp, 443/tcp   jolly_albattani

I can still access the nginx server from Chrome.
Starting a container that tries to publish port 80 fails, because the daemon knows the port is already allocated to nginx.
Starting any other container does not reuse the nginx container's IP 172.17.0.2, because the daemon knows it is already allocated.

I don't know if this is the right approach to implement this, but I'm happy to open a PR to work on it.
