Description
We have clusters whose ingress network was created manually with the subnet 10.0.0.1/16. Although this is not a valid network address (the host bits are set; the canonical form would be 10.0.0.0/16), it never caused any issues for us.
Starting with #46183, subnet validation on creation appears to have been tightened. As a side effect, this inadvertently breaks our clusters on upgrade.
I know the proper fix would be to recreate the ingress network with 10.0.0.0/16, but tearing down and recreating the ingress may not be the best course of action on production clusters.
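To illustrate why 10.0.0.1/16 is not a valid subnet block, here is a minimal sketch using Python's stdlib ipaddress module (this only demonstrates the CIDR semantics; it is not the engine's actual validation code):

```python
import ipaddress

# "10.0.0.1/16" has host bits set, so strict parsing rejects it,
# which is analogous to what the stricter validation now does.
try:
    ipaddress.ip_network("10.0.0.1/16")  # strict=True by default
except ValueError as exc:
    print("rejected:", exc)

# Non-strict parsing masks the address down to the canonical network,
# which is roughly how the value behaved on older engines:
print(ipaddress.ip_network("10.0.0.1/16", strict=False))  # 10.0.0.0/16
```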
Reproduce
- Create a cluster on Docker 24.x
- Remove the default ingress network:
  docker network rm ingress
- Recreate the ingress network with the broken subnet:
  docker network create --driver overlay --ingress ingress --subnet=10.0.0.1/16
- Create a service that uses the network
- Upgrade the cluster to 25.x
- See that the service is no longer brought up
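As a pre-upgrade sanity check, one could scan each network's configured subnet for set host bits before upgrading. A sketch using stdlib ipaddress; the network names and subnets below are hypothetical placeholders for values you would gather from `docker network inspect`:

```python
import ipaddress

def has_host_bits(cidr: str) -> bool:
    """True if cidr is not a canonical network address (host bits set)."""
    try:
        ipaddress.ip_network(cidr)  # strict parsing, similar in spirit to the new check
        return False
    except ValueError:
        return True

# Hypothetical example values, as reported by `docker network inspect <name>`
subnets = {"ingress": "10.0.0.1/16", "backend": "172.18.5.0/24"}
for name, cidr in subnets.items():
    if has_host_bits(cidr):
        print(f"{name}: {cidr} would be rejected by stricter validation")
```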
Expected behavior
Existing (overlay) networks should not stop working when upgrading Docker
docker version
Client: Docker Engine - Community
Version: 25.0.2
API version: 1.44
Go version: go1.21.6
Git commit: 29cf629
Built: Thu Feb 1 00:23:19 2024
OS/Arch: linux/amd64
Context: default
Server: Docker Engine - Community
Engine:
Version: 25.0.2
API version: 1.44 (minimum version 1.24)
Go version: go1.21.6
Git commit: fce6e0c
Built: Thu Feb 1 00:23:19 2024
OS/Arch: linux/amd64
Experimental: false
containerd:
Version: 1.6.28
GitCommit: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
runc:
Version: 1.1.12
GitCommit: v1.1.12-0-g51d5e94
docker-init:
Version: 0.19.0
GitCommit: de40ad0
docker info
Client: Docker Engine - Community
Version: 25.0.2
Context: default
Debug Mode: false
Plugins:
buildx: Docker Buildx (Docker Inc.)
Version: v0.12.1
Path: /usr/libexec/docker/cli-plugins/docker-buildx
compose: Docker Compose (Docker Inc.)
Version: v2.24.5
Path: /usr/libexec/docker/cli-plugins/docker-compose
scan: Docker Scan (Docker Inc.)
Version: v0.17.0
Path: /usr/libexec/docker/cli-plugins/docker-scan
Server:
Containers: 57
Running: 20
Paused: 0
Stopped: 37
Images: 29
Server Version: 25.0.2
Storage Driver: overlay2
Backing Filesystem: extfs
Supports d_type: true
Using metacopy: false
Native Overlay Diff: true
userxattr: false
Logging Driver: syslog
Cgroup Driver: cgroupfs
Cgroup Version: 1
Plugins:
Volume: local
Network: bridge host ipvlan macvlan null overlay
Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
Swarm: active
NodeID: c20vpmwul4vai14swgkpbj8jd
Is Manager: true
ClusterID: ljgyd4c405vfzvwq339oif3gt
Managers: 3
Nodes: 3
Default Address Pool: 10.0.0.0/8
SubnetSize: 24
Data Path Port: 4789
Orchestration:
Task History Retention Limit: 5
Raft:
Snapshot Interval: 10000
Number of Old Snapshots to Retain: 0
Heartbeat Tick: 1
Election Tick: 10
Dispatcher:
Heartbeat Period: 5 seconds
CA Configuration:
Expiry Duration: 3 months
Force Rotate: 0
Autolock Managers: false
Root Rotation In Progress: false
Node Address: 192.168.0.4
Manager Addresses:
192.168.0.4:2377
192.168.0.5:2377
192.168.0.6:2377
Runtimes: io.containerd.runc.v2 runc
Default Runtime: runc
Init Binary: docker-init
containerd version: ae07eda36dd25f8a1b98dfbf587313b99c0190bb
runc version: v1.1.12-0-g51d5e94
init version: de40ad0
Security Options:
apparmor
seccomp
Profile: builtin
Kernel Version: 5.4.0-170-generic
Operating System: Ubuntu 20.04.6 LTS
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 7.574GiB
Name: precisionx-swarm-prod-01
ID: PULU:MERA:NH2U:EGMA:ZC5P:AMYT:Y7UB:M6RI:7VJ6:Q5BE:EAMZ:UFLP
Docker Root Dir: /var/lib/docker
Debug Mode: false
Experimental: false
Insecure Registries:
127.0.0.0/8
Live Restore Enabled: false
Default Address Pools:
Base: 172.18.0.0/16, Size: 24
WARNING: No swap limit support
Additional Info
No response