Allow user to specify default address pools for docker networks#29376
Allow user to specify default address pools for docker networks#29376aboch wants to merge 2 commits intomoby:masterfrom aboch:ls
Conversation
|
This should probably also be configurable in the |
Yes. It is taken care automatically. I verified it works via |
|
@aboch |
Great! Looks like that needs some documentation as well in dockerd.md#daemon-configuration-file @aboch would it make sense to have this option for overlay networks as well? For people running on older kernel versions, it's important that the overlay IP-addresses don't overlap with IP-addresses on the underlay network; having an option to automatically pick a range that doesn't overlap could benefit those users as well? |
|
@zarbis I in fact added to the manual that the two options go hand-in-hand: I would like to keep this explicit, I would not add a default for the pool size if user did not pass one. I prefer start to fail and user to correct the config.
Thanks, I will take care of it
I thought of it, but I felt addressing the defaults for local networks was the priority given the number of requests and given how easy this problem can be hit. In the multihost scenario where host kernel version is < 3.13, user would anyway have to first make sure the chosen subnet for the overlay network does not overlap with any route on any of the cluster hosts, like to say reaching to the BTW, this change does not prevent us for addressing the defaults for global networks if we see the need in future, it would anyway be an addition of a couple of options to the UX. |
|
@zarbis In regards to your questions, I now see a point in not having two separate options ( @thaJeztah was suggesting me offline that we support the "csv" notation, so that we could represent this as something like: this is also future proof if we ever decide to support this for global scope network as well. I will rework the PR. |
|
Thanks @thaJeztah. I Reworked the PR to allow user to set both local and global scope default address pools, via csv options. PTAL when you get a chance. |
|
Will this most likely be in 1.14? |
|
What is the relationship between this feature and the current |
|
@estesp If you specify a The default pools are used when user does not specify a subnet for the network and only for IPv4 subnets for now. It does not deprecate any existing IPAM related options. It adds control over the default pools which today are hard coded. Read |
There was a problem hiding this comment.
One can configure --default-address-pools and decide to change it on a daemon reload. We can allow that change only if all the networks created before have been deleted. So the libnetwork controller has to be instantiated. But currently ipamutils is an independent package.
How about letting libnetwork controller get the --default-address-pools config, do the required validity checks and then call ipamutils.InitAddressPools with the new config or fail the controller creation if the validity checks fail ?
There was a problem hiding this comment.
I do not think we will allow the pools to be modified on reload.
Currently only two networking configs are allowed to be changed on cofig reload --cluster-store and --cluster-advertise.
Even if in future we decide to support the default pool re-config, I don't think we should remove the existing networks using the old pools. User is configuring a default, which takes place from this moment on, it is understandable existing networks will still be using the pool assigned to them at time of creation.
The reason why we have the default pool controlled in the ipamutils pkg itself, is because they need to be available to both libnetwork and swarm.
We could let libnetwork controller handle the the default pool config, then program the ipamutils pkg, but then how can we guarantee docker or swarm code does wait for the network controller to be up before invoking ipamutils methods ?
There was a problem hiding this comment.
I do not think we will allow the pools to be modified on reload.
I was referring to the case where the daemon is stopped and restarted with a new range. For modification through the config file, agree that we don't have to support it.
From an operator pov they want to configure the pool to avoid conflicts with the underlay networks or the host subnets.. so after configuring a range, the only reason one would modify the range is because some subnets in the range can no longer be used by docker.. so the user should have deleted the old networks before making this config change.. if he doesn’t its a user’s mistake..But unless its absolutely impossible for us to detect that misconfig and error out we should do it.. allowing that erroneous config can lead to hard to debug issues.
For the swarm mode we have to think about what is the best way to support this config. Since the swarm init is done independent of the daemon configs its more natural to associate the global range config with swarm init rather than the engine config. But it should be possible to do the validity checks in that case also because swarm mode will deal only with global range and the engine will only deal with local range.
There was a problem hiding this comment.
Given docker networks are persisted across reboot, I do not think the new default pools must be applied to existing networks, and I do not think there would be such an expectation.
The defaults take place when user does not specify the option at time of the resource creation. Once the resource is created, the allocated properties should stay for the life of the resource and not be redefined based on new defaults.
In the scenario you depicted, I would expect user to remove any network which may conflict with an underlying network anyway. Then he can reload with the new defaults and start creating networks again. On a related note, if user's environment is tightly dependent on underlaying network, I would expect him to explicitly choose the subnet when he creates the networks.
There was a problem hiding this comment.
@aboch I actually have an expectation of networks change after daemon reload.
Real life situation why I'm subbed to this PR: I've ran out of networks spinning project with Docker Compose through CI pipeline.
I expect to change daemon options, reload daemon and have my networks compacted and bunch of address space freed. It will be a life saver in OH SHI~ moments.
I understand that it introduces a lot of complication, but just wanted to point out that some people will have those expectations.
There was a problem hiding this comment.
We're constantly running out of networks, during local development and CI.
We're calling docker network ls -q | xargs docker rm -fv every 2 hours in our runners.
|
@aboch any updates here? |
|
ping @aboch still working on this? |
|
Fix here: tedwardd@e144a29 |
|
Would be so great if this could get merged. CC: @handcode |
|
+1000 Anything blocking this from being merged? Anything we can do to help? |
Signed-off-by: Alessandro Boch <[email protected]>
|
I see the branch was deleted, so in case it disappears from github history; From 8fa9a697642a2ec8f7588ac7114754fcd79eb9a8 Mon Sep 17 00:00:00 2001
From: Alessandro Boch <[email protected]>
Date: Tue, 13 Dec 2016 15:04:35 -0800
Subject: [PATCH 1/2] Placeholder for libnetwork changes
Signed-off-by: Alessandro Boch <[email protected]>
---
.../docker/libnetwork/ipamutils/utils.go | 104 ++++++++++++++++++++-
1 file changed, 101 insertions(+), 3 deletions(-)
diff --git a/vendor/github.com/docker/libnetwork/ipamutils/utils.go b/vendor/github.com/docker/libnetwork/ipamutils/utils.go
index bca4e9e519a..16bd79fb0ac 100644
--- a/vendor/github.com/docker/libnetwork/ipamutils/utils.go
+++ b/vendor/github.com/docker/libnetwork/ipamutils/utils.go
@@ -2,6 +2,7 @@
package ipamutils
import (
+ "fmt"
"net"
"sync"
)
@@ -13,11 +14,25 @@ var (
// PredefinedGranularNetworks contains a list of 64K IPv4 private networks with host size 8
// (10.x.x.x/24) which do not overlap with the networks in `PredefinedBroadNetworks`
PredefinedGranularNetworks []*net.IPNet
-
- initNetworksOnce sync.Once
+ // ErrPoolsAlreadyInitialized notifies the default pools are already set
+ ErrPoolsAlreadyInitialized = fmt.Errorf("predefined address pools are already initialized")
+ initNetworksOnce sync.Once
)
-// InitNetworks initializes the pre-defined networks used by the built-in IP allocator
+// PredefinedPools represent a set of address pools with prefix length Size.
+// Each pool in the set is derived from the Base pool. Base is to be passed
+// in CIDR format. The Scope field can be "local" or "global", and says
+// whether the address pool is to be used for local or global scope networks.
+// Example: a Base "10.10.0.0/16 with Size 24 will define the set of 256
+// 10.10.[0-255].0/24 address pools
+type PredefinedPools struct {
+ Scope string `json:"scope,omitempty"`
+ Base string `json:"base"`
+ Size int `json:"size"`
+}
+
+// InitNetworks initializes the local and global scope predefined adress pools
+// with the default values.
func InitNetworks() {
initNetworksOnce.Do(func() {
PredefinedBroadNetworks = initBroadPredefinedNetworks()
@@ -48,3 +63,86 @@ func initGranularPredefinedNetworks() []*net.IPNet {
}
return pl
}
+
+// InitAddressPools allows to initialize the local and global scope predefined
+// address pools to the desired values. It fails is invalid input is passed
+// or if the predefined pools were already initialized.
+func InitAddressPools(list []*PredefinedPools) error {
+ var (
+ localPools []*net.IPNet
+ globalPools []*net.IPNet
+ done bool
+ )
+
+ for _, p := range list {
+ if p == nil {
+ continue
+ }
+ _, b, err := net.ParseCIDR(p.Base)
+ if err != nil {
+ return fmt.Errorf("invalid base pool %q: %v", p.Base, err)
+ }
+ ones, _ := b.Mask.Size()
+ if p.Size <= 0 || p.Size < ones {
+ return fmt.Errorf("invalid pools size: %d", p.Size)
+ }
+ switch p.Scope {
+ case "", "local":
+ localPools = append(localPools, initPools(p.Size, b)...)
+ case "global":
+ globalPools = append(globalPools, initPools(p.Size, b)...)
+ default:
+ return fmt.Errorf("invalid pool scope: %s", p.Scope)
+ }
+ }
+
+ if localPools == nil {
+ localPools = initBroadPredefinedNetworks()
+ }
+ if globalPools == nil {
+ globalPools = initGranularPredefinedNetworks()
+ }
+
+ initNetworksOnce.Do(func() {
+ PredefinedBroadNetworks = localPools
+ PredefinedGranularNetworks = globalPools
+ done = true
+ })
+
+ if !done {
+ return ErrPoolsAlreadyInitialized
+ }
+
+ return nil
+}
+
+func initPools(size int, base *net.IPNet) []*net.IPNet {
+ one, bits := base.Mask.Size()
+ mask := net.CIDRMask(size, bits)
+ n := 1 << uint(size-one)
+ s := uint(bits - size)
+ list := make([]*net.IPNet, 0, n)
+
+ for i := 0; i < n; i++ {
+ ip := copyIP(base.IP)
+ addIntToIP(ip, uint(i<<s))
+ list = append(list, &net.IPNet{IP: ip, Mask: mask})
+ }
+
+ return list
+}
+
+func copyIP(from net.IP) net.IP {
+ ip := make([]byte, len(from))
+ for i := 0; i < len(ip); i++ {
+ ip[i] = from[i]
+ }
+ return ip
+}
+
+func addIntToIP(array []byte, ordinal uint) {
+ for i := len(array) - 1; i >= 0; i-- {
+ array[i] |= (byte)(ordinal & 0xff)
+ ordinal >>= 8
+ }
+}
From 0c0c9cbb8d85554bc195c91155ad778450b7eb86 Mon Sep 17 00:00:00 2001
From: Alessandro Boch <[email protected]>
Date: Tue, 13 Dec 2016 15:04:59 -0800
Subject: [PATCH 2/2] Allow user to control the default address pools
- Via daemon flag --default-address-pools scope=<local|global>,base=<CIDR>,size=<int>
Signed-off-by: Alessandro Boch <[email protected]>
---
cmd/dockerd/config_unix.go | 1 +
daemon/config/config_unix.go | 17 +++++++
daemon/daemon_unix.go | 4 ++
docs/reference/commandline/dockerd.md | 1 +
integration-cli/docker_cli_daemon_test.go | 30 +++++++++++
man/dockerd.8.md | 6 +++
opts/address_pools.go | 84 +++++++++++++++++++++++++++++++
7 files changed, 143 insertions(+)
create mode 100644 opts/address_pools.go
diff --git a/cmd/dockerd/config_unix.go b/cmd/dockerd/config_unix.go
index d79f0b5c9ac..6c70b462f5d 100644
--- a/cmd/dockerd/config_unix.go
+++ b/cmd/dockerd/config_unix.go
@@ -46,6 +46,7 @@ func installConfigFlags(conf *config.Config, flags *pflag.FlagSet) {
flags.Int64Var(&conf.CPURealtimeRuntime, "cpu-rt-runtime", 0, "Limit the CPU real-time runtime in microseconds")
flags.StringVar(&conf.SeccompProfile, "seccomp-profile", "", "Path to seccomp profile")
flags.Var(&conf.ShmSize, "default-shm-size", "Default shm size for containers")
+ flags.Var(opts.NewPoolsOpt(&conf.NetworkConfig.DefaultAddressPools), "default-address-pools", "Set the default address pools for local/global scope networks")
attachExperimentalFlags(conf, flags)
}
diff --git a/daemon/config/config_unix.go b/daemon/config/config_unix.go
index 8f1da59198b..87a86a8c4c2 100644
--- a/daemon/config/config_unix.go
+++ b/daemon/config/config_unix.go
@@ -7,6 +7,7 @@ import (
"github.com/docker/docker/opts"
units "github.com/docker/go-units"
+ "github.com/docker/libnetwork/ipamutils"
)
// Config defines the configuration of a docker daemon.
@@ -18,6 +19,8 @@ type Config struct {
// These fields are common to all unix platforms.
CommonUnixConfig
+ NetworkConfig
+
// Fields below here are platform specific.
CgroupParent string `json:"cgroup-parent,omitempty"`
EnableSelinuxSupport bool `json:"selinux-enabled,omitempty"`
@@ -51,6 +54,12 @@ type BridgeConfig struct {
FixedCIDRv6 string `json:"fixed-cidr-v6,omitempty"`
}
+// NetworkConfig stores the daemon-wide networking configurations
+type NetworkConfig struct {
+ // Default address pools for docker networks
+ DefaultAddressPools []*ipamutils.PredefinedPools `json:"default-address-pools,omitempty"`
+}
+
// IsSwarmCompatible defines if swarm mode can be enabled in this config
func (conf *Config) IsSwarmCompatible() error {
if conf.ClusterStore != "" || conf.ClusterAdvertise != "" {
@@ -61,3 +70,11 @@ func (conf *Config) IsSwarmCompatible() error {
}
return nil
}
+
+// ProcessPoolsConfig applies the default address pools configuration, if present
+func (conf *Config) ProcessPoolsConfig() error {
+ if len(conf.NetworkConfig.DefaultAddressPools) == 0 {
+ return nil
+ }
+ return ipamutils.InitAddressPools(conf.NetworkConfig.DefaultAddressPools)
+}
diff --git a/daemon/daemon_unix.go b/daemon/daemon_unix.go
index 6a961bae01a..f1df9af3c66 100644
--- a/daemon/daemon_unix.go
+++ b/daemon/daemon_unix.go
@@ -725,6 +725,10 @@ func configureKernelSecuritySupport(config *config.Config, driverName string) er
}
func (daemon *Daemon) initNetworkController(config *config.Config, activeSandboxes map[string]interface{}) (libnetwork.NetworkController, error) {
+ if err := config.ProcessPoolsConfig(); err != nil {
+ return nil, err
+ }
+
netOptions, err := daemon.networkOptions(config, daemon.PluginStore, activeSandboxes)
if err != nil {
return nil, err
diff --git a/docs/reference/commandline/dockerd.md b/docs/reference/commandline/dockerd.md
index 92b9349a417..e05f26ee495 100644
--- a/docs/reference/commandline/dockerd.md
+++ b/docs/reference/commandline/dockerd.md
@@ -34,6 +34,7 @@ Options:
--config-file string Daemon configuration file (default "/etc/docker/daemon.json")
--containerd string Path to containerd socket
-D, --debug Enable debug mode
+ --default-address-pools value Set the default address pools for local/global scope networks
--default-gateway value Container default gateway IPv4 address
--default-gateway-v6 value Container default gateway IPv6 address
--default-runtime string Default OCI runtime for containers (default "runc")
diff --git a/integration-cli/docker_cli_daemon_test.go b/integration-cli/docker_cli_daemon_test.go
index b7f8f0a5426..2090c436a08 100644
--- a/integration-cli/docker_cli_daemon_test.go
+++ b/integration-cli/docker_cli_daemon_test.go
@@ -944,6 +944,36 @@ func (s *DockerDaemonSuite) TestDaemonLinksIpTablesRulesWhenLinkAndUnlink(c *che
s.d.Cmd("kill", "parent")
}
+func (s *DockerDaemonSuite) TestDaemonNetworkPools(c *check.C) {
+ testRequires(c, DaemonIsLinux, SameHostDaemon)
+
+ // Remove docker0 bridge and the start daemon defining the predefined address pools
+ defaultNetworkBridge := "docker0"
+ deleteInterface(c, defaultNetworkBridge)
+ s.d.Start(c,
+ "--default-address-pools", "scope=local,base=175.30.0.0/16,size=16",
+ "--default-address-pools", "scope=local,base=175.33.0.0/16,size=24")
+
+ // Verify bridge network's subnet
+ out, err := s.d.Cmd("network", "inspect", "--format", "'{{.IPAM.Config}}'", "bridge")
+ c.Assert(err, check.IsNil)
+ c.Assert(out, checker.Contains, "175.30.0.0/16")
+
+ // Create a bridge network and verify its subnet is the second default pool
+ _, err = s.d.Cmd("network", "create", "nw100")
+ c.Assert(err, check.IsNil)
+ out, err = s.d.Cmd("network", "inspect", "--format", "'{{.IPAM.Config}}'", "nw100")
+ c.Assert(err, check.IsNil)
+ c.Assert(out, checker.Contains, "175.33.0.0/24")
+
+ // Create a bridge network and verify its subnet is the third default pool
+ _, err = s.d.Cmd("network", "create", "nw101")
+ c.Assert(err, check.IsNil)
+ out, err = s.d.Cmd("network", "inspect", "--format", "'{{.IPAM.Config}}'", "nw101")
+ c.Assert(err, check.IsNil)
+ c.Assert(out, checker.Contains, "175.33.1.0/24")
+}
+
func (s *DockerDaemonSuite) TestDaemonUlimitDefaults(c *check.C) {
testRequires(c, DaemonIsLinux)
diff --git a/man/dockerd.8.md b/man/dockerd.8.md
index 56408089d4c..c573cc87bac 100644
--- a/man/dockerd.8.md
+++ b/man/dockerd.8.md
@@ -18,6 +18,7 @@ dockerd - Enable daemon mode
[**--config-file**[=*/etc/docker/daemon.json*]]
[**--containerd**[=*SOCKET-PATH*]]
[**-D**|**--debug**]
+[**--default-address-pools**[=*DEFAULT-ADDRESS-POOLS*]]
[**--default-gateway**[=*DEFAULT-GATEWAY*]]
[**--default-gateway-v6**[=*DEFAULT-GATEWAY-V6*]]
[**--default-runtime**[=*runc*]]
@@ -155,6 +156,11 @@ $ sudo dockerd --add-runtime runc=runc --add-runtime custom=/usr/local/bin/my-ru
**-D**, **--debug**=*true*|*false*
Enable debug mode. Default is false.
+**--default-address-pools**=""
+ Default address pools from which IPAM driver selects a subnet for the networks.
+ Example: scope=[local|global],base=172.30.0.0/16,size=24 will set the default
+ address pools for the selected scope networks to {172.30.[0-255].0/24}
+
**--default-gateway**=""
IPv4 address of the container default gateway; this address must be part of
the bridge subnet (which is defined by \-b or \--bip)
diff --git a/opts/address_pools.go b/opts/address_pools.go
new file mode 100644
index 00000000000..30e4a0a051a
--- /dev/null
+++ b/opts/address_pools.go
@@ -0,0 +1,84 @@
+package opts
+
+import (
+ "encoding/csv"
+ "fmt"
+ "strconv"
+ "strings"
+
+ types "github.com/docker/libnetwork/ipamutils"
+)
+
+// PoolsOpt is a Value type for parsing the default address pools definitions
+type PoolsOpt struct {
+ values *[]*types.PredefinedPools
+}
+
+// NewPoolsOpt creates a new PoolsOpt
+func NewPoolsOpt(ref *[]*types.PredefinedPools) *PoolsOpt {
+ return &PoolsOpt{values: ref}
+}
+
+// Set predefined pools
+func (p *PoolsOpt) Set(value string) error {
+ csvReader := csv.NewReader(strings.NewReader(value))
+ fields, err := csvReader.Read()
+ if err != nil {
+ return err
+ }
+
+ poolsDef := types.PredefinedPools{}
+
+ for _, field := range fields {
+ parts := strings.SplitN(field, "=", 2)
+ if len(parts) != 2 {
+ return fmt.Errorf("ninvalid field '%s' must be a key=value pair", field)
+ }
+
+ key := strings.ToLower(parts[0])
+ value := strings.ToLower(parts[1])
+
+ switch key {
+ case "scope":
+ poolsDef.Scope = value
+ case "base":
+ poolsDef.Base = value
+ case "size":
+ size, err := strconv.Atoi(value)
+ if err != nil {
+ return fmt.Errorf("invalid size value: %q (must be integer): %v", value, err)
+ }
+ poolsDef.Size = size
+ default:
+ return fmt.Errorf("unexpected key '%s' in '%s'", key, field)
+ }
+ }
+
+ *p.values = append(*p.values, &poolsDef)
+
+ return nil
+}
+
+// Type returns the type of this option
+func (p *PoolsOpt) Type() string {
+ return "default-address-pools"
+}
+
+// String returns a string repr of this option
+func (p *PoolsOpt) String() string {
+ pools := []string{}
+ for _, pool := range *p.values {
+ repr := fmt.Sprintf("%s %s %s", pool.Scope, pool.Base, pool.Size)
+ pools = append(pools, repr)
+ }
+ return strings.Join(pools, ", ")
+}
+
+// Value returns the mounts
+func (p *PoolsOpt) Value() []*types.PredefinedPools {
+ var pd []*types.PredefinedPools
+ for _, p := range *p.values {
+ pd = append(pd, p)
+ }
+ return pd
+} |
Ops sorry ! I was cleaning my old branches, I totally forgot about this old PR. Although I was able to recover the diffs from GH tree, I cannot reopen this PR, given the original branch is lost. I'll have to open a new PR, with changes in line with the comments. |
|
Hi guyz, not sure to fully understand if a new PR is going to be created to solve that problem ? Thanks a lot ! |
|
It looks like there are no volunteers :) |
|
just another person chiming in that this a) an issue for them and b) is in need of a fix. |
|
Yep I confirm that choosing the subnet pattern for automatic network container creation is a must have or should-have-been-must-have :) Just joking but if you guyz could integrate this PR, it's very disapointing and destabilizing not to have the ability to choose the pattern (and so the size) of subnet. |
|
+1 |
|
@aboch So, you are going to open a new PR? |
|
Seriously a discussion of nearly one year for such an important fix on the daemon side without any result? To force the people to set a static range for every docker network is really a poor option. |
|
@tboerger I've just moved to Swarm, lifts pretty much all limitations of this discussion with no drawbacks. |
|
+1, would love to see this as a feature |
|
@zarbis did you manage to use swarm for CI use case too? E.g. running tests, etc. It looks like swarm is cumbersome to use for that use case. |
|
@Vanuan it's just a matter of replacing |
|
@zarbis The main difference I see is that there's no alternative for |
|
@selansen Any update? |
|
Testing is underway. There are lot of combination that I need to test and
make sure I document it.
I am on top of it but its needs more time to get it right.
…On Wed, Jan 17, 2018 at 9:15 AM, John Yani ***@***.***> wrote:
@selansen <https://github.com/selansen> Any update?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#29376 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/AKEKn0Oiinnrw1r7uwklFluWZMSIiar_ks5tLgCegaJpZM4LMWC1>
.
|
|
Hopefully we can get this into a release soon... Running lots of docker compose setups for CI on a large machine currently sucks pretty much. |
|
It looks like we should move discussion to #36054 |
|
The built-in default address pools should be documented in docs.docker.com - I believe one reason why all the docs pages have been overwhelmingly ranked negative is because of the annoying lack of information. It is like the docs pages are written by someone who has no idea how to write technical documentation. |
|
Thanks @bluikko for indicating this gap in documentation. Relayed that report to docker/docs#8663. |
- Description
When user creates a network without specifying a
--subnet, docker will pick a subnet for the network from the static set172.[17-31].0.0/16and192.168.[0-240].0/20for the local scope networks and from the static set10.[0-255].[0-255].0/24for the global scope networks.For different reasons, several users have asked to be able to control these defaults.
This PR brings in a change to allow users to control the default address pools at daemon boot.
As an example,
dockerd --default-address-pools scope=local,base=10.10.0.0/16,size=24would allow user to set the 256 pools 10.10.[0-255].0/24 as default for the local scope networks.
Multiple
--default-address-poolswithscope=[local|global]can be specified.To specify the option in the config json file:
- Notes
Fixes #21776
- A picture of a cute animal (not mandatory but encouraged)