Skip to content

Comments

Align fixed-cidr-v6 with fixed-cidr, use default ULA prefix if no fixed-cidr-v6#48319

Merged
robmry merged 10 commits intomoby:masterfrom
robmry:optional_fixed-cidr-v6
Nov 25, 2024
Merged

Align fixed-cidr-v6 with fixed-cidr, use default ULA prefix if no fixed-cidr-v6#48319
robmry merged 10 commits intomoby:masterfrom
robmry:optional_fixed-cidr-v6

Conversation

@robmry
Copy link
Contributor

@robmry robmry commented Aug 11, 2024

- What I did

  • Allow an IPv6 default bridge with no fixed-cidr-v6.
  • Fix bugs in fixed-cidr's implementation.
  • Aligned the behaviour of fixed-cidr-v6 with fixed-cidr.
  • Added daemon option --bip6 (to set the default bridge's address and subnet, like --bip).

- How I did it

The existing code's intent was a bit opaque, but seemed to be:

  • Take --bip (if supplied for a docker-managed bridge, docker0), else an existing bridge address:
    • Use it to determine the Gateway address (the bridge's address), and the PreferredPool (the docker network's subnet).
  • If fixed-cidr is provided:
    • Use it as PreferredPool, if it hasn't been determined from --bip or an existing bridge address.
    • Use it as SubPool (equivalent to --ip-range for a user-defined network, it's the range of addresses the IPAM driver can allocate from. (SubPool is expected to be within PreferredPool, but that's not enforced.)

So ...

  • Added tests to capture the current behaviour of fixed-cidr.
    • Highlighted problems using FIXME comments in the tests ... mostly confusion between daemon-managed (docker0), and user-managed (--bridge) bridges, leading to surprising and incorrect behaviour.
  • Incrementally fixed the issues.
    • Refactoring so that the logic that resolves fixed-cidr/bip/existing bridge addresses into an IPAMConf can be used for fixed-cidr-v6.
    • Accepted inconsistent config where it would have done something useful before, rather failing to start the daemon. Log problems, and try to handle things approximately as they would have been before (but with consistent IPAM config where before, for example, the requested range for address allocation could end up outside the subnet).
  • Added daemon option --bip6 because it's the only way to be sure docker0's address and subnet size are updated (similar to existing -bip for IPv4 addresses).
  • Removed the requirement for a fixed-cidr-v6 when IPv6 is enabled.
  • Used the refactored logic for IPv6, including fixed-cidr-v6.

Example of confusion between daemon/user-managed default bridge network configuration:

  • Start dockerd, get a default bridge with a subnet from default-address-pool, for example 172.18.0.0/16.
  • Restart dockerd, with fixed-cidr: "10.0.0.0/24".
  • network inspect says ...
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.18.0.0/16",
                    "IPRange": "10.0.0.0/24",
                    "Gateway": "172.18.0.1"
                }
            ]
        },
  • Note:
    • The subnet hasn't changed.
    • IPRange (addresses that can be allocated from the subnet) is outside the subnet.
  • The problem here is docker0 being treated like a user-managed bridge - the bridge's existing address takes priority, even though there's no configuration saying it can't change.
  • It skips IPAM validation because it doesn't go via createNetwork ... and (surprisingly, to me) it still allocates container addresses from 172.18.0.0/16 because the IPRange is only used to calculate an offset into Subnet.
  • So, the default bridge network still worked (as long as IPRange's subnet size doesn't cause allocations outside Subnet), and the only real problem is that it's not possible to change fixed-cidr without deleting config.

Issues with existing implementation:

  • Can't change docker0's fixed-cidr without specifying --bip (the issue described in the example above).
    • Fixed by ignoring the existing address if its subnet does not overlap fixed-cidr, or by expanding the bridge's subnet if fixed-cidr encompasses the existing subnet.
    • TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_within_fixed-cidr
    • TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_outside_fixed-cidr
  • For docker0, when there's no --bip to dictate the subnet, it's derived from an existing bridge address. If fixed-cidr's subnet is made smaller following a daemon restart, a user might reasonably expect the default bridge network's subnet to shrink to match. However, that has not been the behaviour - instead, only the allocatable range is reduced (as would happen with a user-managed bridge).
    • In this case, if the user wants a smaller subnet, options are to delete docker0, or supply a --bip.
    • A change in this subtle behaviour might be best. But, it's probably not causing problems, and it'd be a breaking change if anyone is relying on it.
    • TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet
    • (And TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet_with_bip to show that --bip sorts things out.)
  • For a user-managed bridge, selection of the existing bridge address to infer the subnet from depended on the address falling within fixed-cidr. But, that's just the allocatable range.
    • Fixed by picking an address with a subnet that encompasses fixed-cidr, even if the address itself isn't within fixed-cidr.
    • But also keep the existing behaviour that allowed fixed-cidr, the allocatable address range, to be bigger than the subnet inferred from the bridge's addresss (then fix that, as described below).
    • Test TestDaemonDefaultBridgeIPAM_UserBr/fcidr_in_bridge_subnet_and_bridge_ip_not_in_fcidr.
  • For a user-managed bridge, if fixed-cidr did not overlap with the subnet belonging to any address on the bridge, or was bigger than a bridge address's subnet (the user-managed bridge version of the docker0 example above) - any address was selected from the bridge, and the allocatable range would just be wrong.
    • Fixed by logging a warning and ignoring fixed-cidr, making the whole subnet inferred from the bridge address allocatable. (Before, because the allocator's implementation treats the sub-pool as an offset into the pool rather than an actual address range, this could result in an allocatable range within or bigger than the pool - so, this is a change in behaviour, but the configuration is broken ... I don't think it's worth trying to preserve the range that would have happened-to-work.)
    • TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_bigger_than_bridge_subnet
    • TestDaemonDefaultBridgeIPAM_UserBr/no_bridge_ip_within_fixed-cidr
    • TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_contains_bridge_subnet
  • Code that cleared the top bits of an IP address, then checked IsGlobalUnicast().
    • The intention may have been to only accept a globally routable address as the gateway address on a user-managed default bridge. But, it didn't - so, removed.
  • IPAMConfig was unnecessarily set up for IPv6 if fixed-cidr-v6 was supplied with IPv6 was disabled.
  • IPv6 IPAMConfig ignored the prefix length of an existing bridge address, and just used fixed-cidr-v6's prefix length.
    • It's IPv6, so that prefix is only informational. But, it's the only information there is! There was no way to make the allocatable ip-range smaller than the subnet.
    • Fixed by adding --bip6 to make it explicitly configurable.
    • (This is a change in behaviour, a change in fixed-cidr-v6 now behaves like a change in fixed-cidr rather than just being an override. But, that's important for a user-managed bridge, and overridable with --bip6 for docker0).

- How to verify it

New integration tests...

Running /go/src/github.com/docker/docker/integration/daemon (arm64.integration.daemon) flags=-test.v -test.timeout=10m -test.run TestDaemonDefaultBridgeIPAM
INFO: Testing against a local daemon
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/no_config
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_only
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/bip_only
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/existing_bridge_address_only
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet_with_new_bip
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_within_fixed-cidr
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_outside_fixed-cidr
=== RUN   TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_outside_fixed-cidr_with_bip
--- PASS: TestDaemonDefaultBridgeIPAM_Docker0 (5.71s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/no_config (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_only (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/bip_only (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/existing_bridge_address_only (0.64s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/fixed-cidr_within_old_bridge_subnet_with_new_bip (0.67s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_within_fixed-cidr (0.61s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_outside_fixed-cidr (0.64s)
    --- PASS: TestDaemonDefaultBridgeIPAM_Docker0/old_bridge_subnet_outside_fixed-cidr_with_bip (0.63s)
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/bridge_only
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_only
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/fcidr_in_bridge_subnet_and_bridge_ip_in_fcidr
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/fcidr_in_bridge_subnet_and_bridge_ip_not_in_fcidr
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_bigger_than_bridge_subnet
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/no_bridge_ip_within_fixed-cidr
=== RUN   TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_contains_bridge_subnet
--- PASS: TestDaemonDefaultBridgeIPAM_UserBr (4.44s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/bridge_only (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_only (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/fcidr_in_bridge_subnet_and_bridge_ip_in_fcidr (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/fcidr_in_bridge_subnet_and_bridge_ip_not_in_fcidr (0.63s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_bigger_than_bridge_subnet (0.64s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/no_bridge_ip_within_fixed-cidr (0.65s)
    --- PASS: TestDaemonDefaultBridgeIPAM_UserBr/fixed-cidr_contains_bridge_subnet (0.63s)
PASS

DONE 18 tests in 10.201s

- Description for the changelog

- Daemon option `--ipv6` (`"ipv6": true` in `daemon.json`) can now be used without
  `fixed-cidr-v6`. A subnet will be allocated `default-address-pools` or the daemon's
  ULA prefix.
- Added daemon option `--bip6=<cidr>` (`"bip6": "<cidr>") to specify the default
  bridge's IPv6 address and subnet, similar to `--bip` for IPv4.
- Resolved issues related to changing `fixed-cidr` for `docker0`, and inferring
   configuration from a user-managed default bridge (`--bridge).
- Aligned the behavior of `fixed-cidr-v6` with `fixed-cidr`.

@robmry robmry added kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny area/networking Networking area/networking/d/bridge Networking labels Aug 11, 2024
@robmry robmry self-assigned this Aug 11, 2024
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch 8 times, most recently from 9c43cd2 to 2aa8461 Compare August 12, 2024 16:10
@robmry robmry added this to the 28.0.0 milestone Aug 13, 2024
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch from 2aa8461 to adc3724 Compare August 13, 2024 14:36
@robmry robmry requested review from akerouanton and corhere August 13, 2024 14:38
@robmry robmry marked this pull request as ready for review August 13, 2024 14:38
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch 4 times, most recently from 72dc89e to 704765d Compare September 19, 2024 11:48
if hip.IsGlobalUnicast() {
ipamV4Conf.Gateway = nw.IP.String()
}
ipamV4Conf.Gateway = nw.IP.String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should log something if this IP address isn't a 'global unicast'?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it'd be too meaningful - LL addresses have been allowed by the broken check, so they're supported now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a comment be useful to document "we know, and we allow because .."?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so ... there's not really anything special about allowing link-local addresses.

Maybe, although I'm not sure, the original idea was to disallow them. But, that would have been the special case that needed explaining.

// is wrong. But, just log rather than raise an error that would cause daemon
// startup to fail, because this has been allowed by earlier versions. Remove
// the SubPool, so that addresses are allocated from the whole of PreferredPool.
log.G(context.TODO()).WithFields(log.Fields{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we should really allow this misconfiguration to slip in. I think it's fine to log something for now, but maybe we should consider barfing out in later releases (and temporarily add an escape hatch).

Your last commit reuses the same code and so makes it possible to misconfigure IPv6 the same way. We should avoid that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a pity not to have the same behaviour for both again, but I've made it an error for IPv6 and split the testcases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we should consider "deprecating" (incorrect config); possibly warn, and in future error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A warning is logged, and it's now an error for IPv6.

I'll make it log only for IPv4, and add "this may be treated as an error in a future release" to the log message ... then we can deal with it later if we want to.

}
}

if cfg.DefaultGatewayIPv4 != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This condition doesn't seem related to the rest of the function. It's probably better to keep it in initBridgeDriver.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function is getDefaultBridgeIPAMConf, and this is part of the IPAM conf. In later commits, I think it makes more sense to keep it in the function rather than deal with it separately for IPv4 and IPv6.

Comment on lines 1058 to 1066
// Select v4/v6 config based on family
fixedCIDR, fixedCIDROpt := cfg.FixedCIDR, "fixed-cidr"
cfgBIP, cfgBIPOpt := cfg.IP, "bip"
cfgDefGw, cfgDefGwOpt := cfg.DefaultGatewayIPv4, "DefaultGatewayIPv4"
if family == netlink.FAMILY_V6 {
fixedCIDR, fixedCIDROpt = cfg.FixedCIDRv6, "fixed-cidr-v6"
cfgBIP, cfgBIPOpt = cfg.IP6, "bip6"
cfgDefGw, cfgDefGwOpt = cfg.DefaultGatewayIPv6, "DefaultGatewayIPv6"
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not parse these outside of this function and pass them as arguments? I'm not a big fan of these *Opt vars.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've tarted it up a bit ... any better?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure we get it any better without touching how daemon settings are organized.

// on this network until after the driver has created the endpoint and returned the
// constructed address. Libnetwork will then reserve this address with the ipam driver.
ones, _ := fCIDRv6.Mask.Size()
deferIPv6Alloc = ones <= 80
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should also apply when there's no more --fixed-cidr-v6 / --bip option but the bridge already exists and has a subnet assigned.

Also, before that change there was no --bip6, but it should now be taken into account.

You can probably solve both issues by using ipamV6Conf.PreferredPool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, yes ... if the bridge doesn't exist, and there's no --fixed-cidr-v6/--bip, there won't be an ipamV6Conf.PreferredPool either.

Without asking for deferred IPv6 address allocation in that case too, container IPv6 addresses will be IPAM-allocated when the daemon first starts, then "MAC based" after restarts when the bridge has an address.

I think it's ok for this code to assume the daemon's default ULA prefix is big enough. (It's 64-bits, and I don't think it can be overridden separately, fixed-cidr-v6 is the way to do that?)

(This special case is grim - derive a MAC addresses from an IPAM-allocated IPv4 address, then use the MAC address to derive an IPv6 address - but only for containers in the default network, with bits of the implementation scattered betwen daemon/libnetwork/bridge driver. It'll be even worse when we move to using random MAC addresses with gratuitous ARPs/NAs, to deal with IPv6-only. We should think about dropping it.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, and added tests.

@robmry robmry force-pushed the optional_fixed-cidr-v6 branch from 704765d to 878949e Compare September 23, 2024 10:35
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch from 878949e to ff4cfe0 Compare October 31, 2024 09:25
@robmry
Copy link
Contributor Author

robmry commented Oct 31, 2024

Rebased to resolve conflicts.

@robmry robmry mentioned this pull request Nov 4, 2024
18 tasks
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch from ff4cfe0 to 8a6354d Compare November 6, 2024 12:15
@robmry
Copy link
Contributor Author

robmry commented Nov 6, 2024

Added --bip6 to man/docker.8.md in the final commit.

// outside the daemon, and it's still possible to use '--bip'. It is
// intended only for use in moby tests; it may be removed, its behaviour
// may be modified, and it may not do what you want anyway!
if bn := os.Getenv("DOCKER_TEST_CREATE_DEFAULT_BRIDGE"); bn != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for this PR, because some of this is existing but;

  1. I was confused for a second, as I thought DOCKER_TEST_CREATE_DEFAULT_BRIDGE would be a boolean, but it's to override the name to use. Perhaps a different name would be clearer (DOCKER_TEST_OVERRIDE_DEFAULT_BRIDGE_NAME or "something" indicating it's setting the name)
  2. It looks like this env-var found its way into multiple locations, which makes me wonder if it's in too many places, and if it should really be in a single place that sets "use custom name instead of the default default name".
git grep 'DOCKER_TEST_CREATE_DEFAULT_BRIDGE'
daemon/daemon_unix.go:  // Env var DOCKER_TEST_CREATE_DEFAULT_BRIDGE env var modifies the default
daemon/daemon_unix.go:  if bn := os.Getenv("DOCKER_TEST_CREATE_DEFAULT_BRIDGE"); bn != "" {
integration/daemon/daemon_linux_test.go:        d := daemon.New(t, daemon.WithEnvVars("DOCKER_TEST_CREATE_DEFAULT_BRIDGE="+bridgeName))
integration/daemon/daemon_linux_test.go:                        dOpts = append(dOpts, daemon.WithEnvVars("DOCKER_TEST_CREATE_DEFAULT_BRIDGE="+bridgeName))
libnetwork/drivers/bridge/setup_device_linux.go:        // DOCKER_TEST_CREATE_DEFAULT_BRIDGE env var. It should be used only for
libnetwork/drivers/bridge/setup_device_linux.go:        if defaultBridgeName = os.Getenv("DOCKER_TEST_CREATE_DEFAULT_BRIDGE"); defaultBridgeName == "" {

if hip.IsGlobalUnicast() {
ipamV4Conf.Gateway = nw.IP.String()
}
ipamV4Conf.Gateway = nw.IP.String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would a comment be useful to document "we know, and we allow because .."?

Comment on lines 957 to 958
_, fCIDR, err := net.ParseCIDR(cfg.FixedCIDR)
fCidrIP, fCidrNet, err := net.ParseCIDR(cfg.FixedCIDR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting; I see that we have a defBrOptsV4 option specifically for handling these options, but we end up parsing the raw cfg here.

Also not for this PR probably , but wondering if parsing these options should happen earlier in the daemon's lifecycle, so that also dockerd --validate would already catch invalid values (and possibly we'd have a strong-typed value once we arrive here.

Slightly related to that, wondering if at this point we should still refer to fixed-cider (name of the flag), or FixedCIDR instead, but I guess if we don't validate before this, we don't have an error related to what the flag (or field) is named.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defBrOptsv4 is introduced and used in the commit after this one, and the option name is included in the error message. (But I'll update the error for failing to parse --bip/--bip6 to include the option name.)

Comment on lines 968 to 969
ipamV4Conf.PreferredPool = bIPNet.String()
ipamV4Conf.Gateway = bIP.String()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was about to suggest; we should look at making this a proper type, but looks like there's already a TODO for that;

// TODO(aker): use proper net/* structs instead of string literals.
type IpamConf struct {

Comment on lines 963 to 964
bIP, bIPNet = selectBIP(nwList, fCidrIP, fCidrIPNet)
// If the bridge is docker-managed (docker0) and fixed-cidr is not
// inside a subnet belonging to any existing bridge IP, ignore the
// bridge IP.
if !userManagedBridge && fCidrIP != nil && bIPNet != nil && !bIPNet.Contains(fCidrIP) {
bIP = nil
bIPNet = nil
if !userManagedBridge && fCidrIP != nil && bIPNet != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only glancing over, but wondering if the result of selectBIP is always used here or if if should be executed conditionally.

This looks to be the only location where selectBIP is used, so perhaps some of the logic should/could be combined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If selectBIP is called, and it's able to determine an address, the result is always used (in if bIP != nil {, just after this if-else). It's only called if there's no --bip/--bip6 to override the value.

I've moved the logic to fix-up the result based on current (changed) fixed-cidr/bip options for docker0 into selectBIP. (I think that's what you're suggesting, rather than inlining selectBIP?)

// is wrong. But, just log rather than raise an error that would cause daemon
// startup to fail, because this has been allowed by earlier versions. Remove
// the SubPool, so that addresses are allocated from the whole of PreferredPool.
log.G(context.TODO()).WithFields(log.Fields{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we should consider "deprecating" (incorrect config); possibly warn, and in future error?

return o.cfg.IP, "bip"
}
func (o defBrOptsV4) defGw() (gw net.IP, optName string) {
return o.cfg.DefaultGatewayIPv4, "DefaultGatewayIPv4"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is DefaultGatewayIPV4 the name that will be used in the daemon.json ? I noticed that all other options we use hyphenated names, e.g.

DefaultGatewayIPv4 net.IP `json:"default-gateway,omitempty"`
DefaultGatewayIPv6 net.IP `json:"default-gateway-v6,omitempty"`

Also curious if it would make sense to make these options in the internal/opts package (but haven't looked too closely)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is DefaultGatewayIPV4 the name that will be used in the daemon.json ? I noticed that all other options we use hyphenated name

Ah, no - it's the label used in ipamConf.AuxAddresses[] ... which is confusing. I've made defGw return both (the optName isn't used, but I think it's good to have it there anyway).

@robmry robmry force-pushed the optional_fixed-cidr-v6 branch 2 times, most recently from 6aec9c2 to e482eb9 Compare November 11, 2024 17:36
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

libnetwork.NetworkOptionDriverOpts(netOption),
libnetwork.NetworkOptionIpam("default", "", v4Conf, v6Conf, nil),
libnetwork.NetworkOptionDeferIPv6Alloc(deferIPv6Alloc))
return []*libnetwork.IpamConf{ipamConf}, nil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably fine to look for a future change, but my eye dropped on the []*libnetwork.IpamConf{} as type used;

Becuase I'm about to add tests that use netlink, and the netlink
package breaks compilation under Windows.

Signed-off-by: Rob Murray <[email protected]>
The intention may have been to only accept a globally routable
address as the gateway address on a user-supplied default bridge.
But, the test didn't do anything, so it's been allowing LL
subnets. It's too late to change that now so, remove the broken
check.

Signed-off-by: Rob Murray <[email protected]>
When a user-managed bridge is used instead of docker0 (--bridge), with
a fixed-cidr - the bridge should have an IP address/subnet that
encompasses fixed-cidr ... the bridge address's subnet then defines
the network's subnet, and fixed-cidr defines the allocatable range
within that.

But, selection of the correct subnet/address from the bridge depended
on the address being within fixed-cidr (within the allocatable range).

This change removes that assumption. So, a bridge address with a
subnet that includes fixed-cidr is selected.

Signed-off-by: Rob Murray <[email protected]>
Factor out selection of addresses from an existing bridge from
the code that uses the selected address to set up IPAM for the
default bridge.

Signed-off-by: Rob Murray <[email protected]>
When a docker-managed default bridge (docker0) already has an
address, and the fixed-cidr subnet fits within the subnet defined
by that address, the existing address should be used as the
gateway and to define the subnet.

But, when fixed-cidr is changed, no --bip is supplied, and no
existing bridge network includes fixed-cidr ... the existing
bridge address needs to be updated.

Signed-off-by: Rob Murray <[email protected]>
For a docker-managed default bridge (docker0), when no --bip is
supplied, the gateway address and subnet size can be inferred
from existing bridge addresses.

But, if fixed-cidr's subnet size is increased so that it's biggger
than the subnet of the bridge's existing address - the bridge's
subnet needs to be incresed to match. (fixed-cidr determines the
range of addresses that can be automatically allocated, and these
should not fall outside the default bridge's subnet.)

Signed-off-by: Rob Murray <[email protected]>
When a user-managed bridge is used for the default network (--bridge),
an address from the bridge determines the subnet for the network.

If a fixed-cidr is supplied, it should fall within that subnet. If it
doesn't, it's a misconfiguration - fixed-cidr is the range of
allocatable addresses, and they need to be in the network. (Either
the user's bridge is missing an address that matches their fixed-cidr
or the fixed-cidr is wrong.)

When this happens, because it's been allowed in the past (and, because
the address-pool implementation treats fixed-cidr/SubPool as an offset
into the network rather than an actual address range, so working IP
addresses would normally still be assigned to containers) ... don't
reject the config and cause daemon startup to fail. Just log a warning
and ignore fixed-cidr.

Signed-off-by: Rob Murray <[email protected]>
Use the same logic to generate IPAMConf for IPv6 as for IPv4.

- When no fixed-cidr-v6 is specified, rather than error out, use
  the default address pools (as for an IPv4 default bridge with no
  fixed-cidr, and as for user-defined networks).
- Add daemon option --bip6, similar to --bip.
  - Necessary because it's the only way to override an old address
    on docker0 (daemon-managed default bridge), as illustrated by
    test cases.
- For a user-managed default bridge (--bridge), use IPv6 addresses
  on the user's bridge to determine the pool, sub-pool and gateway.
  Following the same rules as IPv4.
- Don't set up IPv6 IPAMConf if IPv6 is not enabled.

Signed-off-by: Rob Murray <[email protected]>
@robmry robmry force-pushed the optional_fixed-cidr-v6 branch from e482eb9 to 0b5b1db Compare November 25, 2024 18:29
@robmry
Copy link
Contributor Author

robmry commented Nov 25, 2024

Rebased - the conflict was just in imports in bridge_linux_test.go ... I'll let the tests run in case anything else shows up, as it was a fair way behind master.

@robmry robmry merged commit 2f6953f into moby:master Nov 25, 2024
@robmry robmry deleted the optional_fixed-cidr-v6 branch November 26, 2024 09:10
aevesdocker pushed a commit to docker/docs that referenced this pull request Feb 20, 2025
## Description

Updates for moby 28.0 networking.

## Related issues or tickets

Series of commits ...
- Fix description of 'inhibit_ipv4'
- Not changed in moby 28.0, updated to clarify difference from (new)
IPv6-only networks.
- Updates to default bridge address config
  - moby/moby#48319
- Describe IPv6-only network config
  - moby/moby#48271
  - docker/cli#5599
- Update description of gateway modes
  - moby/moby#48594
  - moby/moby#48596
  - moby/moby#48597
- Describe gateway selection in the networking overview.
  - docker/cli#5664
- Describe gateway mode `isolated`
  - moby/moby#49262

## Reviews

<!-- Notes for reviewers here -->
<!-- List applicable reviews (optionally @tag reviewers) -->

- [ ] Technical review
- [ ] Editorial review
- [ ] Product review

---------

Signed-off-by: Rob Murray <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/networking/d/bridge Networking area/networking Networking docs/revisit kind/feature Functionality or other elements that the project doesn't currently have. Features are new and shiny

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Use default ULA prefix if fixed-cidr-v6 is not specified

3 participants