api: add GwPriority field to EndpointSettings#48936
Conversation
635e18c to
6160568
Compare
| * `POST /networks/{id}/connect` and `POST /containers/create` now accept a | ||
| `Priority` field in `EndpointsConfig`. This value is used to determine which | ||
| network endpoint provides the default gateway for the container. The endpoint | ||
| with the highest priority is selected. If multiple endpoints have the same | ||
| priority, endpoints are sorted lexicographically by their network name, and | ||
| the one that sorts first is picked. |
There was a problem hiding this comment.
Silly question (ISTR @robmry also mentioned something along those lines); if the intent of this option is to determine the default gateway, would it make sense to instead consider adding a new option to define which network should be used for this?
Basically considering that the order itself was a workaround for not being able to configure the default; it seems like now we're improving the workaround, but perhaps we could implement the actual thing that it's used for somehow?
There was a problem hiding this comment.
Also curious what the behavior is when connecting to a new network with a higher priority; will it update the existing config to reconfigure the default gateway, or does this only happen when the container is initially created?
There was a problem hiding this comment.
if the intent of this option is to determine the default gateway
That's what it is.
would it make sense to instead consider adding a new option to define which network should be used for this?
Such a new option would have a major drawback: since you can't update a container, your container would have no gateway and loose access to the outside world when you'd disconnect it from the network providing the default gateway.
Another drawback: you couldn't use docker network connect to attach a network that should become the default gateway as the parameter you're ideating would be attached to the container and would need to be valid (ie. reference an existing endpoint) when the container is started.
Also curious what the behavior is when connecting to a new network with a higher priority; will it update the existing config to reconfigure the default gateway
Yup, that's what's happening -- libnetwork already does that automatically when the gateway changes for any reason (eg. because the container was not connected to a non-internal network, or because the network getting connected sorts before in lexicographic order).
This gives the ability to end-users to choose which network should be the default gateway without worrying about the order of calls made to the NetworkConnect endpoint.
On NetworkDisconnect the endpoint with the highest priority or sorting first in lexicographic order will be used.
Basically considering that the order itself was a workaround for not being able to configure the default
You mean, the lexicographic sorting? It's not going away, but with this change it's neither the only nor the recommended way to set the default gateway.
There was a problem hiding this comment.
Yeah, so considering that (and this would already be the current situation); in the following case, project_network would become the default gateway; this would (e.g.) be the project's default network used in a compose-stack;
docker network connect project_network containerBut if the container now is later connected to global_network, that one (sorting before project_network) would now become the default gateway; this can be a common scenario for users using (e.g.) traefik or nginx-proxy as proxy to map domains;
docker network connect global_network containerSo to counteract that, I would need to disconnect from the project_network, find a value that's higher than any of the existing networks it's connected to, and re-connect to that network. While a use this network as default gateway option is not ideal, it may give a slightly stronger indication "this is what to use";
- if no connected network as
use this network as default gatewayoption is set, we would still use the current (sort networks by name) - if a network has the option set though, then we use that as the default (priority over any other ordering)
- ❓ we could disallow connecting if there's already a network connected that's configured as default gateway
- if that network would be disconnected, we'd probably have to fall back to the "sort by name" approach; I could see use-cases where disconnecting would mean "no gateway", but that would be trickier to implement probably.
There was a problem hiding this comment.
A priority value seems better to me - because it's clear what will happen when the current gateway network is disconnected, or a new/desired-gateway network is connected to a running container.
If a specific network should always be the gateway, only that network will need a priority value. In that case, setting "priority=1" is equivalent to saying "this is the gateway" - no more or less confusing, but the corner cases or wanting to switch gateway networks are easier to deal with. (A fixed choice of gateway network probably the common case though - after not having a choice of gateway networks, or it not mattering which network is used.)
In the global/project_network example ... if the user wants to keep project_network as the gateway, they can either give it a high priority when creating the container, then not specify a priority for other networks. Or, specify a negative priority (I think, maybe worth a testcase?) when connecting global_network. If they want global_network to take over, they just use a higher priority.
There was a problem hiding this comment.
If I want a dual stack network acting as the IPv4 gateway, and an IPv6 only network acting as the v6 gateway ...
At the moment, even with the new priority field, the dual stack network will be the gateway for both. But, with priority plumbed in we could/should change that, and pick the v4 and v6 endpoints separately.
Then, if I give the v6 network higher priority than the dual stack, and the dual stack network higher priority than any other networks - it'll do what I want.
We can make that change in a separate PR - but I think it strengthens the case for priority values. Without them, we'd need separate options for selecting v4 and v6 gateways.
There was a problem hiding this comment.
FWIW I added a negative priority to TestConnectWithPriority, and it works as expected (the endpoint with the default priority stays the default gateway).
There was a problem hiding this comment.
We can make that change in a separate PR - but I think it strengthens the case for priority values. Without them, we'd need separate options for selecting v4 and v6 gateways.
Although ... if I have two dual-stack networks, and want one to be the IPv4 gateway and the other the IPv6 gateway - we can't describe that with a priority field. It'd have to be done with two priority values, or two "this is the gateway" flags. So, that doesn't help!
There was a problem hiding this comment.
Although ... if I have two dual-stack networks, and want one to be the IPv4 gateway and the other the IPv6 gateway - we can't describe that with a priority field. It'd have to be done with two priority values, or two "this is the gateway" flags. So, that doesn't help!
You can still split your network into a v4-only, and a v6-only network for that. I think we shouldn't go too far trying to handle edge-cases for an option that's not going to be used by the masses anyway.
robmry
left a comment
There was a problem hiding this comment.
Did you consider a "driver option" way of setting the gateway priority, so that it can be set using an old (current) version of Compose that doesn't know about the new top-level field?
Something like com.docker.network.endpoint.gwpriority ... which we have for macaddress, exposedports, dnsservers, sysctls.
I'm not sure it's needed, but maybe worth ruling out.
api/swagger.yaml
Outdated
| example: | ||
| com.example.some-label: "some-value" | ||
| com.example.some-other-label: "some-other-value" | ||
| Priority: |
There was a problem hiding this comment.
I wonder if we should call it GatewayPriority or GwPriority, to make it clear what it controls - specifically, that it doesn't affect which order the endpoints are connected in (and therefore how network interfaces are numbered in the container)... but that's what people might expect because of Compose's Priority field. We might want to add a JoinPriority or something like that as well?
There was a problem hiding this comment.
We might want to add a JoinPriority or something like that as well?
I don't think that's necessary. Users who care about interface names should rather explicitly set the interface name (I have a wip branch for that). A JoinPriority would implicitly define the interface name, and it wouldn't guarantee that ifnames are stable across restarts (eg. if NetworkConnect is called after the initial ContainerStart with a priority higher than existing endpoints).
There was a problem hiding this comment.
(and +1 on renaming this field into something more explicit)
| for _, epConfig := range networkingConfig.EndpointsConfig { | ||
| // Before 1.48, all endpoints had the same priority, so | ||
| // reinitialize this field. | ||
| epConfig.Priority = 0 |
There was a problem hiding this comment.
Maybe this should be an error, so we don't get bugs about the feature not working if someone forgot to update their client's API version number?
There's no good reason for the field to be present in a request from an older client, and the default is the old behaviour - so it's not like some of the other options where we need to set config to mimic old behaviour. (If there is a Priority field in an old-API request, returning an error would be a breaking change. I'm not sure how strict we need to be about that.)
It's like VolumeOptions.Subpath, which causes an error rather than getting ignored. But, it's also like Healthcheck.StartInterval and we zero that if it's set in an old-API request. So, it's a bit mixed!
There was a problem hiding this comment.
I'm not sure about VolumeOptions.Subpath, but in general I think we prefer resetting a field that shouldn't be usable with a given API version. This ensures that a new field is handled the same way on new and old engines.
We discussed this on the networking maintainers call ... and decided it isn't needed. Without it, the priority field won't be usable without compose/cli updates. So, we'll just have to get those done (which hopefully we'd be able to anyway). |
6160568 to
bfc074d
Compare
Signed-off-by: Albin Kerouanton <[email protected]>
This new field is used by libnetwork to determine which endpoint provides the default gateway for a container. Signed-off-by: Albin Kerouanton <[email protected]>
bfc074d to
5b752fa
Compare
|
|
||
| var joinOptions []libnetwork.EndpointOption | ||
| joinOptions := []libnetwork.EndpointOption{ | ||
| libnetwork.JoinOptionPriority(epConfig.GwPriority), |
There was a problem hiding this comment.
The hardest part reviewing this PR was to check "how does the sorting work?", and OHBOY things are libnetwork confusing; so
- the GwPriority is set on the endpoint config
- but set on the sandbox that it belongs to, indexed by endpoint-ID
Line 1247 in 71b93a8
So when does sorting happen?
This happens when an endpoint is added to a sandbox, it is injected in the right place in the list of endpoints attached to the sandbox
Lines 317 to 320 in 71b93a8
☝️ this is relevant, because it means that endpoints MUST go through that function to be added to the sandbox (just appending to the list won't put them in the right order)
But (prepare for a bumpy road here) 👇
The ordering happens by;
- each endpoint to get the sandbox it belongs to
- ❓ as they're both either already connected, or to be connected to the sandbox, wouldn't that sandbox always be the same???
- 👉 even more fun; to get the sandbox;
- the endpoint gets the network it belongs to
- from which it gets the controller the network belongs to
- which has a map of sandbox-ID -> sandbox
- which is used to get the sandbox (which ... may be the sandbox we're coming from in the first place?)
Lines 1086 to 1097 in 71b93a8
After we have the sandbox, we can look up what priority was stored for the endpoint .. for that sandbox .. so that we can (if set) sort;
Lines 664 to 669 in 71b93a8
After which we do other things again, like (AGAIN) getting the network that the endpoint belongs to, to see if that network "internal" (which is also taken into account for priority);
Lines 681 to 683 in 71b93a8
Fixes:
Supersedes:
- What I did
1st commit is just a minor refactoring.
2nd commit adds a new
GwPriorityfield toEndpointSettings. It can be specified both during initialContainerCreateor during subsequentNetworkConnect. This field is passed as is to libnetwork. Libnet already supports this property -- it'll use it to determine which endpoint should provide the default gateway for a container. If two endpoints have the same priority, they're sorted in lexicographic order, and the one sorted first is picked.- How I did it
- How to verify it
- Description for the changelog
- Add a way to specify which network should provide the default gateway for a container.