Remove attachable network on swarm leave#30157
tiborvass merged 1 commit into moby:master from aboch:att
Conversation
Seems like it's the right thing to do... what happens to the networking in the container when this happens? Also, build failure:
Thanks @cpuguy83
The network interface which was connecting the container to the attachable network is removed.
daemon/network.go (Outdated)
ping @mavenugo
daemon/network.go (Outdated)
Should this use network-id instead of network-name?
Not necessarily. This API can take either name or ID (it eventually calls FindNetwork()); it was in fact the first function to process a network disconnect POST request. I say "it was" because, later, with the addition of swarm networks being lazily programmed, FindNetwork had to be called beforehand anyway.
Edit: ^^ Looks like that FindNetwork call in postNetworkDisconnect() has now been removed from master as well (it was there in the 1.11.x branch, where I was mistakenly looking).
In this case the network is known to this node and there is no need to pass the ID.
But I agree it will spare the reader from wondering about this logic when reading the code, so I will change it to the ID.
Thanks.
I recall a case where both a "bridge" and an "overlay" network existed on a node (due to a race?), resulting in the service failing to start. Although a corner case, removing the network by name could potentially result in the wrong network being deleted.
Ah, then it makes sense.
If the user created that scenario, it is expected that they handle things correctly when calling docker network rm ....
But in this case, being an automatic action taken by the daemon, it must delete the right network, with no user intervention.
So the right thing to do here is to pass the network ID as you suggested. Thanks for your comment.
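The ambiguity discussed above can be illustrated with a small sketch, assuming a hypothetical in-memory model where two networks share a name (as in the bridge/overlay race described) but have unique IDs; none of these names are the daemon's real types:

```go
package main

import "fmt"

// Hypothetical model: names can collide, IDs cannot, so automatic
// deletion by the daemon must key on ID.
type network struct {
	ID, Name, Driver string
}

// deleteByID removes exactly the network with the given ID, never an
// unrelated network that happens to share the same name.
func deleteByID(nets []network, id string) []network {
	var out []network
	for _, n := range nets {
		if n.ID != id {
			out = append(out, n)
		}
	}
	return out
}

func main() {
	nets := []network{
		{ID: "aaa111", Name: "mynet", Driver: "bridge"},
		{ID: "bbb222", Name: "mynet", Driver: "overlay"},
	}
	// Unambiguous even though both networks are named "mynet".
	nets = deleteByID(nets, "bbb222")
	fmt.Println(len(nets), nets[0].Driver) // 1 bridge
}
```

Deleting by name here could match either entry; deleting by ID always removes the intended one.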
LGTM @aboch. Curious to know how this will behave in a case where all the swarm managers leave, while the worker node (with an attachable network and containers) stays? I assume it will fix itself when the worker rejoins new managers. Can you please confirm?
@mavenugo Not sure if this is the test you wanted to check. Is the worker node going into a failed state if we wait long enough and the cluster provider is set to nil?
@aboch yes. I think that is reasonable and expected behavior. I am good.
daemon/network.go (Outdated)
Wondering, should we combine the two loops, and simply:

```go
for _, n := range daemon.GetNetworks() {
	if !n.Info().Attachable() {
		continue
	}
```

Or is there a special reason for using the intermediate `networks` variable?
Thanks, it in fact looks like GetNetworks() returns a new slice of networks.
Then there is no need to create a new slice here.
Will fix it shortly. Thanks.
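The single-loop shape suggested above can be sketched as follows; this is an illustrative stand-in (the `netInfo` type and `attachableNames` helper are hypothetical, not the daemon's real API), showing that when the source slice is already a fresh copy, filtering can happen in one pass with no intermediate slice:

```go
package main

import "fmt"

// Illustrative stand-in for the daemon's network info.
type netInfo struct {
	name       string
	attachable bool
}

// attachableNames walks the snapshot once, skipping non-attachable
// networks, instead of first building an intermediate filtered slice.
func attachableNames(all []netInfo) []string {
	var out []string
	for _, n := range all {
		if !n.attachable {
			continue
		}
		out = append(out, n.name)
	}
	return out
}

func main() {
	all := []netInfo{{"ingress", false}, {"mynet", true}}
	fmt.Println(attachableNames(all)) // [mynet]
}
```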
@thaJeztah Taken care of your comment, and retested it, thanks. PTAL when you get a chance.
@aboch Can
Thanks @tonistiigi Unless I missed it, Also, I do not think this fix will go in 1.13.x
LGTM
@dnephin Addressed your comment, PTAL (I am having some weird build error, so I just pushed the change to speed up the review) |
LGTM Thank you!
- When the node leaves the cluster, if any user-run container(s) is
  connected to the swarm (attachable) network, then the daemon needs
  to detach the container(s) and remove the network.

Fixes #30084

Signed-off-by: Alessandro Boch <[email protected]>
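The cleanup sequence the commit message describes can be sketched minimally as below. This is a hypothetical model, not the daemon's code: the `network` type and `cleanupOnLeave` function are illustrative stand-ins showing the order of operations (detach every connected container from each attachable network, then remove the network itself):

```go
package main

import "fmt"

// Hypothetical minimal model of swarm-leave cleanup.
type network struct {
	name       string
	attachable bool
	containers []string
}

// cleanupOnLeave detaches containers from attachable networks and
// collects the networks to remove; non-attachable networks are skipped.
func cleanupOnLeave(nets []network) (detached int, removed []string) {
	for _, n := range nets {
		if !n.attachable {
			continue
		}
		// Stand-in for disconnecting each container before removal.
		detached += len(n.containers)
		removed = append(removed, n.name)
	}
	return detached, removed
}

func main() {
	nets := []network{
		{name: "ingress", attachable: false, containers: []string{"lb"}},
		{name: "mynet", attachable: true, containers: []string{"c1", "c2"}},
	}
	d, r := cleanupOnLeave(nets)
	fmt.Println(d, r) // 2 [mynet]
}
```

Detaching before removal matters: a network with endpoints still attached cannot be deleted cleanly.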