Skip to content

Vendoring libnetwork @6caf9022fa09#26844

Merged
cpuguy83 merged 2 commits intomoby:masterfrom
mrjana:net
Sep 23, 2016
Merged

Vendoring libnetwork @6caf9022fa09#26844
cpuguy83 merged 2 commits intomoby:masterfrom
mrjana:net

Conversation

@icecrime
Copy link
Contributor

That's a lot of "Fixes #xxxx" 😉

LGTM if green!

@LK4D4
Copy link
Contributor

LK4D4 commented Sep 23, 2016

nice

@LK4D4
Copy link
Contributor

LK4D4 commented Sep 23, 2016

@mrjana

03:42:17 vendor/src/github.com/docker/libnetwork/drivers/windows/windows.go:485: endpointStruct.EnableInternalDNS undefined (type *hcsshim.HNSEndpoint has no field or method EnableInternalDNS)

on janky

@mavenugo
Copy link
Contributor

mavenugo commented Sep 23, 2016

I think we should wait for #25987 to be merged and then rebase this PR to get rid of the build error.

@thaJeztah
Copy link
Member

@mrjana @mavenugo I see that also contained fixes for #25848, #25608, and #26075 as well; two of those are on the 1.12.2 milestone (perhaps the first one should be as well?). Can we make sure that the required patches for those get in the 1.12.2 branch?

@mavenugo
Copy link
Contributor

mavenugo commented Sep 23, 2016

@thaJeztah yes. the patches for the 3 mentioned issues came in via the vendoring done for #25987 . We are cherry-picking all the relevant 1.12.2 targeted patches into libnetwork/v08 branch. The v0.8 branch will be vendored into the 1.12.x bump branch.

@thaJeztah
Copy link
Member

@mavenugo thanks for confirming (sorry for the noise, just making sure we're not missing any) ❤️

I'm adding #25848 back to the 1.12.2 milestone

@mavenugo
Copy link
Contributor

mavenugo commented Sep 23, 2016

@mrjana validating the fix for #25372 I observed the error message :
ERRO[0097] Error in agentInit : failed to create memberlist: Failed to start TCP listener. Err: listen tcp 10.0.2.15:7946: bind: address already in use

reproduction step -

$ sudo docker swarm init --advertise-addr=eth1
$ sudo docker swarm leave --force
$ sudo docker swarm init --advertise-addr=eth1 --listen-addr=eth0

I think the swarm leave is not resulting in memberlist leave.

Edit :

Could be related to something in my setup. I see this error during leave and hence memberlist shutdown is not being called in networkDB.clusterLeave

ERRO[0980] Could not close DB Ubuntu-vm-0b36f759e0a8: failed to send node leave: timed out broadcasting node event

@mavenugo
Copy link
Contributor

@mrjana Yes. this seems to be setup related. and it seems to work just fine in another VM.


var filterChan = make(chan struct{}, 1)

func filterWait() func() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole thing looks really scary and weird.

Can you please explain this? Why are we doing it?
We are only ever calling the function returned by filterWait immediately after getting it. How is this safe for concurrency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a pattern to limit only one go routine to execute in that function at any point in time. The pattern works like this: When you call filterWait you try to write to a buffered channel of size 1. If nobody has written anything there the write will go through and you are allowed to execute the function. And filterWait returns a closure which will read from that channel which you call in defer i.e you read from the channel what you wrote when you are done. The channel buffer size determines how much concurrency you want. In this case you want only one single go routine to concurrently execute the caller function. You can use the same pattern to limit the concurrency to any number you want.

In this particular case it is to serialize the iptable plumbing function to be executed by only one go routine at any time to fix any race when multiple go routines try to execute these functions concurrently.

In general, I try to avoid using mutex for code serialization as it more often than not results in deadlocks. I use mutexes only to protect data not code which is exactly how they should be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it has the same chance of a dealock and race as a mutex because there is no way to try or cancel. This is basically non-obvious, slow mutex.
However, it's probably up to @mrjana to decide what to use in libnetwork :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

@cpuguy83
Copy link
Member

LGTM

@cpuguy83 cpuguy83 merged commit 2c41c02 into moby:master Sep 23, 2016
@mrjana mrjana added this to the 1.12.2 milestone Sep 23, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants