Description
github.com/vishvananda/netlink was updated to v1.3.0 in #46982, but the update made CI flaky, with failures such as:
Error initializing network controller: list bridge addresses failed: interrupted system call
At first glance, this was attributed to a missing retry condition for EINTR; see #46982 (comment):
EINTR on netlink sockets is a new one. I suspect it has more to do with the netlink dependency bump you pulled in when rebasing than on the Go toolchain bump. I think the bug is here: https://github.com/vishvananda/netlink/blob/92645823f36c7ed03faf4baa566078d9d5e06fda/nl/nl_linux.go#L821-L824 It retries on EWOULDBLOCK (a.k.a. EAGAIN) but neglects to retry on EINTR.
However, the cause may instead be our use of SetSocketTimeout; see vishvananda/netlink#793 (comment):
IO calls on non-blocking sockets will never return -EINTR. The problem here is that Moby calls SetSocketTimeout, which sets SO_SNDTIMEO and SO_RCVTIMEO. These socket options are only useful for sockets in blocking mode. Setting these probably places the socket back into blocking mode.

    // Init initializes a new network namespace
    func Init() {
    	var err error
    	initNs, err = netns.Get()
    	if err != nil {
    		log.G(context.TODO()).Errorf("could not get initial namespace: %v", err)
    	}
    	initNl, err = netlink.NewHandle(getSupportedNlFamilies()...)
    	if err != nil {
    		log.G(context.TODO()).Errorf("could not create netlink handle on initial namespace: %v", err)
    	}
    	err = initNl.SetSocketTimeout(NetlinkSocketsTimeout)
    	if err != nil {
    		log.G(context.TODO()).Warnf("Failed to set the timeout on the default netlink handle sockets: %v", err)
    	}
    }

https://github.com/vishvananda/netlink/blob/92645823f36c7ed03faf4baa566078d9d5e06fda/nl/nl_linux.go#L848-L860
I'm not very familiar with this project, but it seems to me that before this PR, only blocking IO is used. This PR adds calls to set the socket as non-blocking, but still allows setting the timeout socket options.
Those SetSocketTimeout uses were added in moby/libnetwork@f459afb
to address
Also related: