use (sort of) happy-eyeballs for port-forwarding#5145
use (sort of) happy-eyeballs for port-forwarding#5145mikebrow merged 1 commit intocontainerd:masterfrom
Conversation
|
Hi @aojea. Thanks for your PR. I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
Build succeeded.
|
|
/hold |
|
Build succeeded.
|
|
/retest |
|
@aojea: Cannot trigger testing until a trusted user reviews the PR and leaves an DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
Build succeeded.
|
|
/ok-to-test |
|
/retest |
1 similar comment
|
/retest |
|
failures seems unrelated /assign @mikebrow |
|
yes the issue with the failure was dockerhub rate limiting... |
There was a problem hiding this comment.
if negative will set to 300ms..? https://github.com/golang/go/blob/master/src/net/dial.go#L163
could you link to the code / behavior you found. I'm concerned it may still be flaky and just worked for you based on local timing of your network stack..
|
The negative fallback disables the dualstack algorithm that spawn goroutines golang/go#44922 This way it dials in serial to the different resolved ips, maybe we can set deadline or.timeout too 🤔 |
|
yes.. been reading the code.. seems it will try ipv6 first for the "delay" (300ms? if set to negative or zero...) then try the fallback to ipv4... so if dual and not configured correctly for ipv6 we have a 300ms delay by default trying ipv6.. I'm thinking if it does timeout with ipv6 it will do so every time unless it's just that port ip combo... hmm... maybe some first time logic.. |
|
note: however we fix it here we should reflect that over to the _windows code |
I'm not reading it that way, I think that the
I checked the windows code and is a different situation, the "happy-eyeballs" resolution in windows depends on the wincat.exe binary and IIUIC the process is forked so it doesn't have this problem we have with Linux. containerd/pkg/cri/server/sandbox_portforward_windows.go Lines 38 to 41 in f7bd43c I don't have a windows environment, but seems we need to test if wincat.exe implement happy eyeballs, if affirmative there shouldn't be any problem. I wonder if we considered flaky the behaviour but it was always golang/go#44922 the root cause. |
You are correct, thank you for the explanation! I missed the check here.. https://github.com/golang/go/blob/3b0d28808df261747d7561badf91498bbb5d3e3e/src/net/dial.go#L417-L432
could you link to the limitations you are referring to.. Are serial timeouts acceptable? |
I documented them in the issue opened in golang golang/go#44922 Short history first, this code was "dialing" out of the namespace, when we were expecting to dial inside of the namespace
yes, or it should be, this is part of the "IPv6 brokeness" thing, you can read more about it here https://en.wikipedia.org/wiki/IPv6_brokenness_and_DNS_whitelisting , OSs have improved a lot the network stack since then, but you depend on the network stack ...
we can do the DialSerial "out" of the library and default to IPv4, is not a big change and I think that you'll be more confident with this change: |
I'd be much more confident with that ^ as the new default pattern, yes. And we could have a config for swapping that to tcp6 first, so make it a list and range through it maybe. Of course I agree timeout from one to the other is necessary, but I have a concern with switching it to 6 first and having a performance issue due to timeout and not having a config to set it back to ipv4 first. We could let them use "tcp" to mean try both in parallel with the default fast timeout..? and we could have a timeout config field. On the other issue:
Ok, so not a current issue. |
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs) by default in 1.12. It means that if a host resolves to both IPv6 and IPv4, it will try to connect to any of those addresses and use the working connection. However, the implementation uses go routines to start both connections in parallel, and this has limitations when running inside a namespace, so we try to the connections serially, trying IPv4 first for keeping the same behaviour. xref golang/go#44922 Signed-off-by: Antonio Ojea <[email protected]>
|
I think that this is the most simple change, just keeps the old behaviour and falls back to the IPv6 address. |
SGTM let's use the
nit.. on the returned err let's make sure to keep a copy of the failed tcp4 attempt and return both errors. This to avoid confusion as to why we are only trying tcp6 :-) |
it keeps both errors in the implementation, I created a new error variable to not shadow the original error |
all current versions of tanzu are using a new-enough version of containerd (v1.5.0 or greater) that we can remove this workaround the issue this addressed was fixed in containerd/containerd#5145 Signed-off-by: Aidan Obley <[email protected]> Co-authored-by: Christian Ang <[email protected]>
all current versions of tanzu are using a new-enough version of containerd (v1.5.0 or greater) that we can remove this workaround the issue this addressed was fixed in containerd/containerd#5145 Signed-off-by: Aidan Obley <[email protected]> Co-authored-by: Christian Ang <[email protected]>
all current versions of tanzu are using a new-enough version of containerd (v1.5.0 or greater) that we can remove this workaround the issue this addressed was fixed in containerd/containerd#5145 Signed-off-by: Aidan Obley <[email protected]> Co-authored-by: Christian Ang <[email protected]>
port-forward needs to open a connection inside the namespace to the localhost address.
localhost can resolve to both IPv4 and IPv6 addresses in dual-stack systems
but the application can be listening in one of the IP families only.
golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12. It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially disabling the Fast Fallback support.
xref golang/go#44922
Signed-off-by: Antonio Ojea [email protected]
I've tested it with this snippet
running it inside a pod
xref: kubernetes/kubernetes#99850