Skip to content

use (sort of) happy-eyeballs for port-forwarding#5145

Merged
mikebrow merged 1 commit intocontainerd:masterfrom
aojea:happyeyeballs
Mar 26, 2021
Merged

use (sort of) happy-eyeballs for port-forwarding#5145
mikebrow merged 1 commit intocontainerd:masterfrom
aojea:happyeyeballs

Conversation

@aojea
Copy link
Copy Markdown
Contributor

@aojea aojea commented Mar 9, 2021

port-forward needs to open a connection inside the namespace to the localhost address.

localhost can resolve to both IPv4 and IPv6 addresses in dual-stack systems
but the application can be listening in one of the IP families only.

golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12. It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.

However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially disabling the Fast Fallback support.
xref golang/go#44922

Signed-off-by: Antonio Ojea [email protected]

I've tested it with this snippet

func main() {
	fmt.Println("GOOS:", runtime.GOOS)
	// Resolver
	addrs, err := net.LookupHost("localhost")
	fmt.Println("net.LookupHost addrs:", addrs, "err:", err)

	go listener("127.0.0.1", "8080")
	// go listener("[::1]", "8080")
	time.Sleep(1 * time.Second)

	// Dial
	conn, err := net.Dial("tcp", "localhost:8080")
	if err != nil {
		fmt.Println("Error connecting", err.Error())
		return
	}
	defer conn.Close()
	conn.Write([]byte("SYN"))
	fmt.Println("Connecting to: ", conn.RemoteAddr().String())
	buf := make([]byte, 1024)
	_, err = conn.Read(buf)
	if err != nil {
		fmt.Println("Error reading:", err.Error())
	}
	fmt.Println("Client Received", string(buf))
}

func listener(address, port string) {
	// open a server listening to establish the TCP connections
	listener, err := net.Listen("tcp", address+":"+port)
	if err != nil {
		panic(err)
	}
	defer listener.Close()
	listenerAddress := listener.Addr().String()
	fmt.Println("Listener: ", listenerAddress)

	for {
		conn, err := listener.Accept()
		if err != nil {
			panic(err)
		}

		buf := make([]byte, 1024)
		_, err = conn.Read(buf)
		if err != nil {
			fmt.Println("Error reading:", err.Error())
		}
		fmt.Println("Server Received", string(buf))
		conn.Write([]byte("ACK"))
		conn.Close()
	}
}

running it inside a pod

# ./resolve_localhost
GOOS: linux
net.LookupHost addrs: [127.0.0.1 ::1] err: <nil>
Listener:  127.0.0.1:8080
DEBUG: primaries [[::1]:8080] fallbacks [127.0.0.1:8080] 1
Connecting to:  127.0.0.1:8080
Server Received SYN
Client Received ACK

xref: kubernetes/kubernetes#99850

@k8s-ci-robot
Copy link
Copy Markdown

Hi @aojea. Thanks for your PR.

I'm waiting for a containerd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Mar 9, 2021

Build succeeded.

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 9, 2021

/hold
cri-o/cri-o#4639 (comment)
I have to investigate this failure in crio-o, it is likely containerd can be affected by the same problem

@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Mar 11, 2021

Build succeeded.

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 11, 2021

/retest
/hold cancel

@k8s-ci-robot
Copy link
Copy Markdown

@aojea: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

Details

In response to this:

/retest
/hold cancel

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@aojea aojea changed the title use happy-eyeballs for port-forwarding use (sort of) happy-eyeballs for port-forwarding Mar 11, 2021
@theopenlab-ci
Copy link
Copy Markdown

theopenlab-ci Bot commented Mar 11, 2021

Build succeeded.

@estesp
Copy link
Copy Markdown
Member

estesp commented Mar 11, 2021

/ok-to-test

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 11, 2021

/retest

1 similar comment
@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 15, 2021

/retest

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 18, 2021

failures seems unrelated

    default: Summarizing 2 Failures:
    default: 
    default: [Fail] [k8s.io] Security Context NamespaceOption [It] runtime should support PodPID 
    default: /root/go/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/util.go:352
    default: 
    default: [Fail] [k8s.io] Security Context NamespaceOption [It] runtime should support ContainerPID 
    default: /root/go/src/github.com/kubernetes-sigs/cri-tools/pkg/framework/util.go:352
    default: 

/assign @mikebrow

@mikebrow
Copy link
Copy Markdown
Member

yes the issue with the failure was dockerhub rate limiting...

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if negative will set to 300ms..? https://github.com/golang/go/blob/master/src/net/dial.go#L163

could you link to the code / behavior you found. I'm concerned it may still be flaky and just worked for you based on local timing of your network stack..

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 19, 2021

The negative fallback disables the dualstack algorithm that spawn goroutines golang/go#44922

https://github.com/golang/go/blob/e58fb90c753ce8ac1ccd6e26035e7ec0f4f108bc/src/net/dial.go#L59-L72

This way it dials in serial to the different resolved ips, maybe we can set deadline or.timeout too 🤔

@mikebrow
Copy link
Copy Markdown
Member

mikebrow commented Mar 19, 2021

yes.. been reading the code.. seems it will try ipv6 first for the "delay" (300ms? if set to negative or zero...) then try the fallback to ipv4... so if dual and not configured correctly for ipv6 we have a 300ms delay by default trying ipv6.. I'm thinking if it does timeout with ipv6 it will do so every time unless it's just that port ip combo...

hmm... maybe some first time logic..

@mikebrow
Copy link
Copy Markdown
Member

note: however we fix it here we should reflect that over to the _windows code

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 19, 2021

yes.. been reading the code.. seems it will try ipv6 first for the "delay" (300ms? if set to negative or zero...) then try the fallback to ipv4... so if dual and not configured correctly for ipv6 we have a 300ms delay by default trying ipv6.. I'm thinking if it does timeout with ipv6 it will do so every time unless it's just that port ip combo...

I'm not reading it that way, I think that the FallbackDelay variable is only used in dialParallel, we are setting FallbackDelay to a negative number (disable dualstack) to not use dialParallel, and use dialSerial instead . In dialSerial the timeout seems defined by the Timeout, Deadline variables or the OS/Kernel values.
https://github.com/golang/go/blob/3b0d28808df261747d7561badf91498bbb5d3e3e/src/net/dial.go#L37-L46

note: however we fix it here we should reflect that over to the _windows code

I checked the windows code and is a different situation, the "happy-eyeballs" resolution in windows depends on the wincat.exe binary and IIUIC the process is forked so it doesn't have this problem we have with Linux.

// localhost is resolved to 127.0.0.1 in ipv4, and ::1 in ipv6.
// Explicitly using ipv4 IP address in here to avoid flakiness.
cmd := []string{"wincat.exe", "127.0.0.1", fmt.Sprint(port)}
err := c.execInSandbox(ctx, id, cmd, stream, stdout, stderr)

I don't have a windows environment, but seems we need to test if wincat.exe implement happy eyeballs, if affirmative there shouldn't be any problem.

I wonder if we considered flaky the behaviour but it was always golang/go#44922 the root cause.

@mikebrow
Copy link
Copy Markdown
Member

mikebrow commented Mar 22, 2021

FallbackDelay to a negative number (disable dualstack) to not use dialParallel

You are correct, thank you for the explanation! I missed the check here.. https://github.com/golang/go/blob/3b0d28808df261747d7561badf91498bbb5d3e3e/src/net/dial.go#L417-L432

and this has limitations when running inside a namespace

could you link to the limitations you are referring to..

Are serial timeouts acceptable?
Maybe we should introduce a config for setting IPv4 vs IPv6 or dual with a certain priority list and a serial timeout? Or at least document how it can/should be done.

@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 22, 2021

could you link to the limitations you are referring to..

I documented them in the issue opened in golang golang/go#44922

Short history first, this code was "dialing" out of the namespace, when we were expecting to dial inside of the namespace

Are serial timeouts acceptable?

yes, or it should be, this is part of the "IPv6 brokeness" thing, you can read more about it here https://en.wikipedia.org/wiki/IPv6_brokenness_and_DNS_whitelisting , OSs have improved a lot the network stack since then, but you depend on the network stack ...

Maybe we should introduce a config for setting IPv4 vs IPv6 or dual with a certain priority list and a serial timeout? Or at least document how it can/should be done.

we can do the DialSerial "out" of the library and default to IPv4, is not a big change and I think that you'll be more confident with this change:


conn, err := net.Dial("tcp4", fmt.Sprintf("localhost:%d", port))
if err != nil {
   conn, err := net.Dial("tcp6", fmt.Sprintf("localhost:%d", port))
   if err != nil {
           return err

@mikebrow
Copy link
Copy Markdown
Member

mikebrow commented Mar 22, 2021

we can do the DialSerial "out" of the library and default to IPv4, is not a big change and I think that you'll be more confident with this change:
conn, err := net.Dial("tcp4", fmt.Sprintf("localhost:%d", port))
if err != nil {
conn, err := net.Dial("tcp6", fmt.Sprintf("localhost:%d", port))
if err != nil {
return err

I'd be much more confident with that ^ as the new default pattern, yes. And we could have a config for swapping that to tcp6 first, so make it a list and range through it maybe. Of course I agree timeout from one to the other is necessary, but I have a concern with switching it to 6 first and having a performance issue due to timeout and not having a config to set it back to ipv4 first. We could let them use "tcp" to mean try both in parallel with the default fast timeout..? and we could have a timeout config field.

On the other issue:

this code was "dialing" out of the namespace, when we were expecting to dial inside of the namespace

Ok, so not a current issue.

golang has enabled RFC 6555 Fast Fallback (aka HappyEyeballs)
by default in 1.12.
It means that if a host resolves to both IPv6 and IPv4,
it will try to connect to any of those addresses and use the
working connection.
However, the implementation uses go routines to start both connections in parallel,
and this has limitations when running inside a namespace, so we try to the connections
serially, trying IPv4 first for keeping the same behaviour.
xref golang/go#44922

Signed-off-by: Antonio Ojea <[email protected]>
@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 22, 2021

I think that this is the most simple change, just keeps the old behaviour and falls back to the IPv6 address.
I suggest to try to keep it simple because we have to support a very narrow scenario, that is a connection to localhost, so if we have timeout problems , that will be the less of the problems for the system where this is running

Comment thread pkg/cri/server/sandbox_portforward_linux.go
@mikebrow
Copy link
Copy Markdown
Member

I think that this is the most simple change, just keeps the old behaviour and falls back to the IPv6 address.
I suggest to try to keep it simple because we have to support a very narrow scenario, that is a connection to localhost, so if we have timeout problems , that will be the less of the problems for the system where this is running

SGTM let's use the

conn, err := net.Dial("tcp4", fmt.Sprintf("localhost:%d", port))
if err != nil {
conn, err := net.Dial("tcp6", fmt.Sprintf("localhost:%d", port))
if err != nil {
return err

nit.. on the returned err let's make sure to keep a copy of the failed tcp4 attempt and return both errors. This to avoid confusion as to why we are only trying tcp6 :-)

Copy link
Copy Markdown
Member

@dims dims left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@aojea there's one nit from @mikebrow this should be ready to merge after that.

@dims dims added this to the 1.5 milestone Mar 24, 2021
@aojea
Copy link
Copy Markdown
Contributor Author

aojea commented Mar 24, 2021

nit.. on the returned err let's make sure to keep a copy of the failed tcp4 attempt and return both errors. This to avoid confusion as to why we are only trying tcp6 :-)

it keeps both errors in the implementation, I created a new error variable to not shadow the original error

	return fmt.Errorf("failed to connect to localhost:%d inside namespace %q, IPv4: %v IPv6 %v ", port, id, err, errV6)

Copy link
Copy Markdown
Member

@mikebrow mikebrow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@mikebrow mikebrow merged commit 1b05b60 into containerd:master Mar 26, 2021
BenTheElder added a commit to BenTheElder/kind that referenced this pull request Mar 27, 2021
BenTheElder added a commit to BenTheElder/kind that referenced this pull request Mar 27, 2021
BenTheElder added a commit to BenTheElder/kind that referenced this pull request Jun 22, 2021
maelvls pushed a commit to maelvls/kind that referenced this pull request Jul 1, 2021
day0ops added a commit to day0ops/terraform-bootstrap-dual-ipv6-k8-clusters that referenced this pull request Sep 13, 2021
coutinhop pushed a commit to coutinhop/kind that referenced this pull request Aug 18, 2022
adobley pushed a commit to adobley/tanzu-framework that referenced this pull request Oct 26, 2022
all current versions of tanzu are using a new-enough version of
containerd (v1.5.0 or greater) that we can remove this workaround

the issue this addressed was fixed in
containerd/containerd#5145

Signed-off-by: Aidan Obley <[email protected]>
Co-authored-by: Christian Ang <[email protected]>
adobley pushed a commit to adobley/tanzu-framework that referenced this pull request Nov 1, 2022
all current versions of tanzu are using a new-enough version of
containerd (v1.5.0 or greater) that we can remove this workaround

the issue this addressed was fixed in
containerd/containerd#5145

Signed-off-by: Aidan Obley <[email protected]>
Co-authored-by: Christian Ang <[email protected]>
christianang added a commit to vmware-tanzu/tanzu-framework that referenced this pull request Nov 3, 2022
all current versions of tanzu are using a new-enough version of
containerd (v1.5.0 or greater) that we can remove this workaround

the issue this addressed was fixed in
containerd/containerd#5145

Signed-off-by: Aidan Obley <[email protected]>
Co-authored-by: Christian Ang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants