client: prevent idle connections leaking FDs #48736
Conversation
I'll look into adding some tests for this too.
```go
transport.MaxIdleConns = 6
transport.IdleConnTimeout = 30 * time.Second
```
Perhaps worth adding a comment linking to the issue here.
Curious: would setting these on the transport we pass to ConfigureTransport work? I don't think it overwrites properties it doesn't set. But perhaps it would be more future-proof to consider these defaults we set (which can then be overwritten by ConfigureTransport), i.e.:
```go
transport := &http.Transport{
	MaxIdleConns:    6,
	IdleConnTimeout: 30 * time.Second,
}
err := sockets.ConfigureTransport(transport, hostURL.Scheme, hostURL.Host)
```
> Perhaps worth adding a comment linking to the issue here.

Will do!
> Curious: would setting these on the transport we pass to ConfigureTransport work?
It would work, sure, and I'm okay with doing that. I'd rather not bikeshed too much on this change, though; I want to take a proper look at go-connections afterwards and see if we can make some changes there, and then we can revisit this. This was just meant as a quick port of tonistiigi's patch to address the immediate issue.
```go
transport.MaxIdleConns = 6
transport.IdleConnTimeout = 30 * time.Second
```
Do we want these configurable via client opts like WithMaxIdleConns / WithIdleConnTimeout, so they would be applied to the client transport like we do for WithTLSClientConfig?
Maybe! I'll ping people for thoughts.
Regardless though, IMO this change is about having better defaults – we can discuss/introduce options separately, unless this change breaks any existing clients as-is.
Force-pushed ccf7889 to 5c64770.
Patch from tonistiigi@af6ada9

Without this change, if a long-lived process uses the client and creates connections, these connections are not released and grow over time. We can also look into addressing this issue from the server side, but it doesn't hurt for the `client` package to have good defaults and not cause this.

Signed-off-by: Laura Brehm <[email protected]>
Force-pushed 5c64770 to 5c72a95.
```go
transport.MaxIdleConns = 6
transport.IdleConnTimeout = 30 * time.Second
```
These values seem fairly low. Maybe we could keep them the same as the defaults in http.DefaultTransport?
```go
var DefaultTransport RoundTripper = &Transport{
	// ...
	MaxIdleConns:    100,
	IdleConnTimeout: 90 * time.Second,
	// ...
}
```
I feel like 100 idle connections is a lot, especially when those are all open FDs, but I'm open to changing it. Thoughts? @thaJeztah @corhere @crazy-max ?
Maybe we don't need as many as 100, but 6 feels a bit too conservative IMO 😅 These are still subject to the IdleConnTimeout, so I think we can afford to go a little bit higher than that.
I guess for regular use in the CLI it wouldn't make a huge difference, but possibly more so if the client is used in a long-lived process.
Looking at the defaults, MaxIdleConns is unlimited: https://github.com/golang/go/blob/2ef8e41f9543478a51a0147a735e4415737de09f/src/net/http/transport.go#L205-L207
```go
// MaxIdleConns controls the maximum number of idle (keep-alive)
// connections across all hosts. Zero means no limit.
MaxIdleConns int
```

But MaxIdleConnsPerHost defaults to 2 (DefaultMaxIdleConnsPerHost):
```go
// MaxIdleConnsPerHost, if non-zero, controls the maximum idle
// (keep-alive) connections to keep per-host. If zero,
// DefaultMaxIdleConnsPerHost is used.
MaxIdleConnsPerHost int

// DefaultMaxIdleConnsPerHost is the default value of [Transport]'s
// MaxIdleConnsPerHost.
const DefaultMaxIdleConnsPerHost = 2
```

Which would mean that with the current settings, the client would have to have idle connections for 3 different hosts before this limit kicks in. Raising MaxIdleConns to 100 _without_ raising MaxIdleConnsPerHost means that the client would need to have idle connections for 50 different hosts before it takes effect. That may be very unlikely (as hosts here would be "Engine APIs" or perhaps "registry to authenticate").
Also did a check for prior art in our codebase; here are some defaults set by SDKs we use:
```
git grep -E '(MaxIdleConns|IdleConnTimeout)[^!]*[:=]'
plugin/registry.go: IdleConnTimeout: 30 * time.Second,
vendor/cloud.google.com/go/compute/metadata/CHANGES.md:* **compute/metadata:** Set IdleConnTimeout for http.Client ([#7084](https://github.com/googleapis/google-cloud-go/issues/7084)) ([766516a](https://github.com/googleapis/google-cloud-go/commit/766516aaf3816bfb3159efeea65aa3d1d205a3e2)), refs [#5430](https://github.com/googleapis/google-cloud-go/issues/5430)
vendor/cloud.google.com/go/compute/metadata/metadata.go: IdleConnTimeout: 60 * time.Second,
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: DefaultHTTPTransportMaxIdleConns = 100
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: DefaultHTTPTransportMaxIdleConnsPerHost = 10
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: DefaultHTTPTransportIdleConnTimeout = 90 * time.Second
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: MaxIdleConns: DefaultHTTPTransportMaxIdleConns,
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: MaxIdleConnsPerHost: DefaultHTTPTransportMaxIdleConnsPerHost,
vendor/github.com/aws/aws-sdk-go-v2/aws/transport/http/client.go: IdleConnTimeout: DefaultHTTPTransportIdleConnTimeout,
vendor/github.com/moby/buildkit/util/resolver/resolver.go: MaxIdleConns: 30,
vendor/github.com/moby/buildkit/util/resolver/resolver.go: IdleConnTimeout: 120 * time.Second,
vendor/github.com/moby/buildkit/util/resolver/resolver.go: MaxIdleConnsPerHost: 4,
vendor/go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp/client.go: MaxIdleConns: 100,
vendor/go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp/client.go: IdleConnTimeout: 90 * time.Second,
vendor/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp/client.go: MaxIdleConns: 100,
vendor/go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp/client.go: IdleConnTimeout: 90 * time.Second,
vendor/google.golang.org/api/internal/creds.go: MaxIdleConns: 100,
vendor/google.golang.org/api/internal/creds.go: MaxIdleConnsPerHost: 100,
vendor/google.golang.org/api/internal/creds.go: IdleConnTimeout: 90 * time.Second,
vendor/google.golang.org/api/transport/http/dial.go: trans.MaxIdleConnsPerHost = 100
vendor/google.golang.org/api/transport/http/dial.go: MaxIdleConns: 100,
vendor/google.golang.org/api/transport/http/dial.go: MaxIdleConnsPerHost: 100,
vendor/google.golang.org/api/transport/http/dial.go: IdleConnTimeout: 90 * time.Second,
```
Hmm right, I missed that MaxIdleConnsPerHost defaults to 2.
Patch from tonistiigi@af6ada9
- What I did
Without this change, if a long-lived process is using the client, idle connections are not released and grow over time.
We can also look into addressing this issue from the server side, but it doesn't hurt for the `client` package to have good defaults and not cause this.

- How I did it

Set `MaxIdleConns` and `IdleConnTimeout` on the `defaultHTTPClient`.

- How to verify it
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)