Client
I believe this affects all clients, but I am primarily testing with bigtable and have personally verified that spanner is also impacted.
Environment
Kubernetes, including GKE
Go Environment
$ go version
go version go1.20.3 linux/amd64
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN="/workspace/gobin"
GOCACHE="/builder/home/.cache/go-build"
GOENV="/builder/home/.config/go/env"
GOEXE=""
GOEXPERIMENT=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/go/pkg/mod"
GONOPROXY="github.com/lytics"
GONOSUMDB="github.com/lytics"
GOOS="linux"
GOPATH="/go"
GOPRIVATE="github.com/<redacted>"
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.20.3"
GCCGO="gccgo"
GOAMD64="v1"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="0"
GOMOD="/workspace/go.mod"
GOWORK=""
CGO_CFLAGS="-O2 -g"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-O2 -g"
CGO_FFLAGS="-O2 -g"
CGO_LDFLAGS="-O2 -g"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build2036783142=/tmp/go-build -gno-record-gcc-switches"
Code
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"cloud.google.com/go/bigtable"
)

func main() {
	ctx := context.Background()

	client, err := bigtable.NewClient(ctx, "project", "instance")
	if err != nil {
		log.Fatalf("creating client: %v", err)
	}
	tbl := client.Open("table")

	// Issue a trivial RPC periodically so the connections stay active and the
	// DNS traffic is easy to observe with tcpdump.
	for {
		if _, err := tbl.SampleRowKeys(ctx); err != nil {
			log.Fatalf("sampling: %v", err)
		}
		fmt.Println("sampled")
		time.Sleep(10 * time.Second)
	}
}
Expected behavior
A couple of DNS requests are made to initiate connections.
Actual behavior
About 70 extra DNS requests are made, and the lookups are refreshed regularly. The number grows if option.WithGRPCConnectionPool is set higher than the default of four.
Screenshots
CPU usage of the node-local-dns daemonset before and after the mitigations described below were put in place, just before Apr 20:

Additional context
In GKE with the default DNS policy (ndots:5, five search suffixes) and the default pool of four connections, initializing the code above causes 75 DNS requests, only five of which are useful, and most of those are duplicated.
Counts of lookups during initialization, from tcpdumping port 53 in a sidecar container:
4 SRV? _grpclb._tcp.bigtable.googleapis.com.svc.cluster.local.
4 SRV? _grpclb._tcp.bigtable.googleapis.com.google.internal.
4 SRV? _grpclb._tcp.bigtable.googleapis.com.default.svc.cluster.local.
4 SRV? _grpclb._tcp.bigtable.googleapis.com.c.lyticsstaging.internal.
4 SRV? _grpclb._tcp.bigtable.googleapis.com.cluster.local.
4 SRV? _grpclb._tcp.bigtable.googleapis.com.
4 A? bigtable.googleapis.com.svc.cluster.local.
4 A? bigtable.googleapis.com.google.internal.
4 A? bigtable.googleapis.com.default.svc.cluster.local.
4 A? bigtable.googleapis.com.c.lyticsstaging.internal.
4 A? bigtable.googleapis.com.cluster.local.
4 A? bigtable.googleapis.com. // this is 4x larger than it should be
4 AAAA? bigtable.googleapis.com.svc.cluster.local.
4 AAAA? bigtable.googleapis.com.google.internal.
4 AAAA? bigtable.googleapis.com.default.svc.cluster.local.
4 AAAA? bigtable.googleapis.com.c.lyticsstaging.internal.
4 AAAA? bigtable.googleapis.com.cluster.local.
4 AAAA? bigtable.googleapis.com. // this is 4x larger than it should be
1 PTR? 10.16.0.10.in-addr.arpa. // this is fine
1 A? metadata.google.internal. // this is fine
1 AAAA? metadata.google.internal. // this is fine
All SRV lookups, at least for bigtable, result in NXDOMAIN 100% of the time (and as such are only cached by node-local-dns for two seconds).
All DNS requests to .local or .internal domains are pointless; they are caused by the default ndots:5 and the search suffix list, and they also result in NXDOMAIN.
There are 4x more valid DNS requests than necessary due to the default pool size; increasing the pool size beyond four increases the DNS requests further.
NXDOMAIN responses are generally not cached for long, so with large pool sizes and large numbers of pods the traffic generated can be immense, causing timeouts talking to kube-dns or high CPU usage in node-local-dns.
These queries are also repeated once per hour by default, though as mentioned below we were seeing re-resolution happen much more aggressively.
In our production environment, this was causing node-local-dns to have high CPU usage (20-50 cores just for node-local-dns) and to log millions of timeouts talking to kube-dns of the form [ERROR] plugin/errors: 2 <hostname> A: read tcp <node IP:port>-><kubedns IP>:53: i/o timeout, as documented here, but happening with modern GKE.
The factors at play appear to be:
- The local resolver cache isn't shared between connections, causing the multiplication by the size of your connection pool (four by default, but ours in production are much larger than that)
- grpclb features are not disabled, causing SRV lookups that are never going to return anything
- The hostnames of some or all of the API endpoints lack trailing periods to make them fully qualified, so Kubernetes' default ndots:5 config causes the search suffixes to be appended
- Additionally, it looks like Go's LookupSRV tries the search suffixes even after receiving an NXDOMAIN and even if ndots is set to something sane like 1, so mitigating this outside of code changes is difficult (see the sketch after this list)
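For reference, a minimal sketch that reproduces just the SRV portion of this traffic, assuming the pure-Go resolver is in use (CGO_ENABLED=0, as in the environment above); run it while tcpdumping port 53 in a sidecar to watch the query get expanded with each search suffix:

package main

import (
	"fmt"
	"net"
)

func main() {
	// This mirrors the _grpclb._tcp SRV lookup the grpclb balancer performs.
	// With the default GKE resolv.conf (ndots:5, five search suffixes), tcpdump
	// on port 53 shows the query repeated with every suffix, each answered
	// with NXDOMAIN.
	_, srvs, err := net.LookupSRV("grpclb", "tcp", "bigtable.googleapis.com")
	fmt.Println(len(srvs), err)
}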
As for mitigation, you can:
- Set ndots:1 (for example via the pod's dnsConfig), which drops the non-SRV search-suffix lookups
- Set the undocumented GOOGLE_CLOUD_DISABLE_DIRECT_PATH environment variable, which causes a different resolver to be used that doesn't attempt grpclb lookups. This seems like a hack, though (a sketch of applying it follows).
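For illustration, a minimal sketch of the second mitigation applied in code. The variable is undocumented, so the value "true" is an assumption (the library may only check that it is non-empty); the ndots:1 change has to happen at the pod level and can't be done from Go:

package main

import (
	"context"
	"log"
	"os"

	"cloud.google.com/go/bigtable"
)

func main() {
	// Hack: disabling DirectPath switches to a resolver that does not attempt
	// the _grpclb._tcp SRV lookups. "true" is an assumed value; any non-empty
	// value may be what the library actually checks for.
	os.Setenv("GOOGLE_CLOUD_DISABLE_DIRECT_PATH", "true")

	ctx := context.Background()
	client, err := bigtable.NewClient(ctx, "project", "instance")
	if err != nil {
		log.Fatalf("creating client: %v", err)
	}
	defer client.Close()
}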
With both of these set, DNS requests are still 2 * option.WithGRPCConnectionPool due to the A and AAAA records each being queried per connection.
With our large connection pools in our production environment, we also saw constant re-resolution of all of these names on a rather rapid cadence, causing up to hundreds or thousands of DNS requests per second per pod, indefinitely.
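Because the remaining A/AAAA traffic scales linearly with the pool size, one more interim knob is to set the pool size explicitly; a minimal sketch, with a pool of one chosen purely as an example (this trades away connection-level parallelism):

package main

import (
	"context"
	"log"

	"cloud.google.com/go/bigtable"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()
	// A smaller pool means fewer gRPC connections and therefore fewer
	// per-connection A/AAAA lookups, at the cost of throughput.
	client, err := bigtable.NewClient(ctx, "project", "instance",
		option.WithGRPCConnectionPool(1))
	if err != nil {
		log.Fatalf("creating client: %v", err)
	}
	defer client.Close()
}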
Proposed changes to solve this properly:
- Share a global resolver cache process-wide so DNS lookups happen once per process, not once per connection
- Disable grpclb features for services that do not implement them
- Change hostnames to be fully qualified, with a trailing dot, to avoid search-suffix amplification (a rough caller-side approximation is sketched below)
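Until the hostnames are fully qualified inside the libraries, a rough caller-side approximation might be to override the endpoint with a rooted name. This is untested and assumes option.WithEndpoint passes the string through to the gRPC DNS resolver unchanged and that a trailing dot is accepted:

package main

import (
	"context"
	"log"

	"cloud.google.com/go/bigtable"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()
	// The trailing dot makes the name rooted, so the resolver should not
	// append the Kubernetes search suffixes (assumption: the endpoint string
	// reaches the DNS resolver unmodified).
	client, err := bigtable.NewClient(ctx, "project", "instance",
		option.WithEndpoint("bigtable.googleapis.com.:443"))
	if err != nil {
		log.Fatalf("creating client: %v", err)
	}
	defer client.Close()
}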
Please let me know if a more thorough example of reproduction and measurement is desired.