Feature Request
What problem are you trying to solve?
Currently, the timeout for DNS queries to canonicalize a name is hard-coded to 100ms in the proxy:
https://github.com/linkerd/linkerd2-proxy/blob/08b2a23ca86571ea1fe8e3579e5dbb8b1c0df4e3/src/proxy/canonicalize.rs#L77
In some cases (see #2069), a DNS server might respond slowly enough that a majority of these queries time out. This results in a lot of error messages being logged and, if the DNS server is consistently very slow, a delay or complete failure to pick up changes in DNS.
How should the problem be solved?
We should probably make this tunable. We can add a LINKERD2_PROXY_DNS_CANONICALIZE_TIMEOUT environment variable (or similar).
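As a rough illustration of what reading such a variable could look like, here is a minimal sketch. It assumes the value is written as a whole number with an `ms` or `s` suffix and that the current 100ms stays as the fallback. The helper names (`parse_duration`, `canonicalize_timeout`, `canonicalize_timeout_from_env`) and the accepted format are hypothetical and are not taken from the proxy's existing config code.

```rust
use std::time::Duration;

/// Fallback matching the value currently hard-coded in the proxy.
const DEFAULT_CANONICALIZE_TIMEOUT: Duration = Duration::from_millis(100);

/// Hypothetical helper: parses values such as "250ms" or "2s".
/// The accepted format is an assumption made for this sketch.
fn parse_duration(raw: &str) -> Option<Duration> {
    let raw = raw.trim();
    // Check "ms" before "s", since "250ms" also ends in "s".
    if let Some(ms) = raw.strip_suffix("ms") {
        ms.trim().parse::<u64>().ok().map(Duration::from_millis)
    } else if let Some(secs) = raw.strip_suffix('s') {
        secs.trim().parse::<u64>().ok().map(Duration::from_secs)
    } else {
        None
    }
}

/// Resolves the timeout from an optional raw value, falling back to the
/// default when the value is missing or malformed.
fn canonicalize_timeout(raw: Option<&str>) -> Duration {
    raw.and_then(parse_duration)
        .unwrap_or(DEFAULT_CANONICALIZE_TIMEOUT)
}

/// Reads the proposed environment variable from the process environment.
fn canonicalize_timeout_from_env() -> Duration {
    let raw = std::env::var("LINKERD2_PROXY_DNS_CANONICALIZE_TIMEOUT").ok();
    canonicalize_timeout(raw.as_deref())
}
```

Falling back to the default on a malformed value keeps the proxy running with today's behavior. A real implementation might prefer to log the bad value or refuse to start.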
Potential drawback: it's another knob to turn.
Any alternatives you've considered?
We could also just bump up the default timeout. IMO, this is not a great idea because it seems like every time we bump up a timeout, someone shows up with a situation where the timeout is too short. Making it configurable gives users the power to fix their own issues without a code change.
Alternatively, we could do nothing. In most cases, presumably some query will eventually finish within the timeout. The cost would be a delay in responding to updates, and a bunch of scary log lines.
How would users interact with this feature?
Probably an inject/install-time flag to set the timeout.