Consider making canonicalize DNS timeout tuneable #2093

@hawkw

Feature Request

What problem are you trying to solve?

Currently, the timeout for DNS queries to canonicalize a name is hard-coded to 100ms in the proxy:
https://github.com/linkerd/linkerd2-proxy/blob/08b2a23ca86571ea1fe8e3579e5dbb8b1c0df4e3/src/proxy/canonicalize.rs#L77

In some cases (see #2069), a DNS server might respond slowly enough that a majority of these queries time out. This results in a lot of error messages being logged and, if the DNS server is consistently very slow, a delay or complete failure to pick up changes in DNS.

How should the problem be solved?

We should probably make this tuneable. We can add a LINKERD2_PROXY_DNS_CANONICALIZE_TIMEOUT environment variable (or similar).
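As a rough sketch of what the proxy side might look like: this is not the actual linkerd2-proxy config code. The `"100ms"`/`"2s"` duration format and the helper names `parse_duration`, `canonicalize_timeout`, and `timeout_from_env` are assumptions for illustration; only the variable name and the 100ms default come from this issue.

```rust
use std::time::Duration;

/// The currently hard-coded value, kept as the default.
const DEFAULT_DNS_CANONICALIZE_TIMEOUT: Duration = Duration::from_millis(100);

/// Variable name as proposed in this issue (not an existing setting).
const ENV_DNS_CANONICALIZE_TIMEOUT: &str = "LINKERD2_PROXY_DNS_CANONICALIZE_TIMEOUT";

/// Parses strings such as "100ms" or "2s" into a `Duration`.
/// The accepted format is an assumption made for this sketch.
fn parse_duration(s: &str) -> Option<Duration> {
    let s = s.trim();
    // Check "ms" before "s", since "ms" also ends in 's'.
    if let Some(n) = s.strip_suffix("ms") {
        n.trim().parse::<u64>().ok().map(Duration::from_millis)
    } else if let Some(n) = s.strip_suffix('s') {
        n.trim().parse::<u64>().ok().map(Duration::from_secs)
    } else {
        None
    }
}

/// Resolves the timeout from an optional raw value, falling back to the
/// default when the value is unset or cannot be parsed.
fn canonicalize_timeout(raw: Option<&str>) -> Duration {
    raw.and_then(parse_duration)
        .unwrap_or(DEFAULT_DNS_CANONICALIZE_TIMEOUT)
}

/// Reads the proposed environment variable and resolves the timeout.
fn timeout_from_env() -> Duration {
    let raw = std::env::var(ENV_DNS_CANONICALIZE_TIMEOUT).ok();
    canonicalize_timeout(raw.as_deref())
}
```

Falling back to the default on an unparseable value keeps the proxy running with today's behavior. Logging a warning in that case would probably be friendlier than failing silently.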

Potential drawback: it's another knob to turn.

Any alternatives you've considered?

We could also just bump up the default timeout. IMO, this is not a great idea because it seems like every time we bump up a timeout, someone shows up with a situation where the timeout is too short. Making it configurable gives users the power to fix their own issues without a code change.

Alternatively, we could do nothing. In most cases, presumably some query will eventually finish within the timeout. The cost would be a delay in responding to updates, and a bunch of scary log lines.

How would users interact with this feature?

Probably an inject/install-time flag to set the timeout.
