Skip to content

Update trust-dns-resolver to fix UDP socket leak#169

Merged
hawkw merged 1 commit intomasterfrom
eliza/update-resolver
Jan 10, 2019
Merged

Update trust-dns-resolver to fix UDP socket leak#169
hawkw merged 1 commit intomasterfrom
eliza/update-resolver

Conversation

@hawkw
Copy link
Contributor

@hawkw hawkw commented Jan 5, 2019

An upstream bug in the trust-dns-proto library can cause
trust-dns-resolver to leak UDP sockets when DNS queries time out. This
issue appears to be the cause of the memory leak described in
linkerd/linkerd2#2012.

This branch updates the trust-dns dependency to pick up the change in
bluejekyll/trust-dns#635, which fixes the UDP socket leak.

I confirmed that the socket leak was fixed by modifying the proxy to
hard-code a 0-second DNS timeout, sending requests to the proxy's
outbound listener, and using

lsof -p $(pgrep linkerd2-proxy)

to count the number of open UDP sockets. On master, every request to a
different DNS name that times out leaves behind an additional open UDP
socket, which show up in lsof, while on this branch, only TCP sockets
remain open after the request ends.

In addition, I'm running a test in GCP to watch the memory and file
descriptor use of the proxy over a long period of time. This is still in
progress, but given the above, I strongly believe this branch fixes the
leak.

Fixes linkerd/linkerd2#2012.

Signed-off-by: Eliza Weisman [email protected]

This should pick up the change in bluejekyll/trust-dns#635, which fixes a
UDP socket leak (bluejekyll/trust-dns#633). I suspect this is likely the
cause of linkerd/linkerd2#2012.

Signed-off-by: Eliza Weisman <[email protected]>
@hawkw hawkw added the bug Something isn't working label Jan 5, 2019
@hawkw hawkw self-assigned this Jan 5, 2019
@hawkw hawkw requested review from olix0r and seanmonstar January 5, 2019 00:15

[[package]]
name = "rand"
version = "0.6.3"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rand 0.4, 0.5, and 0.6... 😭

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh boy, I hadn't noticed. It would be nice if we could do something about that...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the rand 0.4 dependency is pulled in by tempdir, which is a dependency of prost-build. The tempdir crate is deprecated in favour of tempfile, but we use it in our tests as well (to pull in fewer dependencies).

I'll look at a patch to prost-build replacing tempdir with tempfile, and hopefully we can get rid of rand 0.4

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@seanmonstar if we merge both this branch and #170, we will depend exclusively on the latest rand (0.6.3).

Sorry it took a bit, the dependency update was a surprisingly large amount of work.

@hawkw hawkw merged commit 08b2a23 into master Jan 10, 2019
hawkw added a commit to linkerd/linkerd2 that referenced this pull request Jan 10, 2019
proxy: bump version to fix memory leak

- Update to trust-dns-resolver 0.10.1 (linkerd/linkerd2-proxy#169)
hawkw added a commit to linkerd/linkerd2 that referenced this pull request Jan 10, 2019
- Update to trust-dns-resolver 0.10.1 (linkerd/linkerd2-proxy#169)

Signed-off-by: Eliza Weisman <[email protected]>
hawkw added a commit that referenced this pull request Feb 1, 2019
This branch updates the proxy's dependencies to eliminate multiple
incompatible versions of the `rand` crate. 

Previously we depended on both `rand` 0.4.3 and 0.5.1. After this
branch, we depend on `rand` 0.6.3 (the latest) and 0.5.1. However, PR
#169 will also update `trust-dns-resolver` (the only crate depending on
`rand` 0.5.1) to a version that depends on `rand` 0.6.3, so once both
this branch and #169 are merged, we will depend only on `rand` 0.6.3.

Note that this is a fairly large change to `Cargo.toml` --- it was
necessary to update many of the proxy's dependencies in order to
consolidate on one `rand` version. Additionally, I had to push branches
of some of those dependencies in order to update their `rand`
dependency, so it's currently necessary to patch some of the proxy's
dependencies. When those branches merge upstream, the patches can be
removed.

Signed-off-by: Eliza Weisman <[email protected]>
@olix0r olix0r deleted the eliza/update-resolver branch August 17, 2019 01:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Possible Memory Leak in Proxy Sidecar

2 participants