canonicalize: Only log errors at the WARN level when falling back #174
Merged
Signed-off-by: Eliza Weisman <[email protected]>
I also considered changing this code so that it does not log the same error multiple times when it occurs repeatedly, to reduce the log spam seen in linkerd/linkerd2#2069. However, that would have required adding enough additional state to the canonicalize task that I wasn't sure it was worth it.
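As a rough illustration of the extra state such deduplication would need, here is a minimal sketch. `ErrorDedup` and its methods are hypothetical names invented for this example and are not part of linkerd2-proxy. It assumes errors can be compared by their rendered message.

```rust
/// Hypothetical state for suppressing repeated identical errors.
/// Not part of the proxy; shown only to illustrate the added state.
#[derive(Debug, Default)]
struct ErrorDedup {
    /// The message of the most recently logged error, if any.
    last: Option<String>,
}

impl ErrorDedup {
    /// Returns `true` if `err` differs from the last recorded error,
    /// meaning the caller should log it. Records `err` in that case.
    fn should_log(&mut self, err: &str) -> bool {
        if self.last.as_deref() == Some(err) {
            return false;
        }
        self.last = Some(err.to_owned());
        true
    }

    /// Forgets the remembered error, for example after a successful
    /// resolution, so that the next failure is logged again.
    fn reset(&mut self) {
        self.last = None;
    }
}
```

A task using this would call `should_log` before emitting each error and `reset` on success, which is the kind of bookkeeping the comment above weighs against its benefit.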
olix0r
approved these changes
Jan 18, 2019
hawkw
added a commit
to linkerd/linkerd2
that referenced
this pull request
Jan 24, 2019
proxy: update pinned version to 5b507a9

This picks up the following proxy commits:

* eaabc48 Update tower-grpc
* e9561de Update h2 to 0.1.16
* 28fd5e7 Add Route timeouts (linkerd/linkerd2-proxy#165)
* 5637372 Re-flag tcp_duration tests as flaky
* 20cbd18 Revise several log levels and messages (linkerd/linkerd2-proxy#177)
* ae16978 Remove flakiness from 'profiles' tests
* 49c29cd canonicalize: Only log errors at the WARN level when falling back (linkerd/linkerd2-proxy#174)
* 486dd13 Make outbound router honor `l5d-dst-override` header (linkerd/linkerd2-proxy#173)
* 7adc50d Make timeouts for canonicalization DNS queries tuneable (linkerd/linkerd2-proxy#175)
* 3188179 Try reducing CI flakiness by reducing RUST_TEST_THREADS to 1

Some of these changes will probably need changelog entries:

* Improve logging when rejecting malformed HTTP/2 pseudo-headers (hyperium/h2#347)
* Improve logging for gRPC errors (tower-rs/tower-grpc#111)
* Add Route timeouts (linkerd/linkerd2-proxy#165)
* Downgrade several of the noisiest log messages to TRACE (linkerd/linkerd2-proxy#177)
* Add an environment variable for configuring the DNS canonicalization timeout (linkerd/linkerd2-proxy#175)
* Make outbound router honor `l5d-dst-override` header (linkerd/linkerd2-proxy#173)

Perhaps all the logging-related changes can be grouped into one changelog entry, though...

Signed-off-by: Eliza Weisman <[email protected]>
sprt
pushed a commit
to sprt/linkerd2-proxy
that referenced
this pull request
Aug 30, 2019
…#174)

We added basic prometheus instrumentation, but this only encapsulated basic go metrics and request counts. This adds latency and response size metrics exporting as well, to the public-api server, the web server and the telemetry server.

Since the util function in grpc.go was basically used to wrap the server creation in a prometheus handler, I added the other prometheus constants in there and renamed the file to prometheus.go.

- Add request duration and response size instrumentation to web and public api
- Also add latency monitoring to telemetry service requests
- Rename util/grpc.go to util/prometheus.go
Previously, whenever the `canonicalize::Task` future encountered an
error, it logged that error at the error level. However, in many cases,
these errors are transient, and we are able to successfully fall back to
a previously canonicalized name.
This branch changes `canonicalize::Task` to only log at the error level
when there's no previously successful result to fall back to, and log at
the warning level otherwise. In addition, the log message on fallbacks
now indicates that we fell back to a previous canonicalization.
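
The level selection described above can be sketched roughly as follows. This is not the actual code from this branch: `Level`, `describe_failure`, and the message wording are hypothetical, and the sketch returns the chosen level and message instead of calling a logging macro so that it stays free of external crates.

```rust
use std::fmt::Display;

/// Severity chosen for a failed canonicalization attempt (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Warn,
    Error,
}

/// Hypothetical helper: picks a level and message for a resolution error,
/// depending on whether a previously canonicalized name is available.
fn describe_failure<E: Display>(
    name: &str,
    err: &E,
    previous: Option<&str>,
) -> (Level, String) {
    match previous {
        // A previous result exists, so the error is survivable: warn and
        // say which canonicalization we are falling back to.
        Some(prev) => (
            Level::Warn,
            format!(
                "failed to refine {}: {}; falling back to previous canonicalization {}",
                name, err, prev
            ),
        ),
        // Nothing to fall back to, so this stays at the error level.
        None => (
            Level::Error,
            format!(
                "failed to refine {}: {}; no previous canonicalization to fall back to",
                name, err
            ),
        ),
    }
}
```

A caller would pass `Some(previous_name)` when an earlier resolution succeeded and `None` otherwise, then emit the returned message at the returned level.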
Hopefully, this should make transient errors, such as a slow DNS server,
a little less scary.
Fixes linkerd/linkerd2#2094
Signed-off-by: Eliza Weisman [email protected]