Skip to content

core: fix missing spans in serve tasks#1243

Merged
olix0r merged 3 commits intomainfrom
eliza/fix-accept-span
Sep 4, 2021
Merged

core: fix missing spans in serve tasks#1243
olix0r merged 3 commits intomainfrom
eliza/fix-accept-span

Conversation

@hawkw
Copy link
Contributor

@hawkw hawkw commented Sep 4, 2021

Currently, the serve function spawns tasks instrumented with a
tracing span for each accepted connection. This span is at the DEBUG
level, so it's disabled with the default tracing configuration. This is
fine, as the per-connection context is relatively verbose and is mainly
useful for debugging.

However, we also rely on this span for propagating the INFO-level
spans which indicate which part of the proxy an event occurred in
(inbound, outbound, etc). When the DEBUG span is enabled, it will be a
child of these spans, so they are propagated to the spawned tasks.
However, when the DEBUG span is not enabled, nothing propagates the
INFO spans. Since the default tracing configuration enables INFO
but not DEBUG, we want those spans to be propagated to the tasks
spawned in serve.

This commit fixes the missing spans by moving the spawn inside of the
Span::in_scope call, and using in_current_span rather than
instrument. Now, if the per-connection DEBUG span is enabled, it
will be the current span...but if it isn't, the INFO span will still
be current, so the task will still have the INFO span as part of its
span context regardless.

Alternatively, we could have fixed this by changing the instrument()
call to:

    .instrument(span)
    .in_current_span()

so that the task is always spawned in both the DEBUG span and the
current span. However, this is a bit less efficient, as it wraps the
tasks in both spans even when the INFO span is not needed, so every
time the task is polled, we would enter both spans.

Currently, the `serve` function spawns tasks `instrument`ed with a
`tracing` span for each accepted connection. This span is at the `DEBUG`
level, so it's disabled with the default tracing configuration. This is
fine, as the per-connection context is relatively verbose and is mainly
useful for debugging.

However, we also rely on this span for propagating the `INFO`-level
spans which indicate which part of the proxy an event occurred in
(inbound, outbound, etc). When the `DEBUG` span is enabled, it will be a
child of these spans, so they are propagated to the spawned tasks.
However, when the `DEBUG` span is _not_ enabled, nothing propagates the
`INFO` spans. Since the default `tracing` configuration enables `INFO`
but not `DEBUG`, we want those spans to be propagated to the tasks
spawned in `serve`.

This commit fixes the missing spans by moving the spawn inside of the
`Span::in_scope` call, and using `in_current_span` rather than
`instrument`. Now, if the per-connection `DEBUG` span is enabled, it
will be the current span...but if it isn't, the `INFO` span will still
be current, so the task will still have the `INFO` span as part of its
span context regardless.

Alternatively, we could have fixed this by changing the `instrument()`
call to:
```rust
    .instrument(span)
    .in_current_span()
```
so that the task is always spawned in both the `DEBUG` span _and_ the
current span. However, this is a bit less efficient, as it wraps the
tasks in both spans even when the `INFO` span is not needed, so every
time the task is polled, we would enter both spans.
@hawkw hawkw requested a review from olix0r September 4, 2021 00:06
@hawkw hawkw self-assigned this Sep 4, 2021
@hawkw hawkw requested a review from a team September 4, 2021 00:06
@olix0r olix0r merged commit 0f3328a into main Sep 4, 2021
@olix0r olix0r deleted the eliza/fix-accept-span branch September 4, 2021 01:37
@hawkw
Copy link
Contributor Author

hawkw commented Sep 4, 2021

upstream PR for a small API addition that should help avoid this in the future: tokio-rs/tracing#1538

olix0r added a commit to linkerd/linkerd2 that referenced this pull request Sep 9, 2021
This release improves error handling so that HTTP metrics include 5XX
responses for common errors.

Logging has also been improved to ensure `inbound` and `outbound` spans
are always present in log messages.

Outbound tap has been fixed to include route labels when service
profiles are configured.

---

* Set tracing spans on policy client (linkerd/linkerd2-proxy#1241)
* build(deps): bump tokio-util from 0.6.7 to 0.6.8 (linkerd/linkerd2-proxy#1240)
* core: fix missing spans in `serve` tasks (linkerd/linkerd2-proxy#1243)
* build(deps): bump thiserror from 1.0.28 to 1.0.29 (linkerd/linkerd2-proxy#1244)
* orig-proto: Avoid emiting HTTP/2 errors for upgraded requests (linkerd/linkerd2-proxy#1245)
* Fix route labels on outbound tap metadata (linkerd/linkerd2-proxy#1247)
* errors: Support contextual error handling strategies (linkerd/linkerd2-proxy#1246)
olix0r added a commit to linkerd/linkerd2 that referenced this pull request Sep 9, 2021
This release improves error handling so that HTTP metrics include 5XX
responses for common errors.

Logging has also been improved to ensure `inbound` and `outbound` spans
are always present in log messages.

Outbound tap has been fixed to include route labels when service
profiles are configured.

---

* Set tracing spans on policy client (linkerd/linkerd2-proxy#1241)
* build(deps): bump tokio-util from 0.6.7 to 0.6.8 (linkerd/linkerd2-proxy#1240)
* core: fix missing spans in `serve` tasks (linkerd/linkerd2-proxy#1243)
* build(deps): bump thiserror from 1.0.28 to 1.0.29 (linkerd/linkerd2-proxy#1244)
* orig-proto: Avoid emiting HTTP/2 errors for upgraded requests (linkerd/linkerd2-proxy#1245)
* Fix route labels on outbound tap metadata (linkerd/linkerd2-proxy#1247)
* errors: Support contextual error handling strategies (linkerd/linkerd2-proxy#1246)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants