Skip to content

Add an i/o error label to http metrics#512

Merged
olix0r merged 2 commits intomasterfrom
ver/io-error
May 13, 2020
Merged

Add an i/o error label to http metrics#512
olix0r merged 2 commits intomasterfrom
ver/io-error

Conversation

@olix0r
Copy link
Member

@olix0r olix0r commented May 13, 2020

This change modifies HTTP error-labeling to detect I/O errors
and label them explicitly. Previously all I/O errors were reported as
unexpected.

Fixes linkerd/linkerd2#4364

@olix0r olix0r requested review from a team and adleong May 13, 2020 16:18
This change modifies HTTP error-labeling to detect I/O errors
and label them explicitly. Previously all I/O errors were reported as
`unexpected`.

Fixes linkerd/linkerd2#4364
@adleong
Copy link
Member

adleong commented May 13, 2020

Is this as granular as we can get here? Can we distinguish between different IO errors at all? For example, distinguish between connection refused and connection closed?

@hawkw
Copy link
Contributor

hawkw commented May 13, 2020

Is this as granular as we can get here? Can we distinguish between different IO errors at all? For example, distinguish between connection refused and connection closed?

We already have some code for formatting labels based on OS error codes (https://github.com/linkerd/linkerd2-proxy/blob/d213c1373f1f5ec479a59a0a388d4373dcb3cde4/linkerd/proxy/transport/src/metrics/errno.rs) that we're already using for TCP metrics:

pub fn record_close(&mut self, eos: Option<Errno>) {
let duration = self.opened_at.elapsed();
// When closed, the metrics structure is dropped so that no further
// updates can occur (i.e. so that an additional close won't be recorded
// on Drop).
if let Some(m) = self.metrics.take() {
let mut m = m.lock().expect("metrics registry poisoned");
m.open_connections.decr();
let class = m.by_eos.entry(Eos(eos)).or_insert_with(EosMetrics::default);
class.close_total.incr();
class.connection_duration.add(duration);
}
}

Possibly we could reuse this?

@olix0r
Copy link
Member Author

olix0r commented May 13, 2020

Is this as granular as we can get here? Can we distinguish between different IO errors at all? For example, distinguish between connection refused and connection closed?

You should never see a connection refused for http errors. We never dispatch requests to endpoints that are disconnected. This really can only indicate that a connection was terminated unexpectedly, as far as I understand.

If someone else would like to propose an alternative that handles more metadata, please do. If we want to get something into this week's edge, I think we should take this approach.

Copy link
Member

@adleong adleong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, this doesn't hurt. Though, I'm still not totally clear on if it helps. I don't know what "unexpected" errors there could be previously other than IO errors.

@olix0r olix0r merged commit fd5a6aa into master May 13, 2020
@olix0r olix0r deleted the ver/io-error branch May 13, 2020 18:48
olix0r added a commit to linkerd/linkerd2 that referenced this pull request May 13, 2020
This release adds special handling for I/O errors in HTTP responses so
that an `errno` label is included to describe the underlying errors
in the proxy's metrics.

---

* Add an `i/o` error label to http metrics (linkerd/linkerd2-proxy#512)
olix0r added a commit that referenced this pull request May 13, 2020
This change modifies HTTP error-labeling to detect I/O errors
and label them explicitly. Previously all I/O errors were reported as
`unexpected`.

Additionally, an `errno` label is included when possible.

Fixes linkerd/linkerd2#4364
olix0r added a commit to linkerd/linkerd2 that referenced this pull request May 13, 2020
This release adds special handling for I/O errors in HTTP responses so
that an `errno` label is included to describe the underlying errors
in the proxy's metrics.

---

* Add an `i/o` error label to http metrics (linkerd/linkerd2-proxy#512)
olix0r added a commit that referenced this pull request May 14, 2020
This change modifies HTTP error-labeling to detect I/O errors
and label them explicitly. Previously all I/O errors were reported as
`unexpected`.

Additionally, an `errno` label is included when possible.

Fixes linkerd/linkerd2#4364
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Include io error message on http metrics

3 participants