Add Exemplar Remote Write support#8296
Add Exemplar Remote Write support#8296cstyan merged 26 commits intoprometheus:mainfrom cstyan:exemplar-remote-write
Conversation
|
What way do you want all of this reviewed? Everything in this PR? Also the exemplars have to go before the samples in the WAL, otherwise you can't tell if a sample at the very end of the WAL has an exemplar or not. |
|
I'd prefer to prioritize merging #6635, but Ganesh specifically asked for the WAL and remote write changes in a separate PR so he could review any changes to TSDB code more easily. If you want to review this PR then you only need to look at the most recent commit. Once #6635 is merged I'll rebase obviously.
Can you elaborate here, why is this a requirement and where is the code that says we can't match up a sample at the end of a segment and an exemplar at the beginning of a segment? Nothing I'm doing requires explicit matching of an exemplar to a sample in the WAL. |
|
Grand, is #6635 ready for review? You still seemed to be working on it.
With the current code, if the last entry in the WAL is a sample how do you know that you've now read everything from the WAL about that sample?
Just because you don't need it right now doesn't mean we should make it impossible for someone else who wants it in future. It's only reordering some code. The reason why this is important will be more apparent with metadata, but the same applies here so we should be consistent. |
There's certainly some work still left, but I think it's ready for another review. I've also responded to any unresolved comments.
The exemplar would at least have the scrape timestamp for lining it up with the sample, but I see how that could get messy, especially with the addition of metadata. I can see the reasoning behind reordering now since we would want to avoid having to change the WAL format in the future. I was mostly wondering if there was a use case or current requirement that I was breaking by not writing the exemplar record before the sample record, but seems that's not the case. |
I'll see about getting to that this week then.
Not specifically, but it may come up when I get to reviewing this PR. I believe I had made at least one comment touching on this general area - and taking a quick peek at this PR the question of how a exemplar is tied to a specific sample remains unresolved. |
codesome
left a comment
There was a problem hiding this comment.
A quick glance over the WAL part
|
I notice that the api returns {
"status": "success",
"data": null
}The query api does not return |
|
@codesome if you could have another look at the WAL code and format documentation and just let me know which open comments you think still need to be dealt with. I think there was initially some confusion around what the exemplar labels were (vs series labels). Regarding |
csmarchbanks
left a comment
There was a problem hiding this comment.
I only took a look at the remote side, the exemplar vs sample behavior is interesting, not sure how I feel about it quite yet.
|
@csmarchbanks let me know what you think of the latest commit, bit a refactor to include RW metrics for exemplars in a more correct way, instead of having some exemplars specific metrics and others that were named with |
bwplotka
left a comment
There was a problem hiding this comment.
Looking good, some comments so far 🤗
bwplotka
left a comment
There was a problem hiding this comment.
Looks good overall! (:
@cstyan @codesome how bad would be if we would add exemplar directly to the sample (as optional) thing, in proto? Just curious, what is better here. OpenMetrics defined it as inside sample: https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto#L357
cstyan
left a comment
There was a problem hiding this comment.
Addressed a few of the comments, but mostly replied to others. Seems like there might be some confusion as to why exemplar storage was implemented the way it was and how that impacts remote write, if there's still questions around this feel free to msg me on slack.
There are pros and cons for both from what I see. Depending on how the remote storage is setup, sending samples and exemplars in separate requests might be beneficial if exemplar storage could become a bottleneck for samples (but splitting that request could be taken care at remote storage too, but will it affect the write SLOs?). (that said, irrespective of that, I think Prometheus itself will wait before sending more samples if exemplar is taking time to preserve the order in a series?) And if we send samples and exemplars in same request, we could save some network bandwidth and costs by not duplicating the series labels :) Given the flexibility of remote storage, I feel having exemplar with the samples is better. |
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
…ith an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars Signed-off-by: Martin Disibio <[email protected]>
…ove the alloc when building final request, keep sub-slices in separate buffers for re-use Signed-off-by: Martin Disibio <[email protected]>
Signed-off-by: Martin Disibio <[email protected]>
Signed-off-by: Martin Disibio <[email protected]>
… correctly Signed-off-by: Martin Disibio <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
ms precision Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
names. Also refactor sample/exemplar to enqueue prompb types. Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
send exemplars over remote write. Signed-off-by: Callum Styan <[email protected]>
and log them all to WAL before adding them to exemplar storage. Signed-off-by: Callum Styan <[email protected]>
codesome
left a comment
There was a problem hiding this comment.
LGTM on the TSDB side! Last few comments
tsdb/exemplar.go
Outdated
| if err == storage.ErrOutOfOrderExemplar || err != storage.ErrDuplicateExemplar { | ||
| return err | ||
| } | ||
| // Duplicate exemplar, noop | ||
| return nil |
There was a problem hiding this comment.
How about this? Easier to read
| if err == storage.ErrOutOfOrderExemplar || err != storage.ErrDuplicateExemplar { | |
| return err | |
| } | |
| // Duplicate exemplar, noop | |
| return nil | |
| if err == storage.ErrDuplicateExemplar { | |
| // Duplicate exemplar, noop. | |
| return nil | |
| } | |
| return err |
tsdb/head.go
Outdated
| level.Warn(h.logger).Log("msg", "Unexpected error when replaying WAL on exemplar record", "err", err) | ||
| } | ||
| } | ||
| wg.Done() |
There was a problem hiding this comment.
Not a blocker nit: I suggest having this as defer at the beginning. Avoids any regression later.
| // At the moment the only possible error here is out of order exemplars, which we shouldn't see when | ||
| // replaying the WAL, so lets just log the error if it's not that type. | ||
| err = h.exemplars.AddExemplar(ms.lset, exemplar.Exemplar{Ts: e.T, Value: e.V, Labels: e.Labels}) | ||
| if err != nil && err == storage.ErrOutOfOrderExemplar { |
There was a problem hiding this comment.
Did you mean this? (Also should we check for duplicate error too?)
| if err != nil && err == storage.ErrOutOfOrderExemplar { | |
| if err != nil && err != storage.ErrOutOfOrderExemplar { |
There was a problem hiding this comment.
exemplarStorage.AddExemplar will never actually return the duplicate error, since we don't really treat it as an error (duplicates are expected, just not ingested).
There was a problem hiding this comment.
Did you mean this? also yes
Signed-off-by: Callum Styan <[email protected]>
Signed-off-by: Callum Styan <[email protected]>
csmarchbanks
left a comment
There was a problem hiding this comment.
👍 From the remote write side.
roidelapluie
left a comment
There was a problem hiding this comment.
2 minor comments, so small they can be addressed later.
The rest LGTM.
Signed-off-by: Callum Styan <[email protected]>
Moved changes related to remote write (including writing exemplars to the WAL) to this PR.