Skip to content

Add Exemplar Remote Write support#8296

Merged
cstyan merged 26 commits intoprometheus:mainfrom
cstyan:exemplar-remote-write
May 6, 2021
Merged

Add Exemplar Remote Write support#8296
cstyan merged 26 commits intoprometheus:mainfrom
cstyan:exemplar-remote-write

Conversation

@cstyan
Copy link
Copy Markdown
Member

@cstyan cstyan commented Dec 16, 2020

Moved changes related to remote write (including writing exemplars to the WAL) to this PR.

@brian-brazil
Copy link
Copy Markdown
Contributor

What way do you want all of this reviewed? Everything in this PR?

Also the exemplars have to go before the samples in the WAL, otherwise you can't tell if a sample at the very end of the WAL has an exemplar or not.

@cstyan
Copy link
Copy Markdown
Member Author

cstyan commented Dec 16, 2020

I'd prefer to prioritize merging #6635, but Ganesh specifically asked for the WAL and remote write changes in a separate PR so he could review any changes to TSDB code more easily. If you want to review this PR then you only need to look at the most recent commit. Once #6635 is merged I'll rebase obviously.

Also the exemplars have to go before the samples in the WAL, otherwise you can't tell if a sample at the very end of the WAL has an exemplar or not.

Can you elaborate here, why is this a requirement and where is the code that says we can't match up a sample at the end of a segment and an exemplar at the beginning of a segment?

Nothing I'm doing requires explicit matching of an exemplar to a sample in the WAL.

@brian-brazil
Copy link
Copy Markdown
Contributor

Grand, is #6635 ready for review? You still seemed to be working on it.

Can you elaborate here, why is this a requirement and where is the code that says we can't match up a sample at the end of a segment and an exemplar at the beginning of a segment?

With the current code, if the last entry in the WAL is a sample how do you know that you've now read everything from the WAL about that sample?

Nothing I'm doing requires explicit matching of an exemplar to a sample in the WAL.

Just because you don't need it right now doesn't mean we should make it impossible for someone else who wants it in future. It's only reordering some code. The reason why this is important will be more apparent with metadata, but the same applies here so we should be consistent.

@cstyan
Copy link
Copy Markdown
Member Author

cstyan commented Dec 16, 2020

Grand, is #6635 ready for review? You still seemed to be working on it.

There's certainly some work still left, but I think it's ready for another review. I've also responded to any unresolved comments.

With the current code, if the last entry in the WAL is a sample how do you know that you've now read everything from the WAL about that sample?

Just because you don't need it right now doesn't mean we should make it impossible for someone else who wants it in future. It's only reordering some code. The reason why this is important will be more apparent with metadata, but the same applies here so we should be consistent.

The exemplar would at least have the scrape timestamp for lining it up with the sample, but I see how that could get messy, especially with the addition of metadata. I can see the reasoning behind reordering now since we would want to avoid having to change the WAL format in the future. I was mostly wondering if there was a use case or current requirement that I was breaking by not writing the exemplar record before the sample record, but seems that's not the case.

@brian-brazil
Copy link
Copy Markdown
Contributor

There's certainly some work still left, but I think it's ready for another review. I've also responded to any unresolved comments.

I'll see about getting to that this week then.

I was mostly wondering if there was a use case or current requirement that I was breaking by not writing the exemplar record before the sample record, but seems that's not the case.

Not specifically, but it may come up when I get to reviewing this PR. I believe I had made at least one comment touching on this general area - and taking a quick peek at this PR the question of how a exemplar is tied to a specific sample remains unresolved.

Copy link
Copy Markdown
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A quick glance over the WAL part

@roidelapluie
Copy link
Copy Markdown
Member

I notice that the api returns

{
  "status": "success",
  "data": null
}

The query api does not return null, but returns [], which I find better.

Base automatically changed from master to main February 23, 2021 19:36
@cstyan
Copy link
Copy Markdown
Member Author

cstyan commented Mar 16, 2021

@codesome if you could have another look at the WAL code and format documentation and just let me know which open comments you think still need to be dealt with. I think there was initially some confusion around what the exemplar labels were (vs series labels).

Regarding Can we get rid of this first row? The first exemplar could have the full ID and the timestamp, and remaining examplars to have the delta. , my question would be is there a reason you think we should change the format that way, or does it just seem like the first row is unnecessary? I just based the exemplars records on samples records, with the addition of the exemplars label set. I don't have any context for why those records were formatted that way vs what you're proposing.

Copy link
Copy Markdown
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I only took a look at the remote side, the exemplar vs sample behavior is interesting, not sure how I feel about it quite yet.

@cstyan
Copy link
Copy Markdown
Member Author

cstyan commented Apr 8, 2021

@csmarchbanks let me know what you think of the latest commit, bit a refactor to include RW metrics for exemplars in a more correct way, instead of having some exemplars specific metrics and others that were named with samples but included both.

Copy link
Copy Markdown
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some initial review, in general looks fine

Edit: I did not check the remote write part

@bwplotka bwplotka changed the title WIP Exemplar remote write Add Exemplar Remote Write support Apr 11, 2021
Copy link
Copy Markdown
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, some comments so far 🤗

Copy link
Copy Markdown
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good overall! (:

@cstyan @codesome how bad would be if we would add exemplar directly to the sample (as optional) thing, in proto? Just curious, what is better here. OpenMetrics defined it as inside sample: https://github.com/open-telemetry/opentelemetry-proto/blob/main/opentelemetry/proto/metrics/v1/metrics.proto#L357

Copy link
Copy Markdown
Member Author

@cstyan cstyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Addressed a few of the comments, but mostly replied to others. Seems like there might be some confusion as to why exemplar storage was implemented the way it was and how that impacts remote write, if there's still questions around this feel free to msg me on slack.

@codesome
Copy link
Copy Markdown
Member

how bad would be if we would add exemplar directly to the sample (as optional) thing, in proto?

There are pros and cons for both from what I see.

Depending on how the remote storage is setup, sending samples and exemplars in separate requests might be beneficial if exemplar storage could become a bottleneck for samples (but splitting that request could be taken care at remote storage too, but will it affect the write SLOs?). (that said, irrespective of that, I think Prometheus itself will wait before sending more samples if exemplar is taking time to preserve the order in a series?)

And if we send samples and exemplars in same request, we could save some network bandwidth and costs by not duplicating the series labels :)

Given the flexibility of remote storage, I feel having exemplar with the samples is better.

cstyan and others added 18 commits May 5, 2021 22:25
Signed-off-by: Callum Styan <[email protected]>
…ith an exemplar. Fix dropping exemplar for missing series. Add tests for queue_manager sending exemplars

Signed-off-by: Martin Disibio <[email protected]>
…ove the alloc when building final request, keep sub-slices in separate buffers for re-use

Signed-off-by: Martin Disibio <[email protected]>
names. Also refactor sample/exemplar to enqueue prompb types.

Signed-off-by: Callum Styan <[email protected]>
send exemplars over remote write.

Signed-off-by: Callum Styan <[email protected]>
and log them all to WAL before adding them to exemplar storage.

Signed-off-by: Callum Styan <[email protected]>
Copy link
Copy Markdown
Member

@codesome codesome left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on the TSDB side! Last few comments

tsdb/exemplar.go Outdated
Comment on lines +210 to +214
if err == storage.ErrOutOfOrderExemplar || err != storage.ErrDuplicateExemplar {
return err
}
// Duplicate exemplar, noop
return nil
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about this? Easier to read

Suggested change
if err == storage.ErrOutOfOrderExemplar || err != storage.ErrDuplicateExemplar {
return err
}
// Duplicate exemplar, noop
return nil
if err == storage.ErrDuplicateExemplar {
// Duplicate exemplar, noop.
return nil
}
return err

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤦

tsdb/head.go Outdated
level.Warn(h.logger).Log("msg", "Unexpected error when replaying WAL on exemplar record", "err", err)
}
}
wg.Done()
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker nit: I suggest having this as defer at the beginning. Avoids any regression later.

// At the moment the only possible error here is out of order exemplars, which we shouldn't see when
// replaying the WAL, so lets just log the error if it's not that type.
err = h.exemplars.AddExemplar(ms.lset, exemplar.Exemplar{Ts: e.T, Value: e.V, Labels: e.Labels})
if err != nil && err == storage.ErrOutOfOrderExemplar {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean this? (Also should we check for duplicate error too?)

Suggested change
if err != nil && err == storage.ErrOutOfOrderExemplar {
if err != nil && err != storage.ErrOutOfOrderExemplar {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exemplarStorage.AddExemplar will never actually return the duplicate error, since we don't really treat it as an error (duplicates are expected, just not ingested).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean this? also yes

Copy link
Copy Markdown
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 From the remote write side.

Copy link
Copy Markdown
Member

@roidelapluie roidelapluie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 minor comments, so small they can be addressed later.

The rest LGTM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants