Exemplars store in-memory storage interface #6309
shreyassrivatsan wants to merge 4 commits into prometheus:main
Conversation
Exemplars aren't going to the TSDB, so they shouldn't affect the appender interface.
[FYI: the build on this is broken due to an unrelated TypeScript error, looking into that]
Is the plan to only store them in-memory in the scraper? It seems like it would be nice to be able to store them in a TSDB or a more persistent store.

Yes. The data volumes make storing on disk impractical.
@brian-brazil what are your thoughts on the updated approach in the PR to propagate exemplars? https://github.com/prometheus/prometheus/pull/6309/files#diff-3bbf5d3205ec4ba63d9de26d3e31c776R24 The implementation itself for now is just a placeholder.
They shouldn't hit the storage interface at all.
@brian-brazil I think the new proposal from @shreyassrivatsan is just for in-memory collection. The existing storage/queryable interfaces are not changed, modified, or touched. The words "Storage" and "Queryable" don't need to be a part of the interfaces proposed; they are just generic names picked for reuse due to their similarity to other parts of the code base. I.e. the proposal is a simple struct that receives exemplars, can cycle them out of memory based on a fixed size (or other policy), and can be used to query/retrieve them for a matching label set. This struct is then provided to the scraper. What are your thoughts on that proposal?
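The "simple struct" described above — receive exemplars, cycle them out of memory based on a fixed size, and retrieve them per label set — could be sketched roughly like this. All type and method names here are illustrative stand-ins, not the PR's actual code:

```go
package main

import "fmt"

// Exemplar is a simplified stand-in for the real exemplar type.
type Exemplar struct {
	TraceID string
	Value   float64
	Ts      int64
}

// ringStore keeps, per series key, a fixed-size ring of recent exemplars,
// cycling the oldest one out when the ring is full.
type ringStore struct {
	size int
	m    map[string][]Exemplar
}

func newRingStore(size int) *ringStore {
	return &ringStore{size: size, m: make(map[string][]Exemplar)}
}

// Add appends an exemplar for the series identified by key, evicting the
// oldest exemplar once the fixed size is reached.
func (s *ringStore) Add(key string, e Exemplar) {
	ring := append(s.m[key], e)
	if len(ring) > s.size {
		ring = ring[1:] // drop the oldest
	}
	s.m[key] = ring
}

// Get returns the stored exemplars for an exact series-key match.
func (s *ringStore) Get(key string) []Exemplar {
	return s.m[key]
}

func main() {
	s := newRingStore(2)
	for i := int64(1); i <= 3; i++ {
		s.Add(`up{job="api"}`, Exemplar{Ts: i})
	}
	fmt.Println(len(s.Get(`up{job="api"}`))) // prints 2: the oldest of the three was cycled out
}
```

A real implementation would key on a canonical serialization of labels.Labels and likely use a proper ring buffer rather than re-slicing, but the fixed-size eviction policy is the point here.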
We should avoid clashing with the names of existing important interfaces.
@shreyassrivatsan could you rebase this now that #6292 is merged?

Which should also fix test errors.
Rebased and pushed just the interface changes.
cstyan left a comment

Looks fine but I'll have to sit and think again about the in-memory storage and get back to you.
Initially the Add function looked off to me as well, as it takes labels but the Exemplar struct already has labels. Can someone clarify, or point me to docs (I haven't been able to find any), what the exact exemplar format is? Why does an exemplar have its own label set when it's associated with an existing time series?
pkg/exemplar/interface.go
Get(...) should return []Exemplar; multiple exemplars could match l if only a few labels are passed.
So the idea was it has to be a full equality match on labels. The labels here describe the exact metric including all the labels and the name, so there should only be one. Otherwise the lookup is not that useful. We need to be able to find the exemplar corresponding to the metric.
I can't speak for everyone else, but I think we should follow the same pattern as existing queries: return all matches for the labels provided. That doesn't exclude users from passing labels that would match only a single exemplar, but it's better than erroring because a query for exemplars would return more than one result, which is what the current functions would do IMO.
Ok. But in that case the return value will have to change: it will additionally have to include information on the actual series matched, as the exemplar is not useful without the series itself. So it returns something like
type SeriesExemplar struct {
	l labels.Labels
	e Exemplar
}
and the method becomes Get(l labels.Labels, t int64) ([]SeriesExemplar, bool, error)
I assume @shreyassrivatsan's original idea was to keep things really as simple as possible. With exact matches, we can just have all the exemplars in a simple hash map. With label matching, we'll find ourselves once more in a situation where we have to implement a database (or do full scans of the exemplars in memory). Is that part of the plan?
Could we perhaps quickly think through the typical query pattern for exemplars? Will we commonly need label matching at all?
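The "simple hash map" point above can be made concrete: with exact-match-only semantics, the whole store reduces to a map keyed by a deterministic serialization of the label set, with O(1) Add and Get and no index or scan machinery. This is an illustrative sketch with simplified stand-in types, not the PR's code:

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Label is a minimal stand-in for a pkg/labels name/value pair.
type Label struct{ Name, Value string }

type Exemplar struct {
	TraceID string
	Ts      int64
}

// key serializes a label set deterministically so it can index a map.
// It sorts in place, which is fine for this sketch; the real code would
// hash a canonical labels.Labels more efficiently.
func key(ls []Label) string {
	sort.Slice(ls, func(i, j int) bool { return ls[i].Name < ls[j].Name })
	var b strings.Builder
	for _, l := range ls {
		fmt.Fprintf(&b, "%s=%q,", l.Name, l.Value)
	}
	return b.String()
}

// exactStore holds one exemplar per exact label set.
type exactStore struct {
	m map[string]Exemplar
}

func (s *exactStore) Add(ls []Label, e Exemplar) { s.m[key(ls)] = e }

func (s *exactStore) Get(ls []Label) (Exemplar, bool) {
	e, ok := s.m[key(ls)]
	return e, ok
}

func main() {
	s := &exactStore{m: make(map[string]Exemplar)}
	s.Add([]Label{{"__name__", "http_requests_total"}, {"job", "api"}},
		Exemplar{TraceID: "abc123", Ts: 42})
	// Lookup is order-insensitive because key() sorts the labels.
	e, ok := s.Get([]Label{{"job", "api"}, {"__name__", "http_requests_total"}})
	fmt.Println(ok, e.TraceID) // prints: true abc123
}
```

Partial label matching, by contrast, would need either a full scan over this map or a real inverted index — which is exactly the "implement a database" concern raised above.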
I think a Grafana dashboard is going to be a pretty common use case. If a dashboard panel has a single query for something like http_request_duration_seconds, without enough labels to narrow the query down to a single series, I don't think it's reasonable for Grafana to have to do the work to split all the results into individual label sets and make a query per unique series.
There's also the issue that many histogram queries aggregate away labels that we'd need to uniquely identify which exemplar to select.
@shreyassrivatsan can you elaborate on your use case and what M3 needs from the API, and hopefully we can get some more details from @davkal as well.
Ah. So I did not mean that Grafana makes a query and then retrieves the exemplars. I meant a Prometheus query will first query the series, get the exemplars and then apply functions. I am simplifying a lot here, but essentially /api/v1/query and /api/v1/query_range will do
func query(q string) series {
	ls := querySeries(q) // returns []labels.Labels
	if exemplarsRequested {
		for _, l := range ls {
			queryExemplar(l)
		}
	}
	// apply functions
}
The resulting response will probably contain something like the below:
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "up",
          "job": "prometheus",
          "instance": "localhost:9090"
        },
        "value": [ 1435781451.781, "1", {<exemplar>} ]
      }
    ]
  }
}
Eventually, for full exemplar support, functions would have to be updated to handle exemplars. For example, aggregation functions could have a pick-one policy.
I think this approach makes a lot of sense. However, it requires intertwining the existing API with exemplar support. My impression from the Prometheus side of things was that the plan is to implement something experimental as a completely separate API, so that consumers of that API can play with it and prove or disprove its usefulness. We have had a few chats now with @juliusv and @davkal, wherein we realized that it is actually very hard for a visualization frontend like Grafana to craft dedicated queries for exemplars (for example in case the user clicks on a peak of a latency graph or on a tile of a histogram heatmap).
Returning exemplars alongside normal queries, if so requested, is solving that problem, but it's quite invasive (as the exemplar support has to be piped through the whole query engine, including smart decisions what to pick for aggregations). If we went down that alley, the exact-match method as suggested here would indeed make sense as illustrated by your pseudo code. (We might want to use a time range and not just a single timestamp in the signature then.)
While I like the idea, I'm pretty sure amending everything in the query engine is not feasible for an experimental feature. I'm still thinking about ways to do something for the MVP that leaves the current query API and engine as is…
Ok, that makes sense. And yes, this is going more towards what will be the right solution if we decide to pipe this through the query engine. I still do think the exact timestamp makes for a faster implementation as we will have access to that as a part of the results for a time series query.
For an experimental implementation using a time range and partial label match is probably a better approach, but implementing the backing store will be some more work. I made some changes to the interface. Get still does exact matches for a given timestamp. Query does partial label matches, takes a time range and returns a list of SeriesExemplars.
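The split described here — Get doing exact matches at a single timestamp, Query doing partial label matches over a time range — could look roughly like this brute-force in-memory sketch. The types are simplified stand-ins (the real code uses pkg/labels.Labels), and a real store would need an index rather than a linear scan:

```go
package main

import "fmt"

// Labels is a simplified stand-in for labels.Labels.
type Labels map[string]string

type Exemplar struct {
	Ts    int64
	Value float64
}

// SeriesExemplar pairs an exemplar with the labels of its series.
type SeriesExemplar struct {
	L Labels
	E Exemplar
}

// memStore is a brute-force sketch of both lookup styles.
type memStore struct {
	data []SeriesExemplar
}

func (s *memStore) Add(l Labels, e Exemplar) {
	s.data = append(s.data, SeriesExemplar{L: l, E: e})
}

// Get returns the exemplar whose series labels equal l exactly at time t.
func (s *memStore) Get(l Labels, t int64) (Exemplar, bool) {
	for _, se := range s.data {
		if se.E.Ts == t && labelsEqual(se.L, l) {
			return se.E, true
		}
	}
	return Exemplar{}, false
}

// Query returns all exemplars in [start, end] whose series labels contain
// every pair in the (possibly partial) selector sel.
func (s *memStore) Query(sel Labels, start, end int64) []SeriesExemplar {
	var out []SeriesExemplar
	for _, se := range s.data {
		if se.E.Ts < start || se.E.Ts > end {
			continue
		}
		match := true
		for k, v := range sel {
			if se.L[k] != v {
				match = false
				break
			}
		}
		if match {
			out = append(out, se)
		}
	}
	return out
}

func labelsEqual(a, b Labels) bool {
	if len(a) != len(b) {
		return false
	}
	for k, v := range a {
		if b[k] != v {
			return false
		}
	}
	return true
}

func main() {
	s := &memStore{}
	s.Add(Labels{"__name__": "up", "job": "api"}, Exemplar{Ts: 10, Value: 1})
	s.Add(Labels{"__name__": "up", "job": "db"}, Exemplar{Ts: 20, Value: 0})
	// A partial selector matches both series over the full range.
	fmt.Println(len(s.Query(Labels{"__name__": "up"}, 0, 100))) // prints 2
}
```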
My idea for how the (MVP) query flow and UI in Grafana could look like:
The Grafana user creating a dashboard would be required to pair each graphing query with an additional "query", which would be required to just be a PromQL label selector without any aggregations or functions, e.g. request_latency_bucket{code=~"2..",job="foo"}. The exemplar API would take exactly that query string (i.e. a naked label matcher in PromQL syntax) and a time range to return exemplars, possibly sampled to avoid too many return values. Grafana could then use the provided string to query for exemplars upon a click event (could include the le label when in a histogram heatmap, but would also work without it when clicking on a latency graph). Further server-side filtering by exemplar or specific downsampling value would not be in the MVP but could be done in the browser for now.
A selection of exemplars with links to logs or traces could then be shown in a pop-up, or exemplars could show up as clickable "stars" in a latency graph.
@davkal does that make sense?
It's for things like a trace id for the exemplar.
pkg/exemplar/interface.go
Perhaps this should be a struct then?
type SeriesExemplar struct {
	Labels      labels.Labels
	TsExemplars []TsExemplar
}
In that way, the consumer can actually see what labels matched the labels in the query.
pkg/exemplar/interface.go
To really implement a generic query API, we would need regexp matchers and negative matchers, too.
That's where I'm thinking we should perhaps stay with the Get-only approach, just that it takes a time range, not just one timestamp, and returns a time series of exemplars for that one exact label set. The implementation of the externally visible API would need to hit the normal storage to resolve the label matchers into concrete series first. I think that's fine for an MVP.
Couldn't we do this via Query(start, end int64, matchers ...*labels.Matcher)?
Sure, but then we are really talking about implementing a whole query engine.
By doing this via a call to the existing query engine, we would waste a bit of resources (as we only want to get the matching label sets and would never access the actual time series data), but it would be much simpler.
Also, we then wouldn't need what I commented above (the separate struct to show in the answer which exact label set is attached to each returned series of exemplars).
One more round of changes. I got rid of Query for the time being, as "query" in a way suggests partial matches. Added a GetRange which takes a range and returns a list of exemplars based on exact label matches.
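The resulting interface shape, as described above, might read roughly like the following. The names and signatures are a hypothetical reconstruction, not the contents of pkg/exemplar/interface.go, and the types are simplified stand-ins; a toy slice-backed implementation is included just to show the semantics:

```go
package main

import "fmt"

// Labels is a stand-in for labels.Labels (a serialized label set here).
type Labels string

type Exemplar struct{ Ts int64 }

// ExemplarStore is the sketched interface: Get does an exact label match
// at one timestamp; GetRange does an exact label match over [start, end].
type ExemplarStore interface {
	Add(l Labels, e Exemplar)
	Get(l Labels, t int64) (Exemplar, bool)
	GetRange(l Labels, start, end int64) []Exemplar
}

// sliceStore is a toy implementation backing each series with a slice.
type sliceStore struct {
	m map[Labels][]Exemplar
}

func (s *sliceStore) Add(l Labels, e Exemplar) { s.m[l] = append(s.m[l], e) }

func (s *sliceStore) Get(l Labels, t int64) (Exemplar, bool) {
	for _, e := range s.m[l] {
		if e.Ts == t {
			return e, true
		}
	}
	return Exemplar{}, false
}

func (s *sliceStore) GetRange(l Labels, start, end int64) []Exemplar {
	var out []Exemplar
	for _, e := range s.m[l] {
		if e.Ts >= start && e.Ts <= end {
			out = append(out, e)
		}
	}
	return out
}

func main() {
	var s ExemplarStore = &sliceStore{m: make(map[Labels][]Exemplar)}
	s.Add(`up{job="api"}`, Exemplar{Ts: 10})
	s.Add(`up{job="api"}`, Exemplar{Ts: 20})
	fmt.Println(len(s.GetRange(`up{job="api"}`, 0, 15))) // prints 1: only Ts=10 is in range
}
```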
Signed-off-by: Shreyas Srivatsan <[email protected]>
Based on a couple of chats some of the people involved here had, I have to conclude that there are quite different opinions on how this interface should look. I was aiming for "as simple as possible, to be used as the ultimate underlying storage primitive", but one might as well make the argument that the storage interface should be rich in features, with the implementation then deciding whether to re-use existing indexing and matching functionality or run its own. It might just be wrong to start with the design of the internal interface. Perhaps it's better to get a document together for the big picture of how everything plays together (exposition → ingestion → storage update on the one hand, and then Grafana panel → query for exemplars → storage retrieval).
I'd personally start with the API that Grafana wants, and then build the simplest possible thing to provide that. #6309 (comment) sounds about right to me at first glance. For an MVP implementation I'd suggest having them tied to targets similarly to how we do metadata, and brute-forcing the lookup from there. We won't really know what Grafana wants until we can get an API they can play with, so I'd go for throwing something together over lots of up-front design.
@cstyan Where are we with this review?
@brian-brazil The interface here doesn't really prevent us from implementing the query API that was discussed with the Grafana team, but it makes the implementation of the API not as nice. I think it also limits options for those building long-term storage ecosystem projects that may intend to store exemplars "forever". I think what I've implemented in #6635 makes more sense given the end goals of how we see exemplars being useful to a system like Grafana dashboards.
Superseded by #6635
Initial pass at passing exemplars through the scrape logic.
This is one option we could take, by changing the Appender interface's Add and AddFast methods. The current draft shows the changes involved at a high level. Still need to deal with optimally handling the no-exemplar case.

An alternate approach we could take is to not change the current Appender interface, but add a new interface AppenderWithExemplars and then upcast the appender provided: appender, ok := app.(AppenderWithExemplars), and decide whether to handle exemplars based on what type of appender has been configured. There are a few cases to take care of here.
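The alternate approach — keep the existing Appender interface unchanged and probe for exemplar support with a type assertion — could be sketched like this. All types and method signatures here are simplified stand-ins, not the actual scrape-package code:

```go
package main

import "fmt"

type Exemplar struct{ TraceID string }

// Appender is a simplified stand-in for the existing interface.
type Appender interface {
	Add(metric string, t int64, v float64) error
}

// AppenderWithExemplars is the optional extension interface: an appender
// that also knows how to take an exemplar alongside a sample.
type AppenderWithExemplars interface {
	Appender
	AddWithExemplar(metric string, t int64, v float64, e Exemplar) error
}

type basicAppender struct{}

func (basicAppender) Add(metric string, t int64, v float64) error { return nil }

// exemplarAppender supports both interfaces via embedding.
type exemplarAppender struct{ basicAppender }

func (exemplarAppender) AddWithExemplar(metric string, t int64, v float64, e Exemplar) error {
	return nil
}

// ingest decides at runtime whether the configured appender can take
// exemplars, falling back to the plain Add path otherwise.
func ingest(app Appender, metric string, t int64, v float64, e *Exemplar) error {
	if ea, ok := app.(AppenderWithExemplars); ok && e != nil {
		return ea.AddWithExemplar(metric, t, v, *e)
	}
	return app.Add(metric, t, v)
}

func main() {
	_, ok := interface{}(exemplarAppender{}).(AppenderWithExemplars)
	fmt.Println(ok) // prints true: exemplar-capable appender detected
}
```

The upside of this shape is that existing Appender implementations keep compiling unchanged; the cost is a runtime type assertion on the hot scrape path and a second code path to test.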