-
Notifications
You must be signed in to change notification settings - Fork 10.3k
Exemplars store in-memory storage interface #6309
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Changes from all commits
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| // Copyright 2019 The Prometheus Authors | ||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||
| // you may not use this file except in compliance with the License. | ||
| // You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, software | ||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
|
|
||
| package exemplar | ||
|
|
||
| import ( | ||
| "github.com/prometheus/prometheus/pkg/labels" | ||
| ) | ||
|
|
||
| // Exemplars is an exemplar storage backend. | ||
| type Exemplars interface { | ||
| // Get is a fast lookup to find an exemplar with an exact match on labels | ||
| // and timestamp provided (so this can be O(1) lookup). | ||
| Get(l labels.Labels, t int64) (Exemplar, bool, error) | ||
|
|
||
| // GetRange is a query to find exemplars with exact matches on labels | ||
| // and timestamp ranges provided (this is a range query). Query returns | ||
| // a list of Exemplar for the label match. | ||
| GetRange(l labels.Labels, start, end int64) ([]Exemplar, error) | ||
|
|
||
| // Add an exemplar, it will be retained based on configuration at | ||
| // the construction of the exmplar storage. | ||
| Add(l labels.Labels, t int64, e Exemplar) error | ||
|
|
||
| // Close the storage. | ||
| Close() error | ||
| } | ||
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get(...)should return[]Exemplar, multiple exemplars could matchlif not many labels are passed.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the idea was it has to be a full equality match on labels. The labels here describe the exact metric including all the labels and the name, so there should only be one. Otherwise the lookup is not that useful. We need to be able to find the exemplar corresponding to the metric.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't speak for everyone else but I think we should follow the same pattern as existing queries; return all matches for the labels provided. That doesn't exclude users from passing labels that would match only a single exemplar, but is better than
error; query for exemplar would return more than one resultin the case of the current functions IMO.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.. But in that case the return value will have to change.. It will additionally have to include information of the actual series matched as the exemplar is not useful without the series itself. So returns something like
and the method becomes
Get(l labels.Labels, t int64) ([]SeriesExemplar, bool, error)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume @shreyassrivatsan 's original idea was to keep things really as simple as possible. With exact matches, we can just have all the exemplars in a simple hash map. With label matching, we'll find ourselves once more in a situation where we have to implement a database (or do full scans of the exemplars in memory). Is that part of the plan?
Could we perhaps quickly think through the typical query pattern for exemplars? Will we commonly need label matching at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a Grafana dashboard is going to be a pretty common usecase. If a dashboard panel has a single query for something like
http_request_duration_seconds, without enough labels to narrow the query down to a single series, I don't think it's reasonable for Grafana to have to do the work to split all the results into individual label sets and make a query per unique series.There's also the issue that many histogram queries aggregate away labels that we'd need to uniquely identify which exemplar to select.
@shreyassrivatsan can you elaborate on your use case and what M3 needs from the API, and hopefully we can get some more details from @davkal as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. So I did not mean that grafana makes a query and then retrieves the exemplars. I meant a prometheus query will first query the series, get the exemplars and then apply functions.. I am simplifying a lot here, but essentially
/api/v1/queryand/api/v1/query_rangewill doThe resulting response will probably contain something maybe like the below
Eventually for full exemplar support , functions would have to be updated to handle exemplars. For example, aggregation functions can have a pick one policy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this approach makes a lot of sense. However, it requires to intertwine the existing API with exemplar support. My impression from the Prometheus side of things was that the plan is to implement something experimental that is a completely separate API so that consumers of that API can play with it and prove or disprove its usefulness. We had a few chats now with @juliusv and @davkal, wherein we realized that it is actually very hard for a visualization frontend like Grafana to craft dedicated queries for exemplars (for example in case the user clicks on a peak of a latency graph or on a tile of a histogram heatmap).
Returning exemplars alongside normal queries, if so requested, is solving that problem, but it's quite invasive (as the exemplar support has to be piped through the whole query engine, including smart decisions what to pick for aggregations). If we went down that alley, the exact-match method as suggested here would indeed make sense as illustrated by your pseudo code. (We might want to use a time range and not just a single timestamp in the signature then.)
While I like the idea, I'm pretty sure amending everything in the query engine is not feasible for an experimental feature. I'm still thinking about ways how to do something for the MVP that leaves the current query API and engine as is…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, that makes sense. And yes, this is going more towards what will be the right solution if we decide to pipe this through the query engine. I still do think the exact timestamp makes for a faster implementation as we will have access to that as a part of the results for a time series query.
For an experimental implementation using a time range and partial label match is probably a better approach, but implementing the backing store will be some more work. I made some changes to the interface.
Getstill does exact matches for a given timestamp.Querydoes partial label matches, takes a time range and returns a list ofSeriesExemplars.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea for how the (MVP) query flow and UI in Grafana could look like:
The Grafana user creating a dashboard would be required to pair each graphing query with an additional "query", which would be required to just be a PromQL label selector without any aggregations or functions, e.g.
request_latency_bucket{code=~"2..",job="foo"}. The exemplar API would take exactly that query string (i.e. a naked label matcher in PromQL syntax) and a time range to return exemplars, possibly sampled to avoid too many return values. Grafana could then use the provided string to query for exemplars upon a click event (could include thelelabel when in a histogram heatmap, but would also work without it when clicking on a latency graph). Further server-side filtering by exemplar or specific downsampling value would not be in the MVP but could be done in the browser for now.A selection of exemplars with links to logs or traces could then be shown in a pop-up, or exemplars could show up as clickable "stars" in a latency graph.
@davkal does that make sense?