Skip to content

PromQL engine: Support automatic inclusion of info metric data labels.#14634

Closed
aknuds1 wants to merge 5 commits intoprometheus:mainfrom
aknuds1:feat/info-function-mvp-automatic
Closed

PromQL engine: Support automatic inclusion of info metric data labels.#14634
aknuds1 wants to merge 5 commits intoprometheus:mainfrom
aknuds1:feat/info-function-mvp-automatic

Conversation

@aknuds1
Copy link
Copy Markdown
Contributor

@aknuds1 aknuds1 commented Aug 9, 2024

Make the PromQL engine, when configured to do so, automatically include info metric data labels when expanding time series. This is intended as experimental functionality, although I haven't yet marked it as such (the PR is WIP after all).

See proposal Simplify joins with info metrics in PromQL for background. The automation in this PR should form the basis of the proposed info function, just with configurability to execute without explicit function calls in queries. I don't know for sure if this will be desirable UX in the end, but at least it's helpful during development.

This is especially useful for OpenTelemetry, since it enables automatically adding back (select) OTel resource attributes (encoded as target_info metric labels) to metrics (cheaper than promoting them to labels at ingestion time).

BenchmarkInfoFunction shows great performance gains vs. traditional join queries against info metrics (due to only fetching info metrics with matching join labels mostly):

$ benchstat join.txt info.txt                                                                                          
goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/promql
cpu: AMD Ryzen 9 3950X 16-Core Processor            
                                                                  │   join.txt   │              info.txt              │
                                                                  │    sec/op    │   sec/op     vs base               │
InfoFunction/Joining_info_metrics_with_other_metrics_example_1-32    7.013m ± 8%   1.858m ± 2%  -73.51% (p=0.002 n=6)
InfoFunction/Joining_info_metrics_with_other_metrics_example_2-32   12.814m ± 6%   1.907m ± 3%  -85.12% (p=0.002 n=6)
geomean                                                              9.480m        1.882m       -80.14%

                                                                  │   join.txt   │              info.txt               │
                                                                  │     B/op     │     B/op      vs base               │
InfoFunction/Joining_info_metrics_with_other_metrics_example_1-32   451.6Ki ± 0%   431.2Ki ± 0%   -4.50% (p=0.002 n=6)
InfoFunction/Joining_info_metrics_with_other_metrics_example_2-32   705.3Ki ± 1%   433.5Ki ± 0%  -38.54% (p=0.002 n=6)
geomean                                                             564.3Ki        432.4Ki       -23.39%

                                                                  │  join.txt   │              info.txt              │
                                                                  │  allocs/op  │  allocs/op   vs base               │
InfoFunction/Joining_info_metrics_with_other_metrics_example_1-32   3.474k ± 0%   3.272k ± 0%   -5.81% (p=0.002 n=6)
InfoFunction/Joining_info_metrics_with_other_metrics_example_2-32   5.463k ± 0%   3.299k ± 0%  -39.62% (p=0.002 n=6)
geomean                                                             4.357k        3.285k       -24.59%

TODO:

  • Fix instant queries involving rate
  • Don't try to augment info metrics

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Aug 13, 2024

I think as a PoC, this is interesting. It would be interested to feed this to a test audience to see how this flies.

However, I have concerns if this should ever be generally accessible, even behind a feature flag. In general, introducing a whole new configuration section on the PromQL level rubs me the wrong way.

Feature flags are already putting conditions to how a given PromQL query is executed, but at least those are fairly simple binary decisions. This config is quite complex and deep, and it will essentially change how PromQL works in a fairly invasive fashion. For exploring a feature, that's fine. But I don't want to live in a world where people look at a PromQL expression, and the next question they ask is if they could see the PromQL engine config used to evaluate the expression.

I'm ~fine with configuring scrapes so that info metrics can be described there. That's just changing the input data on ingestion. (But of course, for that to work, we need to implement storing this information in the TSDB, and that's a much bigger barrier.)

@bwplotka
Copy link
Copy Markdown
Member

bwplotka commented Aug 13, 2024

Nice! I like the discussion as well.

2c:

But I don't want to live in a world where people look at a PromQL expression, and the next question they ask is if they could see the PromQL engine config used to evaluate the expression.

  • I agree in principle, but our alternative is to have that "config" shipped via input data. But wouldn't this have similar result of "give me PromQL expression and also your scrape target /metrics text,so I know what are the target labels configured there?"
  • Sounds like the long term plan is to have "configuration" being not needed as the input data would give you that (e.g. improve protocols). However, it's fine to start with hardcoded thing and perhaps some hidden config.
  • Mechanisms looks fine, but do you mean you want to change your design to attach your labels by default with info function? That might be easy to use, but likely introduce overhead and potential surprises. It feels an explicit function might be better?

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Aug 13, 2024

But wouldn't this have similar result of "give me PromQL expression and also your scrape target /metrics text,so I know what are the target labels configured there?"

I think it's a reasonable thing to say "the outcome of your PromQL expression depends mostly on what's in your TSDB". That's why I think if there is ingestion config that changes the date ultimately stored in the TSDB, that's nothing new and expected. Adding an additional config is a different thing. (Feature flags are already like that, but they are simple and inherently meant for experimental things.)

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Aug 13, 2024

Mechanisms looks fine, but do you mean you want to change your design to attach your labels by default with info function? That might be easy to use, but likely introduce overhead and potential surprises. It feels an explicit function might be better?

Yeah, I am also in the "explicit is better than implicit" camp. I imagine the desire for "auto-adding" labels from info metrics comes from one problem with the "map all the labels on ingestion" approach: With the latter, you cannot change your mind later. If you realize you want to map more labels, you cannot easily apply that retroactively to historic data. But you can always change the query-time config and use it to query historic data.

@aknuds1 aknuds1 force-pushed the feat/info-function-mvp-automatic branch 13 times, most recently from 690a113 to 7756b14 Compare August 18, 2024 14:56
@aknuds1 aknuds1 force-pushed the feat/info-function-mvp-automatic branch from 7756b14 to 61ea790 Compare August 22, 2024 11:48
@aknuds1 aknuds1 changed the title WIP: PromQL engine: Support automatic inclusion of info metric data labels. PromQL engine: Support automatic inclusion of info metric data labels. Aug 22, 2024
@aknuds1 aknuds1 force-pushed the feat/info-function-mvp-automatic branch 5 times, most recently from 578cdc1 to 0c04eb6 Compare August 23, 2024 16:16
@ArthurSens
Copy link
Copy Markdown
Member

I share Beorn's and Bartek's concern here, I don't think a query language should give different results depending on configuration. I believe the main use-case for the info() function is to unblock folks coming from OpenTelemetry background and it's already set in stone that their info metric is target_info. It's fine to have this hardcoded.

I understand that this would unblock with other info metrics, mostly coming from kube-state-metrics, but I'd love to see this being solved with an approach that doesn't require query language configuration. Bonus points if we don't even need the info() metric at all, e.g. storing info as metadata and make metadata part of query results 😬

@aknuds1 aknuds1 force-pushed the feat/info-function-mvp-automatic branch 2 times, most recently from 8f0ee42 to 2602d4c Compare August 26, 2024 09:12
@aknuds1
Copy link
Copy Markdown
Contributor Author

aknuds1 commented Aug 26, 2024

I imagine the desire for "auto-adding" labels from info metrics comes from one problem with the "map all the labels on ingestion" approach: With the latter, you cannot change your mind later. If you realize you want to map more labels, you cannot easily apply that retroactively to historic data.

@beorn7 The second problem that arises is that when changing which OTel resource attributes to promote to labels, you also have a temporary doubling in series (i.e. old label set + new label set).

@aknuds1
Copy link
Copy Markdown
Contributor Author

aknuds1 commented Aug 26, 2024

I understand that this would unblock with other info metrics, mostly coming from kube-state-metric.

@ArthurSens actually, the point of this approach is rather the different UX of not having to call info in queries. It's been a user feedback, that they don't want to have to change their queries. Potentially, this PR could also be changed to only support target_info with instance/job as the identifying labels. The added flexibility through configuration isn't really the point.

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Aug 27, 2024

the point of this approach is rather the different UX of not having to call info in queries. It's been a user feedback, that they don't want to have to change their queries.

I think that's the core of what we are discussing here. Some users want to "automatically" change the result of their queries without actually changing the queries. This rubs some of us the wrong way, because now we make the outcome of a query depend on some external setting.

I think it is fine to "preprocess" the query in a wrapper, separate from core Prometheus. I assume the kind of users requiring "change the result of my query without changing the query" will work through some kind of UI that could do that for them.

To solve this within PromQL, maybe it could be piggybacked into the vaguely planned with feature, which currently is meant to just add additional label matchers to all selectors in their scope. But now we are talking about combining two not-yet-implemented features (info function and with), so maybe let's first see how the info function flies.

@aknuds1
Copy link
Copy Markdown
Contributor Author

aknuds1 commented Aug 27, 2024

To solve this within PromQL, maybe it could be piggybacked into the vaguely planned with feature

Thanks @beorn7! I had never heard of with.

@beorn7
Copy link
Copy Markdown
Member

beorn7 commented Aug 27, 2024

It is very vaguely planned. Hidden in a dev summit note from ages ago…

https://docs.google.com/document/d/11LC3wJcVk00l8w5P3oLQ-m3Y37iom6INAMEu2ZAGIIE/edit?pli=1#bookmark=id.m8eq3rv3yopu

aknuds1 and others added 5 commits August 29, 2024 17:34
This MVP of info simplifies the implementation by assuming:

* Only support for the target_info metric
* That target_info's identifying labels are job and instance

Signed-off-by: Arve Knudsen <[email protected]>
Co-authored-by: Augustin Husson <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
This MVP of info simplifies the implementation by assuming:

* Only support for the target_info metric
* That target_info's identifying labels are job and instance

Signed-off-by: Arve Knudsen <[email protected]>
@aknuds1
Copy link
Copy Markdown
Contributor Author

aknuds1 commented Oct 16, 2024

Closing PR, as it's not what the team would like to proceed with :)

@aknuds1 aknuds1 closed this Oct 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants