Native Summaries #16949

@bwplotka

Proposal

This needs a fuller proposal, an MVP, and a stronger motivation, but I wanted to officially start early discussion and potential work here.

The idea is to introduce native summaries -- a replacement for the "classic summaries" we have now, which are built from multiple series (one per quantile, plus _sum and _count). A native summary would be stored as a single series, which is more efficient and transactional.

PromQL could be adapted in a similar fashion to the native histogram PromQL syntax for consistency (also reusing the histogram_count and histogram_sum functions, see https://prometheus.io/docs/prometheus/latest/querying/functions/#histogram_count-and-histogram_sum). However, given the lower, mostly historical use of summaries in the ecosystem (see Considerations), it might be easier and sufficient to only emulate the classic view while storing native summaries under the hood (similar to the idea in #16948).
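To make the "emulate the classic view" option a bit more concrete, here is a minimal Go sketch. Everything in it is hypothetical and only for discussion: the `NativeSummary`, `QuantileValue` and `ClassicSample` types and the `ClassicView` function do not exist in Prometheus, and the actual sample layout would have to come out of the proposal. It only shows one native sample being expanded into the familiar `quantile`-labelled series plus `_sum` and `_count`.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
)

// QuantileValue pairs a target quantile (e.g. 0.99) with its current value.
// Hypothetical type, not part of Prometheus.
type QuantileValue struct {
	Quantile float64
	Value    float64
}

// NativeSummary is a hypothetical single-sample representation of a summary:
// count, sum and all quantiles travel together in one struct.
type NativeSummary struct {
	Count     uint64
	Sum       float64
	Quantiles []QuantileValue
}

// ClassicSample is one float sample belonging to one classic series.
type ClassicSample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// ClassicView expands one native summary sample into the classic layout:
// one series per quantile, followed by <name>_sum and <name>_count.
func ClassicView(name string, labels map[string]string, s NativeSummary) []ClassicSample {
	// Copy before sorting so the caller's slice is left untouched.
	qs := append([]QuantileValue(nil), s.Quantiles...)
	sort.Slice(qs, func(i, j int) bool { return qs[i].Quantile < qs[j].Quantile })

	out := make([]ClassicSample, 0, len(qs)+2)
	for _, q := range qs {
		// Each quantile series gets its own label set with an extra "quantile" label.
		ls := make(map[string]string, len(labels)+1)
		for k, v := range labels {
			ls[k] = v
		}
		ls["quantile"] = strconv.FormatFloat(q.Quantile, 'g', -1, 64)
		out = append(out, ClassicSample{Name: name, Labels: ls, Value: q.Value})
	}
	out = append(out,
		ClassicSample{Name: name + "_sum", Labels: labels, Value: s.Sum},
		ClassicSample{Name: name + "_count", Labels: labels, Value: float64(s.Count)},
	)
	return out
}

func main() {
	s := NativeSummary{
		Count: 120,
		Sum:   34.5,
		Quantiles: []QuantileValue{
			{Quantile: 0.5, Value: 0.21},
			{Quantile: 0.99, Value: 1.7},
		},
	}
	for _, c := range ClassicView("rpc_duration_seconds", map[string]string{"job": "api"}, s) {
		fmt.Println(c.Name, c.Labels, c.Value)
	}
}
```

Because every classic sample here is derived from the same `NativeSummary` value, the emulated view cannot mix quantiles, sum and count from different scrapes, which is the transactionality argument from the Motivation section.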

Motivation

Native histograms were created for many reasons, two of which were storage efficiency and transactionality. With the native histogram representation, you don't have potentially 30 series for one histogram but a single one, which brings substantial benefits for indexing and storage, despite the larger sample size (a struct instead of a float).

As for transactionality, having one series instead of 30 also guarantees that every part of Prometheus (and the ecosystem) sees all parts of a histogram (buckets, sum, count) at the same time. This is especially important for distributed systems with remote write and sharding, and for drift between querying and scraping when no isolation is possible (e.g. on Thanos).

The same efficiency and transactionality problems exist for classic summaries as well, and native summaries would solve them.

Considerations

  • Summaries are still widely used, although less than other metric types. We tend NOT to recommend them in practice, especially now that the improved native histograms exist. However, they still exist and are not deprecated (and there is no plan to deprecate them as of now).
  • Native histograms brought sparseness and exponential bucketing; no such dimension is planned for native summaries, which makes them easier to implement.

WDYT @krajorama @beorn7 @bboreham @roidelapluie @RichiH
