Fix the prometheus_tsdb_sample_ooo_delta metric #11836

@codesome

What did you do?

Run Prometheus with out-of-order support enabled

What did you expect to see?

prometheus_tsdb_sample_ooo_delta to have the right buckets filled.

What did you see instead? Under which circumstances?

prometheus_tsdb_sample_ooo_delta is initialized with bucket boundaries expressed in seconds:

prometheus/tsdb/head.go

Lines 440 to 448 in f88a0a7

    Buckets: []float64{
        60 * 10,      // 10 min
        60 * 30,      // 30 min
        60 * 60,      // 60 min
        60 * 60 * 2,  // 2h
        60 * 60 * 3,  // 3h
        60 * 60 * 6,  // 6h
        60 * 60 * 12, // 12h
    },

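But Prometheus uses milliseconds as its timestamp unit, so observing the delta in milliseconds fills the wrong buckets: the largest boundary, 43200 (meant as 12h), is only 43.2 s when read as milliseconds, and any larger delta lands in the +Inf bucket. See: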

a.head.metrics.oooHistogram.Observe(float64(delta))

Until Prometheus starts using native histograms, TSDB can compromise on its timestamp-unit-agnostic design for this metric and observe float64(delta)/1000 instead.
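To illustrate the mismatch, here is a minimal standalone sketch using only the Go standard library. It does not use the real histogram type: `bucketIndex` and `oooBuckets` are hypothetical names for this example, and the lookup assumes the usual upper-inclusive ("le") bucket semantics. It compares where a 15-minute delta lands when observed in milliseconds versus after dividing by 1000.

```go
package main

import (
	"fmt"
	"sort"
)

// oooBuckets mirrors the upper bounds quoted above from tsdb/head.go, in seconds.
var oooBuckets = []float64{
	60 * 10,      // 10 min
	60 * 30,      // 30 min
	60 * 60,      // 60 min
	60 * 60 * 2,  // 2h
	60 * 60 * 3,  // 3h
	60 * 60 * 6,  // 6h
	60 * 60 * 12, // 12h
}

// bucketIndex is a hypothetical helper for this sketch. It returns the index
// of the first bucket whose upper bound is >= v, or len(bounds) when v only
// fits the implicit +Inf bucket.
func bucketIndex(bounds []float64, v float64) int {
	return sort.SearchFloat64s(bounds, v)
}

func main() {
	deltaMs := int64(15 * 60 * 1000) // a sample that is 15 minutes out of order

	// Observing the raw millisecond value overshoots every finite bucket.
	raw := bucketIndex(oooBuckets, float64(deltaMs))
	// Converting to seconds first places it in the 30 min bucket.
	scaled := bucketIndex(oooBuckets, float64(deltaMs)/1000)

	fmt.Println(raw == len(oooBuckets)) // true
	fmt.Println(scaled)                 // 1
}
```

With the raw value, every delta above 43.2 s ends up in +Inf, so the finite buckets carry almost no information; after the division, the boundaries match their comments.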


This issue is broken down from #11329
