
storage/remote: compute highestTimestamp and dataIn at QueueManager level#17065

Merged
machine424 merged 2 commits into prometheus:main from machine424:queuelevel
Sep 1, 2025

Conversation

@machine424
Member

@machine424 machine424 commented Aug 20, 2025

Because of relabelling, an endpoint can select only a subset of the series that go through WriteStorage.

Having a highestTimestamp at WriteStorage level yields wrong values if the corresponding sample won't even make it to a remote queue.

Currently PrometheusRemoteWriteBehind is based on that, and would fire if an endpoint is only interested in a subset of series that take time to appear.

A "prometheus_remote_storage_queue_highest_timestamp_seconds" metric that only takes into account samples in the queue is introduced, and is used in PrometheusRemoteWriteBehind and in the dashboards under documentation/prometheus-mixin.

The same applies to samplesIn/dataIn: QueueManager knows best when to update those, namely when data is enqueued.

That makes dataDropped unnecessary, and thus helps simplify the logic in QueueManager.calculateDesiredShards()


  • add tests
  • validate changes under documentation/prometheus-mixin
  • run some benchmarks

Which issue(s) does the PR fix:

Does this PR introduce a user-facing change?

[ENHANCEMENT] Remote-write: Add `prometheus_remote_storage_queue_highest_timestamp_seconds` metric, which tracks the highest timestamp that was actually enqueued per queue, i.e. accounting for relabeling.
[ENHANCEMENT] Mixin: Replace the `prometheus_remote_storage_highest_timestamp_in_seconds` metric with the new `prometheus_remote_storage_queue_highest_timestamp_seconds` metric in dashboards and alerts, so that relabeling is properly accounted for.
[CHANGE] Remote-write: Deprecate the `prometheus_remote_storage_{samples,exemplars,histograms}_in_total` and `prometheus_remote_storage_highest_timestamp_in_seconds` metrics; see their respective descriptions for alternatives.

@machine424 machine424 marked this pull request as draft August 20, 2025 21:46
@machine424
Member Author

still needs a double check and some extra tests, but wdyt @cstyan?

@machine424 machine424 changed the title WIP: storage/remote: compute highestTimestamp and dataIn at QueueManager level fix(storage/remote): compute highestTimestamp and dataIn at QueueManager level Aug 21, 2025
@machine424 machine424 changed the title fix(storage/remote): compute highestTimestamp and dataIn at QueueManager level storage/remote: compute highestTimestamp and dataIn at QueueManager level Aug 21, 2025
@machine424 machine424 marked this pull request as ready for review August 21, 2025 13:33
@cstyan
Member

cstyan commented Aug 21, 2025

Great idea! 👍 Thanks for doing this. I think there are a few things we need to verify, but we can likely get rid of those other two values you've pointed out.

What we want users to be able to do is differentiate between "Prometheus is scraping samples and writing them to the WAL, but the WAL reader for remote write (per queue) is stuck for some reason" vs "this queue manager is seeing samples but is deciding to drop them all". I think we can still get that information in other ways, but we should verify.

As for the reduction in complexity of calculateDesiredShards: love it.

@machine424 machine424 marked this pull request as draft August 25, 2025 08:56
@machine424 machine424 force-pushed the queuelevel branch 2 times, most recently from 466cffc to 19fbc90 Compare August 25, 2025 11:11
@machine424
Member Author

machine424 commented Aug 25, 2025

Thanks Callum for taking a look.
Do you have some concrete scenarios in mind for “Prometheus is scraping samples and writing them to the WAL, but the WAL reader for remote write (per queue) is stuck for some reason”?

The most likely cases I can think of are the watcher failing, or the queue being full (because pending samples can’t be sent). Both of these are already covered by existing metrics that expose such issues.

So yes, you're right: we'll be trading some false positives (since we currently don't take relabeling into account) for some possible false negatives, but I think the new "shadow areas" can be clarified by the existing metrics.

That said, we can keep the metrics at the storage level for now, 4 series per Prometheus instance isn’t a big deal. We can always reconsider this in the future.

@machine424 machine424 marked this pull request as ready for review August 25, 2025 11:28
@machine424
Member Author

machine424 commented Aug 27, 2025

cc @cstyan if you can take another look.

@machine424 machine424 force-pushed the queuelevel branch 2 times, most recently from 8ae3d9d to 9779d29 Compare August 27, 2025 09:56
@cstyan
Member

cstyan commented Aug 27, 2025

The most likely cases I can think of are the watcher failing, or the queue being full (because pending samples can’t be sent). Both of these are already covered by existing metrics that expose such issues.

Yeah I think you're right, it's just been a long time since I last looked at the data we have here in any detail.

I think the top level "most recent scrape timestamp" via prometheus_remote_storage_highest_timestamp_in_seconds and "most recent sent timestamp" via prometheus_remote_storage_queue_highest_sent_timestamp_seconds was useful to differentiate between "scraping but not sending" vs a more generic "not scraping", but in theory we do have other proxies for discovering that like total scrape metrics.

I think we also have a proxy for "samples are being scraped and written to/read from the WAL, we're just choosing to drop all of them (via relabel) for queue X, and that's why you're not seeing data in your remote storage": the watcher's records-read metrics plus the queue's dropped-samples metric. The dropped-samples metric even has a label to differentiate between unexpectedly dropped samples and samples dropped due to relabelling.

I think we're okay here 🤔
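The two proxies discussed above can be sketched as queries. These are illustrative only: `prometheus_remote_storage_samples_dropped_total` is taken to be the queue-level dropped-samples metric referred to above, and the exact name of its reason-style label is not shown here.

```promql
# "Scraping but not sending": the storage-level highest scrape timestamp
# keeps advancing while a queue's highest sent timestamp lags behind.
  prometheus_remote_storage_highest_timestamp_in_seconds
- ignoring(remote_name, url) group_right
  prometheus_remote_storage_queue_highest_sent_timestamp_seconds

# "Intentionally dropping via relabel": WAL records are still being read,
# but drops dominate; filtering on the metric's reason label separates
# relabel drops from unexpected ones.
rate(prometheus_remote_storage_samples_dropped_total[5m])
```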

@machine424
Member Author

Yes, because what gets sent is what the watcher sends to the queues; that's what we need to be monitoring.
Great, we're on the same page then.

And to be even more prudent (even though there are no stability guarantees for metrics), I'll only mark them as deprecated, propose the alternatives, and postpone the removal to another PR.

…evel

Signed-off-by: machine424 <[email protected]>
…ams}_in_total and prometheus_remote_storage_highest_timestamp_in_seconds

Signed-off-by: machine424 <[email protected]>
@machine424
Copy link
Copy Markdown
Member Author

cc @cstyan
A lot of action on storage/remote recently, had to resolve conflicts 3 times :)

@machine424 machine424 merged commit c574303 into prometheus:main Sep 1, 2025
28 checks passed
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 2, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 2, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
openshift-merge-bot bot pushed a commit to openshift/prometheus that referenced this pull request Sep 4, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 10, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 10, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/prometheus that referenced this pull request Sep 15, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 15, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 17, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
openshift-cherrypick-robot pushed a commit to openshift-cherrypick-robot/prometheus that referenced this pull request Sep 18, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
machine424 added a commit to machine424/prometheus that referenced this pull request Sep 19, 2025
partial cherry-pick of prometheus#17065
to make it easier to backport to older versions.

the new metric is "prometheus_remote_storage_queue_highest_timestamp_seconds"
bwplotka added a commit that referenced this pull request Oct 28, 2025
This reverts commit c574303, reversing
changes made to 2cbeef6.