mixin: remote-write related alert severity should take HA setup into account #7176

@beorn7

Currently, the PrometheusRemoteStorageFailures and PrometheusRemoteWriteBehind alerts both have critical severity. However, especially with remote-write setups, many users run HA pairs (or groups) of Prometheus servers, and the remote-write receiver typically has some way of deduplicating the incoming samples. In that case, a single Prometheus replica having trouble with remote-write should only be a warning. The alert should be critical only if all members of the HA group have trouble.
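
To make the intended behavior concrete, here is a minimal Python sketch of the severity decision for one HA group. It is not the mixin's implementation: the function name `remote_write_severity`, the replica names, and the idea of passing a per-replica "failing" flag are all assumptions made for illustration. How replicas are grouped and how "having trouble" is detected is left open.

```python
from typing import Mapping


def remote_write_severity(replica_failing: Mapping[str, bool]) -> str:
    """Decide the alert severity for one HA group (hypothetical helper).

    replica_failing maps a replica name to True if that replica currently
    has remote-write trouble (failures or falling behind).
    Returns "none", "warning", or "critical".
    """
    if not replica_failing:
        # No replicas known for this group: nothing to alert on.
        return "none"

    failing = sum(1 for bad in replica_failing.values() if bad)

    if failing == 0:
        return "none"
    if failing == len(replica_failing):
        # Every member of the HA group is affected, so the receiver
        # gets no healthy copy of the samples.
        return "critical"
    # At least one replica still delivers samples; dedup on the
    # receiver side covers for the failing one(s).
    return "warning"
```

With this rule, a single non-HA server is a group of size one, so any trouble there is still critical, which matches the current behavior.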
