
WAL watcher unable to catch up when backlogged #13471

@djudd

Description

What did you do?

I reported this in Slack, and while I don't have a simple reproduction, @cstyan confirmed that this looks like a bug: https://cloud-native.slack.com/archives/C167KFM6C/p1706219700271849

There's a scenario where, if the WAL watcher falls behind after startup, it becomes unable to catch up.

Specifically, the lastSegment variable in Run in watcher.go is never updated after the watcher starts. So once we have been caught up to the latest WAL segment at any point in a Prometheus process's lifetime, the tail parameter will be set to true, even if we subsequently fall behind again:

if err := w.watch(currentSegment, currentSegment >= lastSegment); err != nil && !errors.Is(err, ErrIgnorable) {
    return err
}

If tail is true, we'll only actually perform a read from the current WAL segment when we get a Notify event for a write, or when the every-15-second read timeout fires:
case <-readTicker.C:
    level.Debug(w.logger).Log("msg", "Watcher is reading the WAL due to timeout, haven't received any write notifications recently", "timeout", readTimeout)
    err := w.readAndHandleError(reader, segmentNum, tail, size)
    if err != nil {
        return err
    }
    // still want to reset the ticker so we don't read too often
    readTicker.Reset(readTimeout)
case <-w.readNotify:

If Notify events are infrequent, e.g. if they occur approximately once per new WAL segment, and the rate of new samples and new segments is roughly constant, then our rate of reading old records is roughly capped at the rate at which new WAL segments are written, so we gradually keep falling behind. We appear to have seen this behavior in production repeatedly; specifically, the symptom where, once the WAL watcher falls behind, it gets stuck well below the throughput it is able to achieve at startup.

This is similar to #13111 but isn't limited to the use of agent mode (although it does require a more specific load pattern).

I suspect a fix might be as simple as setting lastSegment = max(lastSegment, currentSegment) here:


... but that might have wider implications of which I'm not aware.

What did you expect to see?

I expected that the WAL watcher used by remote write would be able to catch up from a backlog that occurs between restarts (e.g. due to a blip on the receiver side) as quickly as it is able to catch up from a backlog at startup.

What did you see instead? Under which circumstances?

The WAL watcher throughput becomes a bottleneck on remote write and the backlog grows indefinitely.

System information

Linux 5.15.109+ x86_64

Prometheus version

prometheus, version 2.48.0 (branch: HEAD, revision: 6d80b30990bc297d95b5c844e118c4011fad8054)
  build user:       root@26117804242c
  build date:       20231116-04:35:21
  go version:       go1.21.4
  platform:         linux/amd64
  tags:             netgo,builtinassets,stringlabels

Prometheus configuration file

No response

Alertmanager version

No response

Alertmanager configuration file

No response

Logs

No response
