Skip to content

storage/remote: remote write 2.0 content negotiation#13498

Merged
alexgreenbank merged 11 commits intoremote-write-2.0from
alexg-content-negotiation-2
Apr 4, 2024
Merged

storage/remote: remote write 2.0 content negotiation#13498
alexgreenbank merged 11 commits intoremote-write-2.0from
alexg-content-negotiation-2

Conversation

@alexgreenbank
Copy link
Copy Markdown
Member

@alexgreenbank alexgreenbank commented Jan 30, 2024

Overview:

This PR adds Content Negotiation to the Remote Write 2.0 protocol. This allows a sending Prometheus server to use the newer version of the protocol if the receiving server supports it. It also implements the handling in the receiving server side. It builds on top of the WAL/metadata watcher work.

Spec: Draft Spec Link

Summary of changes:

  • The interface of queue_manager.WriteClient has been changed to allow the client side (which communicates with the receiving server) to know which format the data it is being passed is being sent.
  • We now provide the client side with the format (remote write 0.1.0 or remote write 2.0) and desired encoding (e.g. snappy, etc)
  • The HTTP client side also reports back any HTTP 400 or 406 errors so that the sending side can handle these and choose to re-encode a payload as a different version or encoding.
  • The most recent value of the X-Prometheus-Remote-Write-Version header is now exposed
  • TODO - Re-review this PR to improve summary of changes

Work remaining:

  • Learn to use git commit -s properly - rebase branch to satisfy DCO
  • Clean up lint errors
  • Re-review this draft PR and update summary of changes above
  • Review test coverage sending side
  • Review test coverage receiving side
  • Add test to cover new api endpoint
  • Update spec with details of statelessness and how a two-phased deployment can be used to both roll out remote-write 2.0 support and roll it back.
  • Revisit/discuss retry logic in queue_manager.go (see existing TODOs there)
  • Check whether hardcoded use of snappy compression in sendMetadataWithBackoff() can be made cleaner
  • Work through remaining TODOs
  • E2E testing improvements will be moved to a separate issue

@alexgreenbank
Copy link
Copy Markdown
Member Author

Moved lots of code around to keep boundaries of store client (in storage/remote/client.go) and the queue manager side. The store client will simply return a suitable error representing the 406 if received - this needs to be treated as a non-recoverable error (so no retry attempt is made) and then the queue manager needs to marshal the data in a supported protocol/compression combination.

Started to document the differences between the 1.x responses and the 2.x responses here: https://docs.google.com/document/d/1t6BjIM3ha9V_ZqeR01y2s6ebMTCW1hrJtvt7E0GeyTw/edit - this needs more work to ensure all edge cases are covered.

@alexgreenbank
Copy link
Copy Markdown
Member Author

(Will fix merge conflicts manually as some rework is required...)

Signed-off-by: Alex Greenbank <[email protected]>
@alexgreenbank alexgreenbank force-pushed the alexg-content-negotiation-2 branch from 35ac64f to 04929dc Compare March 19, 2024 17:16
@alexgreenbank
Copy link
Copy Markdown
Member Author

Force pushed a single clean commit of my changes that had proper signoff.

Alex Greenbank added 3 commits March 19, 2024 18:06
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
@alexgreenbank
Copy link
Copy Markdown
Member Author

OK, still have the linting errors to fix but that I'll work on that separately. It shouldn't stop the initial reviews and subsequent comments!

@alexgreenbank alexgreenbank marked this pull request as ready for review March 19, 2024 19:31
@alexgreenbank alexgreenbank changed the title storage/remote: remote write 2.0 content negotiation draft storage/remote: remote write 2.0 content negotiation Mar 20, 2024
Alex Greenbank added 2 commits March 20, 2024 16:14
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
// TODO(alexg) Do we want to include the first line of the message?
return ErrStatusBadRequest
case 406:
// Return an unrecoverable error to indicate the 406
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

406 is recoverable

Comment on lines +1629 to +1642
// Resend logic on 406
// ErrStatusNotAcceptable is a new error defined in client.go

// Work out what version to send based on the last header seen and the QM's rwFormat setting
// TODO(alexg) - see comments below about retry/renegotiate design
for attemptNos := 1; attemptNos <= 3; attemptNos++ {
lastHeaderSeen := s.qm.storeClient.GetLastRWHeader()
compression, rwFormat := negotiateRWProto(s.qm.rwFormat, lastHeaderSeen)
sendErr := attemptBatchSend(batch, rwFormat, compression, false)
if sendErr == nil || !(errors.Is(sendErr, ErrStatusNotAcceptable) || errors.Is(sendErr, ErrStatusBadRequest)) {
// It wasn't an error, or wasn't a 406 or 400 so we can just stop trying
break
}
// If we get either of the two errors (406, 400) we loop and re-negotiate
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should have a separate retry logic for this. The flip/flopping of the accepted protocol versions should be treated the same as any other recoverable error IMO. Especially since we intend to include the "accepted" protocol/encodings in the 406 response.

406 is recoverable so we should continue to retry as usual, avoiding dropping data as much as we can.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should just do the song and dance to switch to the client a different protocol verison/compression encoding right away if we get a 406, and return the recoverable error and allow the retry to happen as usual, OR if we notice we can't switch to something the receiver likes, we should return a non-recoverable error.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @cstyan here. And also two thoughts:

  • IMO what you described in the Rollout/rollback strategy section is good enough for this. I understand your point about being more robust in front of buggy servers, but there's a ton of bugs a server could have and I don't think this one is a particularly important to receive special handling.
  • Even if there are buggy servers, we expect this would only happen during rollouts, under some circumstances, and with enough bad luck. The flip/flopping should stop happening once the rollout finishes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And following this same line: negotiateRWProto could return an error if parties could not agree on a version, and treat that as unrecoverable.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, there's two kinds of "recoverable":

  • A 5xx response is considered "recoverable" in the current Prometheus codebase sense as the retry function will resend the same payload after a short delay, no concept of any repackaging/re-encoding
  • A 406 response (which we deem "recoverable" in the spec) may require creating a new payload (e.g. v1 instead of v2 or vice versa, and/or a different encoding) which is beyond what the existing attemptStore function does.

This is why I described the 406 as "non-recoverable" as it wasn't "recoverable" in the same way as a 5xx error.

The main problems I found with trying to make use of the same logic as the attemptSend() method are:

  • it would require passing in a whole load of parameters in order to have the information there to rebuild the payload
  • the two payload creation functions take slightly different sets of parameters
    so I opted to keep it higher in the stack.

I'll go back and take a look at this bearing in mind the comments above and see what could be done.

Comment on lines +2206 to +2216
var compressed []byte

switch compression {
case "snappy":
compressed = snappy.Encode(*buf, data)
if n := snappy.MaxEncodedLen(len(data)); n > len(*buf) {
// grow the buffer for the next time
*buf = make([]byte, n)
}
default:
return nil, highest, lowest, fmt.Errorf("Unknown compression scheme [%s]", compression)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could refactor this out into a function now

Comment on lines +207 to +208
// Attempt a HEAD request against a remote write endpoint to see what it supports.
func (c *Client) GetProtoVersions(ctx context.Context) (string, error) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should use a more descriptive name here based on the function actually doing the HEAD request, and maybe we only need the error return value?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

  • updateProtoVersions
  • probeServerForVersions
  • determineServerVersions
    ?

Also, would it make sense to make it private?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I shied away from sticking HEAD in the name as I didn't want to include implementation details. Will make it private, pick a better name and remove the return values as nothing actually uses them.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, renamed as probeRemoteVersions() and only returning the error string.

Copy link
Copy Markdown
Contributor

@npazosmendez npazosmendez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looking nice, left a few comments

remoteWrite20HeadRequests: prometheus.NewCounter(prometheus.CounterOpts{
Namespace: "prometheus",
Subsystem: "api",
Name: "remote_write_20_head_requests",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: we should end this in _total

Suggested change
Name: "remote_write_20_head_requests",
Name: "remote_write_20_head_requests_total",

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, we could also drop _20 and use label once the version distinction will be needed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #13921

Comment on lines +207 to +208
// Attempt a HEAD request against a remote write endpoint to see what it supports.
func (c *Client) GetProtoVersions(ctx context.Context) (string, error) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about

  • updateProtoVersions
  • probeServerForVersions
  • determineServerVersions
    ?

Also, would it make sense to make it private?

Comment on lines +1629 to +1642
// Resend logic on 406
// ErrStatusNotAcceptable is a new error defined in client.go

// Work out what version to send based on the last header seen and the QM's rwFormat setting
// TODO(alexg) - see comments below about retry/renegotiate design
for attemptNos := 1; attemptNos <= 3; attemptNos++ {
lastHeaderSeen := s.qm.storeClient.GetLastRWHeader()
compression, rwFormat := negotiateRWProto(s.qm.rwFormat, lastHeaderSeen)
sendErr := attemptBatchSend(batch, rwFormat, compression, false)
if sendErr == nil || !(errors.Is(sendErr, ErrStatusNotAcceptable) || errors.Is(sendErr, ErrStatusBadRequest)) {
// It wasn't an error, or wasn't a 406 or 400 so we can just stop trying
break
}
// If we get either of the two errors (406, 400) we loop and re-negotiate
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @cstyan here. And also two thoughts:

  • IMO what you described in the Rollout/rollback strategy section is good enough for this. I understand your point about being more robust in front of buggy servers, but there's a ton of bugs a server could have and I don't think this one is a particularly important to receive special handling.
  • Even if there are buggy servers, we expect this would only happen during rollouts, under some circumstances, and with enough bad luck. The flip/flopping should stop happening once the rollout finishes

// Attempt a HEAD request against a remote write endpoint to see what it supports.
func (c *Client) GetProtoVersions(ctx context.Context) (string, error) {
// If we are in Version1 mode then don't even bother
if c.rwFormat == Version1 {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the Client's field c.rwFormat used anywhere apart from here? I find it confusing that the configured version is stored inside the Client (and used only here) and also in the queue manager (used it for content negotiation).

If c.rwFormat is only used here, I would remove the field and let this probe happen regardless of the configured version.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, that would make it simpler. The whole point of a lot of my refactoring was to remove the need for the client have its own rwFormat attribute!

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed rwFormat from the client.

Comment on lines +1629 to +1642
// Resend logic on 406
// ErrStatusNotAcceptable is a new error defined in client.go

// Work out what version to send based on the last header seen and the QM's rwFormat setting
// TODO(alexg) - see comments below about retry/renegotiate design
for attemptNos := 1; attemptNos <= 3; attemptNos++ {
lastHeaderSeen := s.qm.storeClient.GetLastRWHeader()
compression, rwFormat := negotiateRWProto(s.qm.rwFormat, lastHeaderSeen)
sendErr := attemptBatchSend(batch, rwFormat, compression, false)
if sendErr == nil || !(errors.Is(sendErr, ErrStatusNotAcceptable) || errors.Is(sendErr, ErrStatusBadRequest)) {
// It wasn't an error, or wasn't a 406 or 400 so we can just stop trying
break
}
// If we get either of the two errors (406, 400) we loop and re-negotiate
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And following this same line: negotiateRWProto could return an error if parties could not agree on a version, and treat that as unrecoverable.

Copy link
Copy Markdown
Member

@bwplotka bwplotka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, looks generally great from first review iteration - some style nits and suggestions


var ErrStatusBadRequest = errors.New("HTTP StatusBadRequest") // 400

var ErrStatusNotAcceptable = errors.New("HTTP StatusNotAcceptable") // 406
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be epic to add quick comment on both that explains semantics of those?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments added.


httpResp, err := c.Client.Do(httpReq.WithContext(ctx))
if err != nil {
// We don't attempt a retry here
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean? That we retry somewhere else? We never retry? Let's be specific (or drop the comment, it might not help us). Let's drop it I would say (:

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now we're actually not doing anything with the return values from this function. We should IMO. If the user has configured their prometheus to use rw2.0 but the reciever doesn't respond correctly we should likely fail fast and have prometheus exit, or at least include a rw config option that allows them to choose to fallback to 1.0.

return "", err
}

// See if we got a header anyway
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't want to be picky, but all those comments would be better if they have a trailing dot 🙈

Suggested change
// See if we got a header anyway
// See if we got a header anyway.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think linting should actually fail here 🤔

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All comments were updated to fit existing style.

case 400:
// Return an unrecoverable error to indicate the 400
// This then gets passed up the chain so we can react to it properly
// TODO(alexg) Do we want to include the first line of the message?
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, would be nice

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we could employ more structured response with what series exactly failed.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #13921

remoteWrite20HeadRequests: prometheus.NewCounter(prometheus.CounterOpts{
Namespace: "prometheus",
Subsystem: "api",
Name: "remote_write_20_head_requests",
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, we could also drop _20 and use label once the version distinction will be needed.

h := &writeHeadHandler{
logger: logger,
rwFormat: rwFormat,
remoteWrite20HeadRequests: prometheus.NewCounter(prometheus.CounterOpts{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
remoteWrite20HeadRequests: prometheus.NewCounter(prometheus.CounterOpts{
remoteWrite20HeadRequests: promauto.With(reg).NewCounter(prometheus.CounterOpts{

then you don't need must register below.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets avoid using promauto unless we have a reason to use it over passing a registery

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in #13921

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing wrong with using promauto and the default registry. In fact, we should be using it by default.

Where is this "avoid promauto" recommendation coming from?

}

func (h *writeHeadHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
// Send a response to the HEAD request based on the format supported
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be comment for ServeHTTP instead (on top not below)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, fixing in next PR: #13921

Alex Greenbank and others added 5 commits April 2, 2024 14:51
Signed-off-by: Alex Greenbank <[email protected]>
Co-authored-by: Bartlomiej Plotka <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
Signed-off-by: Alex Greenbank <[email protected]>
@alexgreenbank alexgreenbank merged commit ad77987 into remote-write-2.0 Apr 4, 2024
@alexgreenbank alexgreenbank deleted the alexg-content-negotiation-2 branch April 4, 2024 14:16
@alexgreenbank
Copy link
Copy Markdown
Member Author

Gah. Keyboard focus on just the wrong button when hitting return...

@alexgreenbank alexgreenbank restored the alexg-content-negotiation-2 branch April 4, 2024 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants