remote_write: return 422 when native histograms are sent but disabled; add tests#17196
remote_write: return 422 when native histograms are sent but disabled; add tests#17196shibo911 wants to merge 2 commits intoprometheus:mainfrom
Conversation
cc2a9a5 to
64db6ad
Compare
|
|
Is this compatible with #17197 ? |
Notes:
So, aside from adding that single v1 case when #17197 merges, the behavior introduced in this PR is fully compatible. PTAL @roidelapluie |
storage/remote/write_handler.go
Outdated
| if errors.Is(err, storage.ErrNativeHistogramsDisabled) { | ||
| // Receiver supports protocol, but cannot process native histograms. | ||
| nhDisabled = true | ||
| h.logger.Error("Out of order histogram from remote write", "err", err.Error(), "series", ls.String(), "timestamp", hp.Timestamp) |
There was a problem hiding this comment.
| h.logger.Error("Out of order histogram from remote write", "err", err.Error(), "series", ls.String(), "timestamp", hp.Timestamp) | |
| h.logger.Error("Native histograms not enabled", "err", err.Error(), "series", ls.String(), "timestamp", hp.Timestamp) |
There was a problem hiding this comment.
I'm not sure we want to log this as error. The feature may be disabled on purpose, so maybe debug level is better.
There was a problem hiding this comment.
- Done. Updated the log to:
-
message: “Native histograms not enabled” -
level: debug
-
- Location: storage/remote/write_handler.go, in appendV2() under the ErrNativeHistogramsDisabled branch.
- Behavior unchanged: we still mark nhDisabled = true, include the error in badRequestErrs, and return HTTP 422 for v2 with partial-write headers (0 histograms written).
CC @krajorama
|
|
||
| switch { | ||
| case err == nil: | ||
| case errors.Is(err, storage.ErrNativeHistogramsDisabled): |
There was a problem hiding this comment.
for completeness, please add test
There was a problem hiding this comment.
-
Added a test covering the OTLP path:
- Test name: TestOTLPWriteHandler_NativeHistogramsDisabled
- File: storage/remote/write_test.go
- It builds an OTLP request with histogram data, injects ErrNativeHistogramsDisabled via the mock appendable, and asserts HTTP 422.
-
We already had coverage for RW v1 and v2:
- TestNativeHistogramsDisabled_V1Message (v1 returns 422)
- TestNativeHistogramsDisabled_V2Message (v2 returns 422, with samples/exemplars accepted and headers reflecting 0 histograms)
CC @krajorama
0b684ac to
47302d5
Compare
…; add tests (prometheus#17181) Signed-off-by: shibo911 <[email protected]>
… 422 test Signed-off-by: shibo911 <[email protected]>
47302d5 to
4a55466
Compare
bwplotka
left a comment
There was a problem hiding this comment.
Thanks!
I'd argue 422 for this case is an overkill, why not 400? Otherwise great!
| switch { | ||
| case errors.Is(err, storage.ErrNativeHistogramsDisabled): | ||
| // Receiver supports the protocol, but cannot process native histograms. | ||
| http.Error(w, err.Error(), http.StatusUnprocessableEntity) |
There was a problem hiding this comment.
422 is weird, quite rare. You could argue 422 for other cases too (e.g. too old samples, someone didn't configure out of order etc).
Why not invalid request?
There was a problem hiding this comment.
Logically, 422 makes more sense. The request is correct but can not be processed.
| return samplesWithoutMetadata, 0, nil | ||
| } | ||
| // TODO(bwplotka): Better concat formatting? Perhaps add size limit? | ||
| if nhDisabled { |
There was a problem hiding this comment.
Again, an odd snowflake IMO, just StatusBadRequest would be fine (simpler code)
|
I am fine with both 400 and 422. Semantically 422 is better but we can be consistent and just reply 400. |
I'm fine with either, but this points out that we should probably update the v2 spec to include details about what error code the receiver should return when it wants to reject data like this. We already have conditions for "receiver may reject", for example if the receiver requires that metadata be present. This is another "may reject" case IMO, and the decision about which code to return in such cases should be documented in the spec. |
|
Hello from the bug scrub! @cstyan @bwplotka @roidelapluie has there been any decision so that we can get this PR unblocked? |
|
Let's align on 400 cc @shibo911 |
|
We have other cases when something is not supported, so we would need to return 422 for those too. Then if we use 422 and make that a pattern, ideally we change the spec. So... I'd vote for 400 for now. |
Which issue(s) does the PR fix:
Does this PR introduce a user-facing change?
[BUGFIX] Remote Write: Return HTTP 422 Unprocessable Entity when native histograms are sent to a receiver with native histograms disabled. Applies to:
Previously these requests returned HTTP 500.