OTLP to directly write to an interface which can hide storage details#16951
OTLP to directly write to an interface which can hide storage details#16951
Conversation
Signed-off-by: David Ashpole <[email protected]>
Signed-off-by: David Ashpole <[email protected]>
Signed-off-by: György Krajcsovits <[email protected]>
to also check metadata. Signed-off-by: György Krajcsovits <[email protected]>
also fix inconsistent classic histogram. Count was lower than sum of bucket values. Signed-off-by: György Krajcsovits <[email protected]>
Defer converting to whatever actual model/labels representation is used by storage. Prometheus will use whatever is set as build label. Downstream might keep slices for example when converting to something like remote write 1. Signed-off-by: György Krajcsovits <[email protected]>
85ac11c to
1c029e4
Compare
|
Benchmark at main(e35c09d) vs pr (4b89cd5) |
storage/remote/otlptranslator/prometheusremotewrite/metrics_to_prw_test.go
Outdated
Show resolved
Hide resolved
# Conflicts: # storage/remote/otlptranslator/prometheusremotewrite/helper.go
…ints We check buckets in the similar test for histograms. Signed-off-by: György Krajcsovits <[email protected]>
storage/remote/otlptranslator/prometheusremotewrite/combined_appender.go
Show resolved
Hide resolved
storage/remote/otlptranslator/prometheusremotewrite/combined_appender.go
Outdated
Show resolved
Hide resolved
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <[email protected]>
aknuds1
left a comment
There was a problem hiding this comment.
LGTM AFAICT at this point. I trust you've made sure it integrates well with Mimir :)
To be more inline with future direction in #17104 Also use some helper variables in appender tests to make it more readable in shorter. Signed-off-by: György Krajcsovits <[email protected]>
Thank you, I'll update the Mimir PR on Monday and test one more time. @bwplotka Can I get an approve from you as well? I'd like to merge this. We can later switch to whatever we agree on in #17104. But I don't want to be blocked on that and I've made the interface here as close to that as possible for now. |
To get prometheus/prometheus#16951 at b03fd1e7 Signed-off-by: György Krajcsovits <[email protected]>
Test build in: grafana/mimir#12652 |
I've checked out Mimir and ran I see data (floats, native histograms) and metadata as well. Resolving comments and merging. |
To get prometheus/prometheus#16951 at b03fd1e7 Signed-off-by: György Krajcsovits <[email protected]>
To get prometheus/prometheus#16951 Signed-off-by: György Krajcsovits <[email protected]>
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <[email protected]>
To get prometheus/prometheus#16951 Signed-off-by: György Krajcsovits <[email protected]>
Prepare for the rewrite of the OTLP translator upstream in prometheus/prometheus#16951 by adding an end-to-end test to check for regressions. Related to #12152. Also switch from raw JSON in the test to building the payload in code and marshaling to protobuf. Signed-off-by: György Krajcsovits <[email protected]>
…#12652) This PR updates mimir-prometheus to [`c2f03d6d14d2`.](grafana/mimir-prometheus@c2f03d6) Brings in prometheus/prometheus#16951. Changelog: OTLP: native support for OpenTelemetry metric start time to Prometheus metric created timestamp conversion, instead of converting to QuietZeroNaNs introduced in #10238. The configuration parameter `-distributor.otel-start-time-quiet-zero` is therefore deprecated and will be removed. Now supports start time for exponential histograms. This is a major rewrite of the endpoint in upstream Prometheus and Mimir. #12652 Implementation does away with the generated otlp translation code as we don't need to rewrite it for `mimirpb` anymore. The storage logic is behind an interface. Signed-off-by: György Krajcsovits <[email protected]> Co-authored-by: Arve Knudsen <[email protected]>
…prometheus#16951) * OTLP writer writes directly to appender Do not convert to Remote-Write 1.0 protocol. Convert to TSDB Appender interface instead. For downstream projects that still convert OTLP to something else (e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between OTLP decoding and TSDB Appender. This is the CombinedAppender that hides the implementation. Name is subject to change. --------- Signed-off-by: David Ashpole <[email protected]> Signed-off-by: György Krajcsovits <[email protected]> Signed-off-by: George Krajcsovits <[email protected]> Co-authored-by: David Ashpole <[email protected]> Co-authored-by: Jesus Vazquez <[email protected]> Co-authored-by: Arve Knudsen <[email protected]>
Signed-off-by: Naman-B-Parlecha <[email protected]> feat(histogram): Refactor ConvertNHCBToClassicHistogram and add unit tests Signed-off-by: Naman-B-Parlecha <[email protected]> feat(histogram): Add ConvertNHCBToClassic flag and refactor related types and tests Signed-off-by: Naman-B-Parlecha <[email protected]> Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 (prometheus#17151) * Replace gopkg.in/yaml.v2 with go.yaml.in/yaml/v2 * Upgrade to [email protected] --------- Signed-off-by: Arve Knudsen <[email protected]> OTLP to directly write to an interface which can hide storage details (prometheus#16951) * OTLP writer writes directly to appender Do not convert to Remote-Write 1.0 protocol. Convert to TSDB Appender interface instead. For downstream projects that still convert OTLP to something else (e.g. Mimir using its own RW 1.0+2.0 compatible protocol), introduce a compatibility layer between OTLP decoding and TSDB Appender. This is the CombinedAppender that hides the implementation. Name is subject to change. --------- Signed-off-by: David Ashpole <[email protected]> Signed-off-by: György Krajcsovits <[email protected]> Signed-off-by: George Krajcsovits <[email protected]> Co-authored-by: David Ashpole <[email protected]> Co-authored-by: Jesus Vazquez <[email protected]> Co-authored-by: Arve Knudsen <[email protected]> fix(textparse/protobuf): metric family name corrupted by NHCB parser (prometheus#17156) * fix(textparse): implement NHCB parsing in ProtoBuf parser directly The NHCB conversion does some validation, but we can only return error from Parser.Next() not Parser.Histogram(). So the conversion needs to happen in Next(). There are 2 cases: 1. "always_scrape_classic_histograms" is enabled, in which case we convert after returning the classic series. This is to be consistent with the PromParser text parser, which collects NHCB while spitting out classic series; then returns the NHCB. 2. "always_scrape_classic_histograms" is disabled. In which case we never return the classic series. Signed-off-by: György Krajcsovits <[email protected]> * refactor(textparse): skip classic series instead of adding NHCB around Do not return the first classic series from the EntryType state, switch to EntrySeries. This means we need to start the histogram field state from -3 , not -2. In EntrySeries, skip classic series if needed. Signed-off-by: György Krajcsovits <[email protected]> * reuse nhcb converter Signed-off-by: György Krajcsovits <[email protected]> * test(textparse/nhcb): test corrupting metric family name NHCB parse doesn't always copy the metric name from the underlying parser. When called via HELP, UNIT, the string is directly referenced which means that the read-ahead of NHCB can corrupt it. Signed-off-by: György Krajcsovits <[email protected]> Handle error gracefully for the `desymbolizeLabels` function in prompb/io/prometheus/write/v2/symbols.go (prometheus#17160) Signed-off-by: pipiland <[email protected]> --------- Signed-off-by: pipiland <[email protected]> Co-authored-by: pipiland <[email protected]> feat(histogram): Refactor NHCB to classic conversion with enhanced validation and update tests Signed-off-by: Naman-B-Parlecha <[email protected]> feat(histogram): Update NHCB to classic conversion logic and remove tests Signed-off-by: Naman-B-Parlecha <[email protected]> refactor(histogram): Update ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <[email protected]> test cases for new conversion Signed-off-by: Naman-B-Parlecha <[email protected]> refactor(histogram): Simplify label building in ConvertNHCBToClassicHistogram Signed-off-by: Naman-B-Parlecha <[email protected]>

Original PR: #16855
Original Title: OTLP to directly writes to storage.Appender
Main change by @dashpole . @krajorama adopting the PR while @dashpole is on vacation.
Remove intermediary OTLP -> PRW 1.0 step before writing to the appender. It also adds a CombinedAppender interface, which consolidates error handling, and makes it possible to re-use the library in Mimir or other storage backends, which currently uses it to translate to PRW 1.0.
Similar to #16827, but maintains existing code structure and unit tests.
Alternative to #16784.
Notable Changes:
otlp_without_metadata_appended_samples_totalinstead ofremote_write_without_metadata_appended_samples_total(this still exists just won't be used here).otlp_out_of_order_exemplars_totalreplaces "Error on ingesting out-of-order exemplars" warning log.appender.UpdateMetadatawith metadata. The unit is converted using theotlptranslatorlibrary.appender.AppendHistogramCTZeroSampleandappender.AppendCTZeroSamplewhen the created-timestamp-zero-ingestion feature is enabled.Benchmark summary: 36% less CPU, 38% fewer bytes, 19% fewer allocations
Which issue(s) does the PR fix:
Does this PR introduce a user-facing change?