Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OTLP receiver: Allow colons in non-standard units #15710

Merged
merged 1 commit into from
Dec 23, 2024

Conversation

aknuds1
Copy link
Contributor

@aknuds1 aknuds1 commented Dec 21, 2024

Modify the classical mode OTel metric name translation logic so colons are allowed in non-standard units. On inspecting the code, I noticed that a mistake had been made previously in not allowing this character (i.e. the label name character set was applied, not the metric name character set).

Corresponding tests are added, plus one for the case where the OTel "per" unit is completely invalid.

@aknuds1 aknuds1 marked this pull request as ready for review December 21, 2024 17:36
@aknuds1 aknuds1 requested a review from ArthurSens December 21, 2024 17:51
Copy link
Member

@ArthurSens ArthurSens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ArthurSens
Copy link
Member

ArthurSens commented Dec 23, 2024

For another PR: Have we tried using something different than a regex to match allowed characters? Could something like this be more efficient?

if r >= 'a' || r <= 'z' || r >= 'A' || r <= 'Z' || r >= '0' || r <= '9' || r == ':' { ....

@ArthurSens ArthurSens merged commit 2ffaff8 into prometheus:main Dec 23, 2024
26 checks passed
@aknuds1 aknuds1 deleted the arve/otel-units branch December 23, 2024 20:32
@aknuds1
Copy link
Contributor Author

aknuds1 commented Dec 23, 2024

@ArthurSens I just know that Prometheus uses regexps to validate metric/label names. I suppose you could eventually show through a benchmark that it would be significantly faster through some alternative implementation.

@@ -120,18 +119,19 @@ func BuildCompliantName(metric pmetric.Metric, namespace string, addMetricSuffix
return metricName
}

var nonMetricNameCharRE = regexp.MustCompile(`[^a-zA-Z0-9:]`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting in passing: this (pulling the initialization out to start of program) should also be done for the other two regexp.MustCompile in this file. Check that relevant benchmarks exist, since they ought to be showing a performance hit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip @bboreham.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the changes, and am benchmarking to see if an effect can be seen.

Copy link
Contributor Author

@aknuds1 aknuds1 Dec 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened a PR with your suggested optimization @bboreham - simple, but effective.

@bboreham
Copy link
Member

Prometheus uses regexps to validate metric/label names.

Where? Not here.

@aknuds1
Copy link
Contributor Author

aknuds1 commented Dec 30, 2024

Where? Not here.

@bboreham nice find, I probably forgot that IsValidMetricName doesn't use actually use MetricNameRE, but SanitizeLabelName does on the other hand use a regexp.

@ArthurSens I guess the fact that IsValidMetricName is faster without a regexp indicates that performance could be improved by also not using regexps in the OTLP translation. Feel free to open a PR with benchmark numbers :)

@bboreham
Copy link
Member

bboreham commented Jan 1, 2025

SanitizeLabelName does on the other hand use a regexp.

True. SanitizeLabelName was only used in a few places in Service Discovery, prior to being taken up by the OpenTelemetry translator.

#11936 added SanitizeFullLabelName which does not use a regex. It has the unexpected (to me) behaviour of replacing a leading digit with a leading underscore.

@ArthurSens
Copy link
Member

ArthurSens commented Jan 3, 2025

@ArthurSens I guess the fact that IsValidMetricName is faster without a regexp indicates that performance could be improved by also not using regexps in the OTLP translation. Feel free to open a PR with benchmark numbers :)

Will do! Hopefully next week

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants