UTF-8: Add partial support for parsing UTF-8 metric and label names#13271
UTF-8: Add partial support for parsing UTF-8 metric and label names#13271beorn7 merged 1 commit intoprometheus:mainfrom
Conversation
beorn7
left a comment
There was a problem hiding this comment.
Generally looks good. Mostly nitpicking in the comment and requesting more test cases.
I don't really feel I'm the right person to review this because I am not familiar at all with the generated parser. I might also ask dumb questions in my comments.
Let's try to find a reviewer that knows those things better.
There was a problem hiding this comment.
Warning: I'm not familiar with the lexer syntax. I'll just rubberstamp this one if the results look OK to me. Experts, please chime in in time to stop me.
| {"http.status",q="0.9",a="b"} 8.3835e-05 | ||
| {"http.status",q="0.9",a="b"} 8.3835e-05 | ||
| {q="0.9","http.status",a="b"} 8.3835e-05 | ||
| {"go.gc_duration_seconds_sum"} 0.004304266` |
There was a problem hiding this comment.
Maybe add a few test cases with "weirder" characters:
- Multibyte characters like "ö" or "€".
- Whitespace in the name.
- An escaped
"in the name. {and}as part of the name.
Also, what about quoted label names? Including label names with all the weird characters above.
There was a problem hiding this comment.
oo, found some bugs, thanks :)
There was a problem hiding this comment.
Same flag here as for the OpenMetrics lexer file: I don't really know the lexer syntax.
model/textparse/openmetricsparse.go
Outdated
| return val, nil | ||
| } | ||
|
|
||
| func (p *OpenMetricsParser) verifyMetricName(start, end int) error { |
There was a problem hiding this comment.
Same question here as below for the corresponding method in promparse.go.
There was a problem hiding this comment.
Apparently you removed verifyMetricName from promparse.go, but left it in here. But this should go, too, shouldn't it?
|
I believe the Go tests should pass if you run |
fixed |
|
yes, another parser that will have to be fixed is the typescript one -- there is a task for that but I will create a proper issue to raise visibilitiy |
| label_matcher : IDENTIFIER match_op STRING | ||
| { $$ = yylex.(*parser).newLabelMatcher($1, $2, $3); } | ||
| { $$ = yylex.(*parser).newLabelMatcher($1, $2, $3); } | ||
| | string_identifier match_op STRING |
There was a problem hiding this comment.
do we need string_identifier match_op error rule? or is the parser error useful without it already?
There was a problem hiding this comment.
would that be like
{"x"=} ?
There was a problem hiding this comment.
ok yes, adding that makes the error a little better
|
test failure is I think unrelated but I'll dig in |
|
Hm this test failure is a bad sign -- there have been multiple places where people are testing the value of the negotiated format against the string constants. This is Bad because there are now many different ways the formats can look and be equivalent. How do we prevent people from doing these bad string comparisons? make the string constants non-exported? |
|
fuzz error looks concerning, I'm trying to figure out how to run that test myself |
|
I was able to debug the fuzz error, fixing |
|
got it |
beorn7
left a comment
There was a problem hiding this comment.
Looks good now with fixed fuzzing and extended tests, as far as I can say.
See comments for minor things, otherwise I'd say this is good to go. Last chance for others to chime in!
go.mod
Outdated
| github.com/prometheus/client_golang v1.18.0 | ||
| github.com/prometheus/client_model v0.5.0 | ||
| github.com/prometheus/common v0.46.0 | ||
| github.com/prometheus/common v0.46.1-0.20240206181200-773d5664eb8d |
There was a problem hiding this comment.
Let's not pin to specific commits here. I have just tagged 0.47.0 in common, so you can use it now.
There was a problem hiding this comment.
I don't see the tag on github, do you need to push it?
There was a problem hiding this comment.
I have pushed it an hour ago, and even created a nice release.
Maybe Github is once more very eventually consistent?
https://github.com/prometheus/common/releases/tag/0.47.0
model/textparse/openmetricsparse.go
Outdated
| return val, nil | ||
| } | ||
|
|
||
| func (p *OpenMetricsParser) verifyMetricName(start, end int) error { |
There was a problem hiding this comment.
Apparently you removed verifyMetricName from promparse.go, but left it in here. But this should go, too, shouldn't it?
|
ah yes sorry |
|
Thank you. I think you can do the squashing and DCO'ing now, and then we can merge (barring any new objections from others). |
promql/promql_test.go
Outdated
| func TestEvaluations(t *testing.T) { | ||
| RunBuiltinTests(t, newTestEngine()) | ||
| } | ||
| // func TestEvaluations(t *testing.T) { |
This adds support for the new grammar of `{"metric_name", "l1"="val"}` to promql and some of the exposition formats.
This grammar will also be valid for non-UTF-8 names.
UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.
This does not update the go expfmt parser in text_parse.go, which will be addressed by prometheus/common#554.
Part of prometheus#13095
Signed-off-by: Owen Williams <[email protected]>
|
🎉 |

This PR adds support for the new grammar of
{"metric_name", "l1"="val"}to promql and some of the exposition formats. This grammar will also be valid for non-UTF-8 names. UTF-8 names will not be considered valid unless model.NameValidationScheme is changed.This PR does not update the go expfmt parser in text_parse.go, which will be addressed by prometheus/common#554.
Correction, this PR does update the promql parser.
Part of #13095