tsdb.PostingsForMatchers: Optimize inverse matching#14144
tsdb.PostingsForMatchers: Optimize inverse matching#14144bboreham merged 7 commits intoprometheus:mainfrom
Conversation
849e850 to
3f323ec
Compare
6e0345a to
b1c1b9f
Compare
b7288c5 to
1625629
Compare
|
Putting into draft mode while doing some analysis, and discussing with @colega. |
1625629 to
80f991e
Compare
80f991e to
b5944c5
Compare
|
Update: Moving back to draft mode while I execute the benchmarks again, since I figured out another optimization (really good perf improvements so far). Another update: Benchmarks mostly look good (updated in PR description), except a few head |
b3ed0c3 to
8d9cae8
Compare
0bd3191 to
6ad0c67
Compare
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
902618e to
20f4b97
Compare
colega
left a comment
There was a problem hiding this comment.
The logic here LGTM, and it would also improve the performance with some other changes I'm considering, but I would consider adding a new explicit method instead of changing the behaviour of the existing one, to stay on the safe side.
Signed-off-by: Arve Knudsen <[email protected]>
cf450cd to
372126a
Compare
Thanks @colega, done! |
|
|
||
| res := vals[:0] | ||
| // If the match before inversion was !="" or !~"", we just want all the values. | ||
| // If inverse matching the empty string, we just want all the values. |
There was a problem hiding this comment.
To me this seems harder to understand than the original comment which spelled out the cases we are trying to handle:
// If the match before inversion was !="" or !~"", we just want all the values.
Is there a version better than both?
There was a problem hiding this comment.
// If the match before inversion was !="" or !~"", we just want all the values.
The old comment is misleading though isn't it, since the cases being inverted are =~"" and ="". That's why I thought my comment was more correct.
Is it better if I write something like the following?
// If the matcher being inverted is ="" or =~"", we just want all the values.There was a problem hiding this comment.
Yes, that is a lot better. I didn't detect that the previous comment was opposite to yours.
| return EmptyPostings() | ||
| } | ||
|
|
||
| postingsCount := 0 |
There was a problem hiding this comment.
Is "Count" a good word to use for a guesstimate?
Is zero a good guess for the non-nil case?
There was a problem hiding this comment.
Would you prefer "postingsCap"?
There was a problem hiding this comment.
"Cap" suggests it can't go any higher. How about postingsEstimate ?
There was a problem hiding this comment.
Applied your suggestion, PTAL.
| PostingsForLabelMatching(ctx context.Context, name string, match func(value string) bool) index.Postings | ||
|
|
||
| // PostingsForAllLabelValues returns a sorted iterator over all postings having a label with the given name. | ||
| // If no postings are found having at least one matching label, an empty iterator is returned. |
Signed-off-by: Arve Knudsen <[email protected]>
Signed-off-by: Arve Knudsen <[email protected]>
Same as prometheus#15427 but for the new method added in prometheus#14144 Instead of allocating each ListPostings one by one, allocate them all in one go. Signed-off-by: Oleg Zaytsev <[email protected]>
Same as #15427 but for the new method added in #14144 Instead of allocating each ListPostings one by one, allocate them all in one go. Signed-off-by: Oleg Zaytsev <[email protected]>
Since dot is matching newline now, `l=~".+"` is "any non empty label value", and prometheus#14144 added a specific method in the index for that so we don't need to run the matcher on each one of the label values. Signed-off-by: Oleg Zaytsev <[email protected]>
Since dot is matching newline now, `l=~".+"` is "any non empty label value", and #14144 added a specific method in the index for that so we don't need to run the matcher on each one of the label values. Signed-off-by: Oleg Zaytsev <[email protected]>
Simple follow-up to prometheus#13620. Modify `tsdb.PostingsForMatchers` to use the optimized tsdb.IndexReader.PostingsForLabelMatching method also for inverse matching. Introduce method `PostingsForAllLabelValues`, to avoid changing the existing method. The performance is much improved for a subset of the cases; there are up to ~60% CPU gains and ~12.5% reduction in memory usage. Remove `TestReader_InversePostingsForMatcherHonorsContextCancel` since `inversePostingsForMatcher` only passes `ctx` to `IndexReader` implementations now. Signed-off-by: Arve Knudsen <[email protected]>
…for-matcher [PERF] TSDB: Optimize inverse matching (prometheus#14144)
Simple follow-up to #13620. Just modify
tsdb.PostingsForMatchersto use the optimizedtsdb.IndexReader.PostingsForLabelMatchingmethod also for inverse matching.The performance is much improved for a subset of the cases; there are up to ~60% CPU gains and ~12.5% reduction in memory usage. Please consult benchmark statistics below.
Removing
TestReader_InversePostingsForMatcherHonorsContextCancelsinceinversePostingsForMatcheronly passesctxtoIndexReaderimplementations now.Benchmark results
PostingsForMatchers benchmark stats
labelValuesWithMatchers benchmark stats