Use slices.Sort() instead of sort.Strings() or sort.Ints()#3690
Use slices.Sort() instead of sort.Strings() or sort.Ints()#3690
slices.Sort() instead of sort.Strings() or sort.Ints()#3690Conversation
This is partially true; as I said in the Prometheus PR.
|
bboreham
left a comment
There was a problem hiding this comment.
I haven't benchmarked the changes in this PR
I don't want to encourage this stance. Measuring is always better than assuming.
A quick search of the source code suggests there might not be an existing one covering any of these changes, which in itself is interesting. We should be interested in the performance of LabelNames, for instance.
I perhaps wouldn't have changed the tests; they aren't very performance-sensitive and it can be nice to have some diversity of algorithm between implementation and test. But then your lint rule wouldn't work.
| # Note that we don't automatically suggest replacing sort.Float64s() with slices.Sort() as the documentation for slices.Sort() | ||
| # at the time of writing warns that slices.Sort() may not correctly handle NaN values. |
There was a problem hiding this comment.
Do we sort any slices of float64?
There was a problem hiding this comment.
Not at present.
I think this changed with the Go 1.19 release - the release notes mention changing the algorithm for the |
|
Thanks for the feedback @pracucci and @bboreham.
Completely understandable, I felt a little uneasy doing this (and why I wanted to open this for discussion). Perhaps rather than my heavy-handed approach, we can target a handful of areas that are most likely to benefit from this and create benchmarks for them as part of this?
Happy to change this back for the tests.
I'm very open to feedback on this - it was my understanding that given the only change between the two is the removal of the interface call, using |
|
@charleskorn I would keep a very practical approach, benchmarking few of the changes in the hot path, but not all of them. If we don't find regressions on few of them, I would just move on and merge. I think this PR touches somewhat hot paths in the following functions. Do we have any benchmark touching any of those? |
580fb36 to
e15f350
Compare
|
tl;dr: for the four instances I benchmarked, we see improvements of anywhere from 2.2% to 16%, and none regressed. I'm happy to keep benchmarking the other changes, but the results are universally positive so far. These are some results from benchmarks that we already had or were easily able to be changed to cover some of the affected methods:
|
pracucci
left a comment
There was a problem hiding this comment.
Thanks @charleskorn for the benchmarks. I'm happy to move forward and merge it. Can you rebase it, please?
3577aad to
2fbb1b5
Compare
#3639 (comment), prometheus/prometheus#11054, and prometheus/prometheus#11318 all show substantial performance improvements when using slices.Sort over sort.Strings. I've also added a linting rule to catch usage of sort.Strings and sort.Ints in the future. (We don't currently use sort.Ints anywhere.) We could also replace sort.Float64s with slices.Sort, however, this requires more careful consideration as the documentation for slices.Sort warns that it may not handle NaN values correctly.
2fbb1b5 to
a91c415
Compare
…plars with different sets of labels.
a91c415 to
2a360cf
Compare
…fana#3690) * Use slices.Sort instead of sort.Strings or sort.Ints. grafana#3639 (comment), prometheus/prometheus#11054, and prometheus/prometheus#11318 all show substantial performance improvements when using slices.Sort over sort.Strings. I've also added a linting rule to catch usage of sort.Strings and sort.Ints in the future. (We don't currently use sort.Ints anywhere.) We could also replace sort.Float64s with slices.Sort, however, this requires more careful consideration as the documentation for slices.Sort warns that it may not handle NaN values correctly. * Remove unnecessary comment. * Rename benchmarks file. * Add benchmark for BinaryReader.LabelNames(). * Add benchmark for LabelValues(). * Modify mergeExemplarQueryResponses benchmark to exercise merging exemplars with different sets of labels.
What this PR does
This PR was sparked by #3639 (comment). It replaces all usage of
sort.Strings()withslices.Sort(), and adds a linting rule to enforce this in the future.I'm opening this as a draft for discussion because:
golang.org/x/exp/slicesis still considered an experimental package (although Prometheus is now using it widely)sort.Strings()withslices.Sort()has seen substantial improvements, see below)Performance
#3639 (comment) and prometheus/prometheus#11054 show substantial performance improvements when using
slices.Sort()oversort.Strings(). This improved performance is becauseslices.Sort()does not require the overhead of an interface call for every comparison. (If we were using Go 1.18 or earlier, we'd also benefit from the improved pdquicksort sorting algorithm inslices, but this has been implemented in thesortpackage as of Go 1.19.)Given the pattern of significant performance improvements in a number of scenarios, I haven't benchmarked the changes in this PR. If we do want to take a more measured approach, I'm very open to that feedback.
Future work
We could also replace
sort.Float64s()withslices.Sort(), however, this requires more careful consideration as the documentation forslices.Sort()warns that it may not handle NaN values correctly.Which issue(s) this PR fixes or relates to
(none)
Checklist
CHANGELOG.mdupdated - the order of entries should be[CHANGE],[FEATURE],[ENHANCEMENT],[BUGFIX]