Skip to content

Conversation

@oilbeater
Copy link

@oilbeater oilbeater commented Dec 23, 2025

I got the idea from unnecessarily-nested-maps. Using a flat map reduces one hash lookup. For the CardManyToOne case, it also avoids creating inner maps, which reduces the extra memory overhead of map headers.

Benchmark results:

goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/promql
cpu: AMD EPYC 7B13
                                                                                                               │ /tmp/before.txt │           /tmp/after.txt           │
                                                                                                               │     sec/op      │   sec/op    vs base                │
JoinQuery/expr=rpc_request_success_total_+_rpc_request_error_total/steps=10000-8                                      5.770 ± 3%   5.577 ± 3%   -3.33% (p=0.029 n=10)
JoinQuery/expr=rpc_request_success_total_+_ON_(job,_instance)_GROUP_LEFT_rpc_request_error_total/steps=10000-8        6.945 ± 3%   5.881 ± 2%  -15.31% (p=0.000 n=10)
JoinQuery/expr=rpc_request_success_total_AND_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                   1.021 ± 1%   1.016 ± 2%        ~ (p=0.247 n=10)
JoinQuery/expr=rpc_request_success_total_OR_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                    2.275 ± 1%   2.241 ± 4%        ~ (p=0.105 n=10)
JoinQuery/expr=rpc_request_success_total_UNLESS_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                2.231 ± 3%   2.176 ± 2%   -2.44% (p=0.000 n=10)
geomean                                                                                                               2.907        2.768        -4.77%

                                                                                                               │ /tmp/before.txt │            /tmp/after.txt            │
                                                                                                               │      B/op       │     B/op      vs base                │
JoinQuery/expr=rpc_request_success_total_+_rpc_request_error_total/steps=10000-8                                    54.41Mi ± 0%   54.42Mi ± 1%        ~ (p=0.529 n=10)
JoinQuery/expr=rpc_request_success_total_+_ON_(job,_instance)_GROUP_LEFT_rpc_request_error_total/steps=10000-8    1885.44Mi ± 0%   54.15Mi ± 1%  -97.13% (p=0.000 n=10)
JoinQuery/expr=rpc_request_success_total_AND_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                 38.23Mi ± 1%   38.10Mi ± 1%        ~ (p=0.539 n=10)
JoinQuery/expr=rpc_request_success_total_OR_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                  38.48Mi ± 1%   38.24Mi ± 1%        ~ (p=0.753 n=10)
JoinQuery/expr=rpc_request_success_total_UNLESS_rpc_request_error_total{instance=~"0.*"}/steps=10000-8              38.42Mi ± 1%   38.42Mi ± 1%        ~ (p=0.955 n=10)
geomean                                                                                                             89.67Mi        44.00Mi       -50.93%

                                                                                                               │ /tmp/before.txt │           /tmp/after.txt            │
                                                                                                               │    allocs/op    │  allocs/op   vs base                │
JoinQuery/expr=rpc_request_success_total_+_rpc_request_error_total/steps=10000-8                                     562.7k ± 0%   562.7k ± 0%        ~ (p=0.054 n=10)
JoinQuery/expr=rpc_request_success_total_+_ON_(job,_instance)_GROUP_LEFT_rpc_request_error_total/steps=10000-8     20567.7k ± 0%   562.7k ± 0%  -97.26% (p=0.000 n=10)
JoinQuery/expr=rpc_request_success_total_AND_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                  306.9k ± 0%   306.9k ± 0%        ~ (p=0.087 n=10)
JoinQuery/expr=rpc_request_success_total_OR_rpc_request_error_total{instance=~"0.*"}/steps=10000-8                   306.9k ± 0%   306.9k ± 0%        ~ (p=0.926 n=10)
JoinQuery/expr=rpc_request_success_total_UNLESS_rpc_request_error_total{instance=~"0.*"}/steps=10000-8               306.9k ± 0%   306.9k ± 0%        ~ (p=0.896 n=10)
geomean  

Which issue(s) does the PR fix: None

Does this PR introduce a user-facing change?

NONE

@oilbeater oilbeater changed the title perf(promql): update matchedSigs to a flat map perf(promql): update matchedSigs to a flat map reduce memory by 97% for GROUP_LEFT Dec 23, 2025
@oilbeater
Copy link
Author

@ricardbejarano, can you help review this? I think the CI failures are not related to this pull request.

@ricardbejarano
Copy link
Contributor

Me?

@oilbeater
Copy link
Author

@bboreham it's a performance related PR. Could you help review it?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants