-
-
Notifications
You must be signed in to change notification settings - Fork 3.5k
Kruskal MST early break for efficiency #8296
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hi! This PR adds functionality but I don't have permissions to add a label. |
|
The CI doc build test failure is likely to be unrelated. We've had a couple other PRs recently with this same error. Thanks for this!! |
|
I think it's better to replace the current Kruskal implementation than add a new one that is faster. I don't see the value of having both |
|
Sure thing @amcandio, fixed in the latest commit. |
|
I implemented the requested changes in the latest commit @amcandio. Thanks for the feedback! |
|
Would it be possible to merge this change? If not, what are the fixes needed? Thanks! |
I think this is the main comment not addressed. Are all these changes needed for speed? Which are and which are for other reasons (and what other reasons)? |
|
I ran some ablations based on your feedback:
Average speedup vs upstream_main (GNM n=3k/5k/8k/10k/12k, max/min, 5 seeds × min-of-2 repeats)
Based on this, it seems like the speedup comes from all components, and not just the early break. |
Hey Ori, sorry I missed this reply! Would you mind sharing the different versions for |
531b15e to
a59c708
Compare
|
Hi,
I re-benchmarked every variant with the current version and the deduped (lines 205-226). I'm using a different machine so I couldn't recreate the previous benchmarks exactly; we see slightly different speedup times:
The current version on this branch is the fastest, variant, i.e.: upstream_main+early_break+custom_dsu (deduped). |
|
Interesting, so speed up comes from using early stop and doing path halving in union-find data structure. I missed the path halving thing before! I'm +1 on and adding the early break in this PR. But imho the union-find optimizations should be added in the specific module https://networkx.org/documentation/stable/_modules/networkx/utils/union_find.html. This way we avoid code duplication and we let other algorithms benefit from it. It's probably better doing this in a separate PR. |
93322f2 to
cbe7bbb
Compare
|
Done! |
amcandio
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I approve this PR. Can you also update the description to reflect current code changes and the path halving opportunity for union find?
dschult
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused. This looks like a branch with only the early exit code.
Can you check if this is what you want for the current PR?
|
I updated the title + description now to better reflect the current state of the PR. |
|
Did you press "save" after changing the title? It looks like the title is still the same one. Otherwise this PR looks good. :) |
|
You're right, I fixed the title now. Thanks again. |
rossbar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was hard to digest the original benchmark results as they were tied to the original proposal and not re-runnable. I went ahead and added minimum_spanning_tree to the benchmark suite (see #8403) which clearly demonstrates the benefits of early termination. Here's what I see on my machine with:
asv continuous --bench UndirectedAlgorithmBenchmarks.time_minimum_spanning_tree_kruskal main feat/kruskal-opt-mst
asv compare main feat/kruskal-opt-mstBenchmark results
All benchmarks:
Change Before [cb824d7] After [cbe7bbb] <feat/kruskal-opt-mst> Ratio Benchmark (Parameter) 32.4±0.02ms 32.5±0.06ms 1 benchmark_algorithms.UndirectedAlgorithmBenchmarks.time_minimum_spanning_tree_kruskal('Drug Interaction network') 478±3μs 434±20μs 0.91 benchmark_algorithms.UndirectedAlgorithmBenchmarks.time_minimum_spanning_tree_kruskal('Erdos Renyi (100, 0.1)') - 1.58±0.02ms 948±8μs 0.6 benchmark_algorithms.UndirectedAlgorithmBenchmarks.time_minimum_spanning_tree_kruskal('Erdos Renyi (100, 0.5)') - 2.70±0.01ms 1.47±0ms 0.54 benchmark_algorithms.UndirectedAlgorithmBenchmarks.time_minimum_spanning_tree_kruskal('Erdos Renyi (100, 0.9)') The early termination makes a more and more significant difference as the density of the graph increases, as expected. This LGTM, thanks @oripress @dschult and @amcandio for the thoughtful review!
|
Very cool, thanks for the benchmarks @rossbar ! |
Adds an early-exit (“break”) guard to the Kruskal loop in the MST routines so the algorithm stops as soon as the spanning tree is complete.
We note that further optimizations are possible — for example a path-compression / union-by-rank version of the DSU (“Disjoint Set Union”) structure (commonly used in Kruskal) could give further speed-ups.
However, to keep scope minimal and avoid duplicating code, we defer the DSU overhaul and other sort-tuning optimisations to a future PR.
Original PR Description:
This PR adds an optional optimized backend for minimum/maximum spanning edges:
kruskal_mst_edges_opt, available via algorithm="kruskal_opt" in
minimum_spanning_edges / maximum_spanning_edges.
The PR is based off AlgoTuner generated code using GLM 4.5, see the full trajectory here.
Overall geometric-mean speedup across groups: 1.23×
Note: AlgoTune shows an even greater speedup, when running on larger graphs.
Tests:
Added: networkx/algorithms/tree/tests/test_mst_kruskal_opt.py
Cases cover Graph/MultiGraph, min/max, ignore_nan, partition semantics, and parity (edge count + total weight) vs kruskal.
Full test suite passes locally.