perf: Make A* tie break on lower h-values#813
perf: Make A* tie break on lower h-values#813Dietr1ch wants to merge 2 commits intopetgraph:masterfrom
Conversation
There was a problem hiding this comment.
I figured that adding Sub<M, Output = M> would be ok, but I guess I can move the extra requirement into A*.
Adding this was needed to recover g as f-g
There was a problem hiding this comment.
Moving it to A* will also be a breaking change
There was a problem hiding this comment.
It's a smaller breaking change though. I moved it to scope down the breakage, but I feel that a Measure should be Sub<Self, Output = Self>.
I'm not sure if these breakages are meaningfully different for the library though. It might not be worth making the distinction and having the intermediate breakage.
There was a problem hiding this comment.
In the comment right below this I already stated on how a breaking change might affect the merger of this PR. As also stated there, I would suggest trying to solve the problem without an API change (that is using addition instead of Sub). Unfortunately, the "size" of the breaking change does not matter, as any breaking change counts as breaking and will thus be delayed since a major release would be necessary.
The situation currently is, that probably 0.9 there are gonna be a lot of changes to the current internal implementation of things ( in particular to a lot of traits ) and thus, I am not sure whether this change will even be applicable anymore. Hence, I would either recommend to try to implement it now, or pause the PR until then and see how it fits into the possibly new design with 0.9 :)
There was a problem hiding this comment.
And where is this 0.9 version being drafted? I see the next branch is stale and there's no other branch that stands out.
There's a hack to implement this without Sub nor Neg a the expense of the heap entries being larger that should work. Should I just go with that and leave a note about the possible data trim?
There was a problem hiding this comment.
Yes, I would second the solution you suggested :) At least if the benchmarks show that it is actually worth it ! Otherwise we'll have to just pause this PR and see until 0.9 is around the corner.
Regarding the branches: The next branch is supposed to be the reworked one. As you correctly pointed out, currently it is not being worked on and is thus behind the master branch because the main author/maintainer took a pause and said they'd resume work in the course of this year :)
There was a problem hiding this comment.
It's definitely worth it, but yes, you shouldn't take my word for granted. It'd be nice to have more benchmarks available. I'll try to come up with something different, yet simple, but I'm past my detouring quota already, so it might take me some time.
|
First off, thanks for taking the time to create a PR 🦕 Okay, well changing the Measure trait / the trait bounds on A* would be a breaking change and thus move the adoption of this change to earliest 0.9. With 0.9, the trait system might have changed drastically, so one would have to see how this change might fit in. Would there possibly be a way to express the subtraction with the current definition of Measure (i.e. with addition)? I am thinking about something like keeping track of all three relevant values f,g,h, or only g and h somehow, since we can construct f from both of them? Also the Partial Ord trait bound on K seems redundant since Measure is a subtrait of PartialOrd already. Additionally, we usually call the benchmark files according to the algorithms. I am kind of surprised there is no file for A* yet, but please rename the file accordingly :) Moreover, since there are not other benchmarks yet, I would really like 1/2 more benchmarks which might not be as biased towards the optimization. Feel free to look at something like Dijkstra benchmarks / the build_graph function in benches/common/factories.rs to create some sufficiently large graphs (of course you'd have to check which nodes are connected and sufficiently far away from each other). At last, please rename your PR to fit the conventional commit/PR format. |
|
Since the algorithm has a fairly large number of parameters, it already makes sense to switch to instance implementation. There have already been such proposals, for example #359. It won't help to get rid of the breaking changes, but we could collect all the suggestions about A* and prepare them to be combined for the new major release. |
Good point, I haven't thought about combining the changes with previous efforts 👍 |
I mean the approach to describing an algorithm through a structure. For example, in #359: pub struct AstarInstance<G, F, H, K, IsGoal>
where
G: IntoEdges + Visitable,
IsGoal: FnMut(G::NodeId) -> bool,
G::NodeId: Eq + Hash,
F: FnMut(G::EdgeRef) -> K,
H: FnMut(G::NodeId) -> K,
K: Measure + Copy,
{
graph: G,
is_goal: IsGoal,
edge_cost: F,
estimate_cost: H,
estimate_costs: HashMap<G::NodeId, K>,
}
All non-breaking optimizations can indeed be included in the upcoming minor releases, but so far it seems to me that the most "interesting" thing is still breaking. I will be glad to make A* roadmap after the release of |
|
Ah, that makes sense. Thanks for the explanation 🙏 That kind of roadmap sounds great :) I actually think that this optimization could be implemented without changing the API, no? As I stated above, instead of using subtraction, one could use addition. There would be some overhead, but I'd think that it is negligible in comparison to the speedup :) If I have time, I'll try to work on an example implementing that change, including some benchmarks 🌞 In any case, I agree that we could look at actually incorporating such optimizations after 0.8.2 👍 |
07a7efa to
0579282
Compare
|
The implementation needs to rank with either Alternatively, you could rank with Oh, BTW having a structure representing a run of A* allows to implement My own experimental implementation[^0] looks like this, struct AStarSearch {..}
impl AStarSearch {
pub fn new(..) -> Self {..}
pub fn find_next_goal(&mut self) -> Option<Path> {..}
#[cfg(feature = "stats")]
pub fn stats() -> Stats {..}
}
impl Iterator for AStarSearch {
type Item = Path;
fn next(&mut self) -> Option<Self::Item> {
self.find_next_goal()
}
} |
This works around a breaking change. `K` should be `core::ops::Sub` to avoid storing data that could easily be derived. See the thread for how breaking compatibility would've delay this improvement into 0.9, - petgraph#813 (comment)
0579282 to
6351c9a
Compare
|
I implemented the non-breaking workaround since 0.9 doesn't have a release branch that can take the breaking change. |
|
Nice! Would you please incorporate the changes I requested in the comments above. That is, add more benchmarks than only the maze one. Also, please provide before/after benchmark result comparisons. In particular, a graph which is just a line would be interesting, since in that case the new version should loose the most performance with the unnecessary entries in the binary tree. Once that is done and the benchmark results look good, I'll review the changes once I've got time :) |
Well, given the current implementation uses 2 hashmaps it should still lose to the new implementation as the overhead around the keys is larger than having a larger value. Still interesting though, because that kind of graph stresses some of the internals the most, but the real optimisation comes from being significantly better at exploring the graph and not just faster at picking the next node. From my time doing research around this, measuring the number of nodes expanded was the best measure as time ends up being linear in them, but noisy and dependent on things that were harder to reproduce (machine (cpu, cache, mem and their timings), load, compiler, linker) and compare. |
|
Thinking about breaking APIs, is there a plan to break I'm asking because the other classic problems that are also widely used for benchmarking might require an abstract Graph representation to avoid materialising the entire state space.[0] The classic 15-puzzle shouldn't need building [0]: It seems that |
|
I am not sure this is the right place for such a discussion. Apart from that, I think that I don't quite understand what you mean exactly. If you would like to go into detail about this or make a suggestion please consider creating something like a conversation in the discussion tab 👍 I am glad to review your PR tho once you incorporate the changes I asked for above. Please consider turning this into a draft PR until then :) |
|
I moved the discussion about I think the PR is in a state that can me merged, so I'm not making it a draft. If you feel strongly about adding more problems and benchmarks I think |
|
Okay. In that case someone else will have to give their opinion I guess 👍 Maybe you want to chime in @starovoid ? My point still stands that the benchmarks to compare the performance of the new and old variant should cover different cases and not only the one where we know that it will favor the new implementation 👍 If the benchmarks show improvements in every case I would be glad to review the changes :) I also don't understand the argument of changing the graph type before adding more benchmarks, since if we implement this PR, it will mostly affect people using the graph type as it currently is and the new Astar (in particular most people probably don't implement their own graph types for virtualized/unmaterialized graphs, but instead use the types provided by petgraph). In that case it would be interesting to see exactly that combination, of how the current graph type performs in different scenarios with the modifications of Astar 👍 |
I said this thinking about adding more variety to the graphs. I can do the line graph, or a small road network, but to me the interesting graphs are the ones that are large enough to make the decision-making while exploring the graph more important than details like how quick is the hash table. Representing the classical 16-puzzle I mentioned would take about 50TB-300TB of memory depending on how "clever" (naughty) you get. It's completely unreasonable to try doing it without "abstract" graph support. |
This works around a breaking change. `K` should be `core::ops::Sub` to avoid storing data that could easily be derived. See the thread for how breaking compatibility would've delay this improvement into 0.9, - petgraph#813 (comment)
6351c9a to
46bceb8
Compare
|
Got benchmarks on the "line" graph (actually an aisle since it leads to slightly more ranking work in the heap). There's a 2x speedup on it, even though my change should've made things slightly slower, but this is just due to removing the ~redundant hashmap as I made the ranking adjustment. BTW, I also have an environment commit that adds a simple |
src/algo/astar.rs
Outdated
| let mut visit_next = BinaryHeap::new(); | ||
| let mut scores = HashMap::new(); // g-values, cost to reach the node | ||
| let mut estimate_scores = HashMap::new(); // f-values, cost to reach + estimate cost to goal | ||
| // A node -> (f, h, g) mapping. Should be (f, h) since g can be recovered. |
There was a problem hiding this comment.
| // A node -> (f, h, g) mapping. Should be (f, h) since g can be recovered. | |
| // A node -> (f, h, g) mapping. TODO: With a breaking change this should be (f, h) since g can be recovered if subtraction is available. |
Seems clearer
46bceb8 to
f5fe3d1
Compare
|
Oh, I screwed up by switching back and forth between laptop/workstation believing that my github remote was updated and ended up resolving the comments over an older commit that didn't do the ranking trick that avoids requiring |
a82a56b to
04d4070
Compare
04d4070 to
2b866b3
Compare
|
I re-ran the bench on my workstation using the script I had around. Same remarks apply.
|
|
Alright, well if nothing else changed, LGTM :) 👍 Thanks for taking the suggestions into consideration and all the work you put in so far 🦕 Gonna ping @starovoid , so he can say how to proceed |
|
Well, actually the only other thing I think we could do while we are at it, is add a quickcheck for It would go like right below the |
There's 2 strong arguments IMO against doing this.
That said, there's a weaker version of this, which is testing the heuristic along the optimal path retrieved, which comes almost for free since you already went through it. Along this path Proofs around heuristics often come from the ideas that allow you to have an heuristic in the first place, so I think are rarely needed once you see your heuristic is just an optimal solution in a relaxed version of your problem. If you really want to add this weaker check, I suggest you just add some |
|
Thanks for the response and the insights. I am not sure however, whether we are on the same page here. I am not suggesting to add another unit test. Instead I am suggesting to add a quickcheck like the ones in This would allow the algorithm (A*) to be tested on a multitude of random graphs instead of just the one from the unit test. I do see the point of possibly missing bugs when comparing with the triangle inequality since that should be given with A*, however it could still catch some bugs. However, we could also just compare it with Dijkstra in a Quickcheck setting as well. That would rely on Dijkstra working of course, but that seems like a reasonable assumption I think. Maybe I also misunderstand you but it sounds like you thought I'd suggest adding another unit test. Instead I suggested adding a quickcheck which has the great benefit of systematically testing pseudo-random graphs, which make testing edge-case graphs way easier. |
|
Hey, just to let you know. I would fork and pickup the PR myself in a few days and just implement the last suggestions regarding quickcheck to merge your work, since I think that it is very valuable :) |
|
Closed in favor of #882 |
This PR continues the work from #813. The code there has already been reviewed and the only addition are two quickchecks in quickcheck.rs. So the only part of the code that would need to be reviewed is the changes in quickcheck.rs. --------- Co-authored-by: Dietrich Daroch <[email protected]>
This implements the trick from #806 and adds an empty 2d maze benchmark
Update: This dropped original breaking change of making cost be
core::ops::Sub, so it can be merged earlier.