Use faster method for calculating distances.#5041
Conversation
3e34b2c to
452eed7
Compare
src/util/coordinate_calculation.cpp
Outdated
| const auto lat1 = static_cast<double>(util::toFloating(coordinate_1.lat)); | ||
| const auto lon2 = static_cast<double>(util::toFloating(coordinate_2.lon)); | ||
| const auto lat2 = static_cast<double>(util::toFloating(coordinate_2.lat)); | ||
| return cheap_ruler_cache[std::lround(lat1 / 10) + 9].distance({lon1, lat1}, {lon2, lat2}); |
There was a problem hiding this comment.
Should probably add assertion here that bucket number is within range for our cache.
|
awesome @danpat |
|
sweet |
|
Aaaaaand numbers are out!
|
|
i like what i see |
|
@danpat if you're ok with it, I will take it on from here working on your tasklist? 😄 |
649e0af to
5ebc145
Compare
|
Wow, that's a massive speedup! Happy how this is turning out :) |
|
This method of calculating the distance has the following problems:
(For comparison, the error in the great-circle distance is about 0.5%.) |
@cffk Are you implying a problem with the method itself, or a bug in the implementation? Those error values are unexpectedly large. Assuming it's correct, https://github.com/mapbox/cheap-ruler#precision shows the expected precision at various latitudes. It never comes anywhere near 10% for up to 1000km lines. Thanks for the other notes though - I've added some items to the checklist. We're fortunate that we don't care much about being a general purpose distance calculator, and we also only care about fairly short geometry (most "segments" from OSM maps are quite short). A lot of the geometry we care about should fall in ranges where this method (supposedly) works, and we can detect and fallback to other methods for the outliers. |
|
@danpat The problem is the latitude binning. The distance between two points on latitude 55° uses the latitude scale factor for 60°. The main term in the latitude scale factor is I noticed you saying that the approximation "falls apart" for longitudes ±179.9°. This isn't the case! You just need to make sure to reduce the longitude difference to [−180°, 180°]. |
|
@danpat yeah, rounding difference of 5 degrees is definitely too much. I'd suggest to 1) calculate the cheap ruler cache lazily; 2) bin much more granularly, e.g. 0.01 deg. |
cffk
left a comment
There was a problem hiding this comment.
I recommend
- centering the bins (now the N and S polar regions are treated the same)
- halving the number of bins using symmetry
Thus, in constructor
step(90.0 / number_of_rulers)
cheap_ruler_cache[n] = mapbox::cheap_ruler::CheapRuler(step * (n + 0.5), mapbox::cheap_ruler::CheapRuler::Meters);
and in getRuler
return cheap_ruler_cache[std::min((int)std::floor(abs(lat)/step), (int)cheap_ruler_cache.size() - 1)];
and instantiate with
CheapRulerContainer cheap_ruler_container(1800);
Finally, in the distance function invoke with the mean latitude
return cheap_ruler_container.getRuler(0.5 * (lat1 + lat2)).distance({lon1, lat1}, {lon2, lat2});
|
FYI, I submitted a pull request to cheap-ruler-cpp to fix the longitude wrapping bug. |
6b69ab6 to
f0d08e5
Compare
|
@cffk thanks so much! I adjusted the changes you suggested in the last commit. |
|
A couple of additional comments:
|
|
@chaupow |
TheMarex
left a comment
There was a problem hiding this comment.
Looks great! Did some performance measurements and they are looking great!
src/util/coordinate_calculation.cpp
Outdated
| std::vector<mapbox::cheap_ruler::CheapRuler> cheap_ruler_cache; | ||
| const double step; | ||
| }; | ||
| CheapRulerContainer cheap_ruler_container(1800); |
src/util/coordinate_calculation.cpp
Outdated
| mapbox::cheap_ruler::CheapRuler &getRuler(const FixedLatitude lat) | ||
| { | ||
| BOOST_ASSERT(step > 1); | ||
| auto bin = (std::abs(static_cast<int>(lat)) - 1) / step; |
There was a problem hiding this comment.
What if the request is for lat = 0 here? This'll could lead to bin = -1 if step is only 2. Or will integer division guarantee this is always 0 or bigger? Either way, an assert that bin >= 0 && bin < cheap_ruler_cache.size() should probably be here.
There was a problem hiding this comment.
Good catch! Didn't realize rounding direction was not actually defined for 1/2.
fix cmakelist
4387030 to
08c5eed
Compare
| | a | 0 | 100 | | ||
| | b | 100 | 0 | | ||
| | | a | b | | ||
| | a | 0 | 100+-1 | |
There was a problem hiding this comment.
Elsewhere (like in the "When I route I should get " tests) the syntax used is "100m +-3", rather than "100+-3". For consistency it might be good to follow this syntax.
08c5eed to
eb92579
Compare
eb92579 to
6f60524
Compare
|
Awesome, thanks a ton @oxidase ! |

Our use of the
haversineDistancefunction is a bit trigonometry heavy. Trig functions are relatively expensive, especially when used a lot.With work on the
distancesbranch, we'll be performing a lot of distance calculations within matrix requests. A 100x100 table with distances without caching will spend over 50% of it's time inhaversineDistnace().This PR switches from using the
haversineDistancefunction, to a newfccApproximateDistancefunction, supplied by the https://github.com/mapbox/cheap-ruler-cpp library.Don't let the name fool you - the
fccApproximateDistancemethod is actually more precise than Haversine, and significantly faster. For short distances (up to a handful of km), which covers almost all the geometry we care about, it has almost 0 difference to the Vincenty method.In simple tests, this change reduced a 100x100 table request with
annotations=distancesfrom 8900ms down to about 4500ms. There's a big speedup here because there are a lot of distance calculations made. Regular/routerequests are unlikely to see such a boost, as the distance calcs they do form a much smaller part of the overall request workload.Before the change, this was the CPU profile in a table request with distances:

after this change:

This PR is not ready to merge, see checklist below.
Tasklist
cheap_ruler_cachebucket lookup without the use oflround(), which is using 2.5% of the total time incalculateEBGNAnnotations. If we have coords in integer format, simple digit truncation and multiplication should do the job.haversineDistanceeverywhere - there is no need to use it anywherefccApproximateDistanceis faster and more accurate.greatCircleDistancemay also be removable for the same reason.mapbox::geometryconversion (note: this may not be necessary - the type conversion steps probably get optimized away under-O3compilation in Release mode).cheap_ruler_cache- make sure we can't underflow/overflow the static array (add an assert, and double-check the math).cheap_ruler_cache- should it be static? How does the current declaration work when this code is linked in multiple times? Does this need to change if we try to get inlining working?fccApprixmateDistancemethod has increasingly bad error as lines get longer, so we should only use it for short geometry. TODO: figure out a threshold to fall back to haversine or maybe vincenty.