
Sort nodes spatially to speed up later queries.#3342

Closed
danpat wants to merge 2 commits into master from fix/3311

Conversation

@danpat
Member

@danpat danpat commented Nov 18, 2016

Issue

Addresses #3311

This change sorts node IDs on a Hilbert curve, on the assumption that this will improve cache locality during later queries of this data (coordinate data is heavily used during edge-based-graph construction, when the guidance code does lots of localized graph exploration).

Tasklist

  • add regression / cucumber cases (see docs/testing.md)
  • review
  • adjust for comments

const auto used_node_id_list_end = used_node_id_list.end();

while (edge_iterator != all_edges_list_end && node_iterator != all_nodes_list_end)
while (edge_iterator != all_edges_list_end && node_iterator != used_node_id_list_end)
Member Author


I'm pretty sure this logic is incorrect, because edge creation hangs for larger graphs. Still needs some work.

@danpat
Member Author

danpat commented Nov 18, 2016

Note: This is not ready, it doesn't work properly yet.

@danpat
Member Author

danpat commented Nov 19, 2016

So, while this PR is still slightly buggy, the main point was to try to speed up edge-based-edge generation, which is coordinate-usage heavy.

Based on re-processing California a few times, the results are:

Edge-based-graph generation without Hilbert sorting: 30s
Edge-based-graph generation with Hilbert sorting: 27s

This doesn't seem like a significant gain. In addition, the loss of locality during edge generation in extraction_containers.cpp makes that phase take significantly longer: 800 seconds, versus 5 seconds with the current implementation. While my changes are probably less than optimal, it doesn't seem possible to avoid a penalty: the current approach is reasonably well optimized for unused-node removal and edge-coordinate updating.

I'm open to ideas here, or a review to spot whether I've made some glaring error (note: 5 or 6 tests are still failing, so some graph-shape behaviour has changed; I'm just not sure exactly what without a deeper dive).

Given the negligible speedup, I'm inclined to consign this change to the trash.

/cc @daniel-j-h @TheMarex @MoKob

util::FixedLongitude{std::numeric_limits<std::int32_t>::min()});

auto id_iter = external_to_internal_node_id_map.find(edge_iterator->result.osm_target_id);
if (id_iter == external_to_internal_node_id_map.end())
Member Author


I suspect this lookup is a big source of the slowdown.

BOOST_ASSERT(edge_iterator->source_coordinate.lon !=
util::FixedLongitude{std::numeric_limits<std::int32_t>::min()});
const auto node_coord = util::Coordinate(all_nodes_list[id_iter->second].lon,
all_nodes_list[id_iter->second].lat);
Member Author


Also this: previously, coordinates were cache-local in the node iterator, and no additional lookup was required.
Because the edge IDs aren't sorted in Hilbert order, we're essentially doing random access into all_nodes_list.

@TheMarex
Member

I think it is valuable to figure out why the tests are not passing. If we have behavior in our code that relies on a specific node order, that is not good. Looking at the failure, there seems to be something odd going on with the geometry.

Looking at the implementation, it would make sense to do the following algorithmic changes to skip most of the overhead you are seeing:

  1. Compute the Hilbert value in ExtractionCallbacks when creating the ExternalMemoryNode (this will not incur an extra write for every single node in an IO-heavy loop)
  2. Keep the current logic that first filters the used nodes (to reduce the data size first)
  3. Sort the nodes by Hilbert value (the most expensive step, but the data volume will be lower)
  4. After the first OSM -> internal mapping has been computed, use it to compute OSM -> internal -> spatial

The optimization here is to run the stxxl::sort not on all nodes, but only on the relevant ones.

I would also recommend running this on a much bigger dataset before we discard this. The graph for California is small enough that there is a reasonable chance to hit the cache even if you use random order.

@daniel-j-h
Member

@danpat can you check your numbers again now with the Hilbert space-filling curve improvements?

@daniel-j-h
Member

Ping - what's the state of this, especially after the Hilbert changes? Still worth investigating?

@danpat
Member Author

danpat commented Jan 18, 2017

@daniel-j-h We should close this but leave the ticket open. If someone wants to pick this up again, they can use this PR as a starting point.

@danpat danpat closed this Jan 18, 2017
@DennisOSRM DennisOSRM deleted the fix/3311 branch November 6, 2022 14:12
