Hacker News new | ask | show | jobs
by psykotic 5014 days ago
Good luck precomputing all those pairwise shortest paths. Storing the table might not be too bad. But standard algorithms like Floyd-Warshall are O(n^3) in the number of vertices. There are faster algorithms based on fast matrix multiplication but the precomputation time would still be prohibitive. Keeping it up to date would be even worse. The construction of a new highway could require updating the shortest paths for an enormous number of pairs.

The economic argument would go like this: The revenue generated from your maps business is proportional to the number of queries actually processed, not the total number of conceivable queries. The queries processed is such a tiny subset of the possible queries that you want your computational expenses to track the former, not the latter.

With a hierarchical shortest paths algorithm, you can still precompute all pairwise shortest paths at the coarser level. For road navigation you might precompute the shortest path between every pair of interstate highway exits in the USA like this paper is doing. In an open-world game like Skyrim you might precompute the shortest paths between all towns and other major hubs and points of interest. That might not yield truly optimal paths. For game use it's close enough and has the benefit of corresponding to how humans naturally navigate.

Sidebar: The old Crash Bandicoot games used precomputed shortest paths in a neat way. Their navigation was based on a triangular mesh, so every triangle had up to three edgewise neighbors. Thus for a navmesh with n triangles, they needed lg(3) n(n-1)/2 <= n(n-1) bits to store the table. For convenience they probably stored this in a redundant form requiring 2n^2 bits = n^2/4 bytes. But with only 64KB of additional memory, this still let them support 512 navmesh triangles per level with lightning-fast path finding.

2 comments

>For road navigation you might precompute the shortest path between every pair of interstate highway exits in the USA like this paper is doing.

Oh, so they are populating a lookup table, not for the entire nodeset but just the tuples denoting interstate highways. All I was advocating was a much larger ( and more expensive) lookup table :) btw, you've edited your response like 6 times now...everytime I try to reply there's new info in there!

> btw, you've edited your response like 6 times now...everytime I try to reply there's new info in there!

Sorry! The lack of preview for HN comments is hell for my iterative writing style.

A lot of us write in an offline text editor and paste the results when we are more-or-less finished (the finish line is an asymptote for people like us!).
Dijkstra is cursing us from beyond the grave.
Increase the "delay" in your settings to get a delay until the comment is visible to others.
Hm, thanks for the nice remark on the game stuff, I kind of like that.

While agree with your primary criticism, I think the principle of locality applies well to this problem. While there are 24 million places (supposedly, never checked that from the paper), I'm positive that most maps queries are served to the areas with higher population density. Therefore, it might make sense to use the look-up table approach for densly populated areas to reduce the search time. (OTOH, the computer scientist in me reminds me that the 7ms might pale in comparison to latency times for mobile phones receiving the map information...)

> Therefore, it might make sense to use the look-up table approach for densly populated areas to reduce the search time.

Yes, that's my favorite way to apply hierarchical path finding. In the two-level path finding system I wrote for my homebrew MUD as a kid, I precomputed all pairwise shortest paths within each zone (a zone had on the order of 100 rooms) and then all pairwise shortest paths between zone-to-zone transitions (a typical zone had around 5 transitions and each transition belongs to two zones). With these tables, here's all you need to do to find a shortest path between a given pair of rooms:

If the two rooms are in the same zone, consult the intra-zone table for their zone and you're done. Otherwise, for both rooms, find the shortest paths to all transitions in their current zone using the intra-zone tables. Let's say there are 5 transitions in both zones. Then for each of the 20 transition pairs, consult the inter-zone table to find the shortest path between them. That gives you a small set of candidate paths (at most 20 in this example) and you pick the shortest among them.

That's 5 + 5 + 20 = 30 fast table lookups and a comparison of 20 numbers to find the shortest path between any pair of rooms.

The storage cost for this is (100 * 5 / 2) * ((100 * 5 / 2) - 1) / 2 bytes ~= 30 kilobytes for the inter-zone table and 100 * (100 * 99 / 2) bytes ~= 500 kilobytes for the intra-zone tables. That wasn't a trivial amount of memory in those days but it wasn't prohibitive either. Compare that to 10,000 * 99,999 / 2 bytes ~= 50 megabytes for directly storing shortest paths between all pairs of rooms. You couldn't store all that in memory and having to read from disk would more than wipe out the 30:1 advantage in the number of lookups versus the two-level approach.

I'm sure you see the analogy with road networks and highways: a transition is like a highway exit, etc. The important point is that there is a way of decomposing the connectivity graph into subgraphs such that only a small number of edges enter and leave each subgraph.

You can see in my calculation that almost all the space is taken up by the intra-zones tables. It would certainly help to compute intra-zone shortest paths on demand with caching, as you suggest.