I've got this on a site I'm building that offers a sort by distance option. There's definitely a performance hit. I'm going to look into 'dumber' ways of filtering out the data prior to running this function. Maybe start by including state in the where clause, for example.
Any other ideas?
One solution I saw used simpler arithmetic to calculate a range of coordinates for each distance level. That could be pre-cached, but it's a lot less accurate.
(Edit: to be more specific, you can get a pretty good distance measurement using Geohash and comparing strings. Obviously, indexing strings is something databases do well. The exact distance a single character corresponds to depends on longitude & latitude, but there are lookup tables for that. There are also edge conditions to be aware of which may affect your application)
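To make the Geohash idea concrete, here is a minimal sketch of the standard encoding (alternating longitude and latitude bits, packed five at a time into base32 characters). The function names `geohash_encode` and `common_prefix_len` are made up for illustration and are not from any particular library. A longer shared prefix generally means two points are closer, but as noted above, two nearby points on opposite sides of a cell boundary can share only a short prefix.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # Geohash alphabet (no a, i, l, o)

def geohash_encode(lat, lng, precision=9):
    """Encode a lat/lng pair as a Geohash string of `precision` characters."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # five bits make one base32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

def common_prefix_len(a, b):
    """Number of leading characters two hashes share (a rough proximity hint)."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n
```

With the hash stored in an indexed string column, a prefix match (for example `LIKE 'u4p%'`) can narrow candidates before an exact distance check. That query shape is an assumption about how you would use it, not something every database optimizes the same way.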
Thanks for the pointer! This looks interesting. The edge conditions seem like they might pose a problem. I'll have to check how often they would occur. Maybe the geospatial indexes are a better bet. It looks like MongoDB supports them also. Good excuse to try that out.
Using a B-Tree on a Geohash (like MongoDB does) is a bit more efficient than just indexing min/max values, but not by much. MySQL, PostgreSQL and even SQLite have R-Tree indices that perform 10x better.
If you are prepared to introduce new technology specifically to solve this problem, then you should take a look at LocalLucene, too: http://www.gissearch.com/locallucene
Hash the points into large tiles. Only calculate the nearby tiles, then find the items that are in the list of tiles (which is faster, due to being indexed.) Then use Haversine or whatever to filter.
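A rough sketch of the tile approach, assuming fixed-size tiles measured in degrees and an in-memory dict standing in for the indexed column. All names here (`tile_key`, `build_index`, `nearby`) are hypothetical. It only checks the 3x3 block of tiles around the search point, so it assumes the tile is at least as large as the search radius at that latitude, and it ignores wrap-around at the antimeridian and the poles.

```python
import math
from collections import defaultdict

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def tile_key(lat, lng, tile_deg=1.0):
    """Hash a point to the integer tile that contains it."""
    return (math.floor(lat / tile_deg), math.floor(lng / tile_deg))

def build_index(points, tile_deg=1.0):
    """Group (name, lat, lng) tuples by tile; a DB would index a tile column instead."""
    index = defaultdict(list)
    for name, lat, lng in points:
        index[tile_key(lat, lng, tile_deg)].append((name, lat, lng))
    return index

def nearby(index, lat, lng, radius_km, tile_deg=1.0):
    """Look only in the 3x3 neighbouring tiles, then filter exactly with Haversine."""
    ty, tx = tile_key(lat, lng, tile_deg)
    hits = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            for name, plat, plng in index.get((ty + dy, tx + dx), []):
                d = haversine_km(lat, lng, plat, plng)
                if d <= radius_km:
                    hits.append((d, name))
    return sorted(hits)  # nearest first
```

In SQL the same idea would be a `WHERE tile IN (...)` over the nine tile keys, with the Haversine filter applied to whatever rows come back.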
True, the distance calculation must NOT be in the WHERE clause if you want to use indexes (and you do).
What I'm doing, given a max distance and a search point, is to calculate the bounding box in which I want to search and filter results with
WHERE lat BETWEEN lat_min AND lat_max AND lng BETWEEN lng_min AND lng_max
Calculating latitude min/max is trivial knowing that 1 degree of latitude is about 111.2 km. Longitude is a bit more convoluted because the size of a longitude degree shrinks as you move north or south from the equator (roughly in proportion to the cosine of the latitude).
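Something along these lines would compute the four bounds for that WHERE clause. This is a minimal sketch: the function name `bounding_box` is made up, it uses the 111.2 km per degree figure from above, scales longitude by the cosine of the search point's latitude, and does not handle boxes that cross the antimeridian or reach the poles.

```python
import math

KM_PER_DEG_LAT = 111.2  # approximate length of one degree of latitude

def bounding_box(lat, lng, max_km):
    """Return (lat_min, lat_max, lng_min, lng_max) around a search point.

    Assumption: the longitude degree size is taken at the centre latitude,
    so the box is slightly too narrow on its pole-facing edge.
    """
    lat_delta = max_km / KM_PER_DEG_LAT
    # One degree of longitude spans about 111.2 * cos(latitude) km.
    cos_lat = max(math.cos(math.radians(lat)), 1e-9)  # avoid dividing by zero at the poles
    lng_delta = max_km / (KM_PER_DEG_LAT * cos_lat)
    return (lat - lat_delta, lat + lat_delta, lng - lng_delta, lng + lng_delta)

lat_min, lat_max, lng_min, lng_max = bounding_box(60.0, 10.0, 111.2)
# Pass these as parameters to:
#   WHERE lat BETWEEN lat_min AND lat_max AND lng BETWEEN lng_min AND lng_max
```

The box is only a prefilter: its corners are farther away than the max distance, so rows that pass still need an exact distance check outside the WHERE clause.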