Hacker News new | ask | show | jobs
by awj 5091 days ago
It probably was the database sharding. If the Solr setup could handle the geo-search-related data without the need for sharding it probably can beat out Postgres with sharding.

Having this exposed through an api that is standardized and maintained by someone else is also nothing to sneeze at. I'd trade a bit of performance for that kind of standardization and turnkey use in the right scenario.

1 comments

The reason we use Solr for this specific task is because PostgreSQL cannot efficiently and quickly merge two index queries (time & space). It can do this to a limited degree, but both of these dimensions potentially match 10s of millions of documents, and PG falls over at this.
So you make the r-tree 3 dimensions (lat,lng,time). PostgreSQL supports this.

I dunno I can't envision Solr being more efficient than a properly designed RDBMS for these situations. If you were integrating a full-text search I'd absolutely believe that to be the case but...

We need independent time & geo searches as well. The indexes are vastly smaller in Solr. We use PostgreSQL extensively and prefer it, so it's not a matter of simply wanting to use something different.
That's very interesting. Could you share your story with the mailing list pgsql-hackers a little bit? The guys who work on indexing are quite active on those lists.

Also, there's some new thing I don't understand super well, sp-gist, do you have any thoughts on that?