Hacker News
by troels 4515 days ago
Did you try/consider Sphinx? It's simple and it's quite fast. I'm using that and I'm pretty happy with it, but I might investigate ES at some point to see if I can squeeze a bit more speed out of it.
3 comments

You might also take a look at the search functionality in Riak. I've run both Solr and ES, the latter at significant scale, and I'm leaning more towards Riak going forward. The difference is mainly convenience, so not a reason to switch off something that's working already.
Hadn't considered Riak, but I can see that it has some full-text search capabilities. Any idea about its features and how it compares in performance, as a raw search index?
Riak 2.x uses Solr to index values from K/V with AAE. If you're interested in how using it looks, I wrote a post using geospatial data here[1].

[1]: http://www.christopherbiscardi.com/2014/02/07/geospatial-ind...

If it's just Solr underneath, then why is the pseudo-Solr API implementation not a complete implementation? Something to do with each node being an isolated Solr instance maybe?
There is backward compatibility with the old Riak Search (which wasn't Solr-based) so that old applications don't break, but afaik you can query with any currently implemented Solr client.
I don't know of any publicly available relative raw performance benchmarks, and haven't done any myself. My guess is that the compelling features would be more in the realm of node operations and recovery from node failures.

Edit: Apparently my Riak knowledge is dated now anyway. It looks like I have some research to do myself, but it's pretty exciting stuff.

As far as I can tell, Sphinx has a more involved setup process. Also, our search runs against JSON documents, which seems to suit Elasticsearch better than Sphinx. I might be wrong on both counts though; we really didn't look into Sphinx enough to give it a fair appraisal.
Sphinx is a bit too 1:1 - it only works as a single server, not a cluster.
Well, you could simply have multiple instances running on different nodes. It's manual work, but by no means impossible. In my setup, I have a Sphinx server running on the same node as my web server (which is the consumer of the search), so they scale with each other. For more advanced uses, it's probably not adequate, but it's not a big concern of mine.
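
To make the "manual work" concrete, here is a minimal sketch of the fan-out-and-merge step you would have to write yourself when the index is split across several independent instances. Everything in it is an assumption for illustration: `make_fake_node` is a toy in-memory stand-in, not Sphinx's API, and `search_all` is a hypothetical helper. In a real setup each node function would call the search daemon on that machine.

```python
import heapq
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical stand-in for one search instance: takes a query and a limit,
# returns (score, doc_id) pairs. A real node would query its own daemon here.
Hit = Tuple[float, str]
SearchFn = Callable[[str, int], List[Hit]]


def make_fake_node(docs: Dict[str, str]) -> SearchFn:
    """Build a toy in-memory node that scores documents by term frequency."""
    def search(query: str, limit: int) -> List[Hit]:
        term = query.lower()
        hits = [
            (float(text.lower().split().count(term)), doc_id)
            for doc_id, text in docs.items()
        ]
        hits = [h for h in hits if h[0] > 0]
        # Highest score first; ties broken by doc id so results are stable.
        hits.sort(key=lambda h: (-h[0], h[1]))
        return hits[:limit]
    return search


def search_all(nodes: Iterable[SearchFn], query: str, limit: int = 10) -> List[Hit]:
    """Ask every node for its top `limit` hits and merge them into one ranking."""
    merged: List[Hit] = []
    for node in nodes:
        merged.extend(node(query, limit))
    return heapq.nsmallest(limit, merged, key=lambda h: (-h[0], h[1]))


if __name__ == "__main__":
    node_a = make_fake_node({"a1": "sphinx search search", "a2": "riak"})
    node_b = make_fake_node({"b1": "search", "b2": "search search search"})
    print(search_all([node_a, node_b], "search", limit=3))
```

One caveat with this approach: scores computed by independent indexes are not always directly comparable, since each index only sees its own documents, so a merged ranking like this is an approximation.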