by m0th87 4515 days ago
It was two weeks ago, and our startup was on the precipice of a major launch. We had completely rewritten our online publication site, which drives the bulk of our traffic. The product had to be shipped on time - we had press releases, eager investors and a launch party dependent on it.

A few days before launch, things were not looking good. As admins manipulated articles in preparation for the launch, the servers kept crashing.

In a time-constrained major launch like this, a lot of nasty little hacks build up in the codebase. Our search system for admins was a complete mess. It was a custom solution that worked fine when admins managed a handful of database records, but now that they were managing thousands of articles, it was not scaling at all.

At the 11th hour, we dropped elasticsearch into our infrastructure. It worked like a charm. The servers stopped crapping out, and we launched on time.

Elasticsearch mostly "just works", and we didn't have to worry about complex schema definitions, working with giant complex XML files (hello Solr), or building anything on top to interface between the index and the queries themselves (Lucene). Thanks elasticsearch, you saved us!
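
For readers who haven't used it, here is a rough sketch of what "no schema definitions" can look like in practice: a JSON document is sent to the REST API as-is, and a search is just another JSON body. This is not the code from the post. The index name `articles`, the field names and the `localhost:9200` address (Elasticsearch's default HTTP port) are assumptions for illustration, and the `_doc` path follows recent Elasticsearch versions; releases from the time of this thread used an `/index/type/id` path instead. The functions only build the requests so the sketch runs without a live node.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: a local node on the default HTTP port
INDEX = "articles"                # hypothetical index name


def index_request(doc_id, article):
    """Build (but do not send) a request that stores one JSON document."""
    return Request(
        url=f"{ES_URL}/{INDEX}/_doc/{doc_id}",
        data=json.dumps(article).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(field, text):
    """Build (but do not send) a full-text match query against one field."""
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{ES_URL}/{INDEX}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical article; no mapping or schema is declared beforehand.
    req = index_request(1, {"title": "Launch day", "body": "We shipped."})
    print(req.get_method(), req.full_url)
    print(search_request("title", "launch").data.decode("utf-8"))
```

Against a running node, each request could be passed to `urllib.request.urlopen`; most real projects would use an Elasticsearch client library rather than raw HTTP.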

2 comments

> Elasticsearch mostly "just works", and we didn't have to worry about complex schema definitions, working with giant complex XML files (hello Solr)

If you were using Solr, there are a few operational modes to run in: config-file based or SolrCloud[0]. The latter is more akin to ES in terms of cluster management.

I agree, though, that from a simplicity-of-deployment perspective at scale, ES has a much lighter learning curve.

[0] https://cwiki.apache.org/confluence/display/solr/SolrCloud

SolrCloud is nothing like ES in terms of management: you end up running a separate ZooKeeper service with even more files, all of which have to be configured correctly just to get it running. You also have to micromanage shard allocation so that you can add nodes in the future without having it intentionally deadlock when a server fails and you no longer have enough nodes for a quorum. All of this happens with the usual contempt for sysadmins: things you need to know (“refusing to process requests”) won't be logged but a bunch of startup boilerplate will be, and simply configuring logging correctly requires (IIRC) editing two XML files and a properties file.

`java -jar elasticsearch.jar` does a better job and that's basically all it takes. I'm planning to switch as soon as https://github.com/elasticsearch/elasticsearch/issues/256 lands.
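
The quorum failure mode mentioned in the comment above is easier to see with numbers. The sketch below assumes a strict-majority quorum, which is how ZooKeeper ensembles decide whether they can keep serving; the function names are made up for illustration and are not part of any Solr or ZooKeeper API.

```python
def quorum_size(ensemble_size):
    """Smallest number of live nodes that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many nodes can be lost while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


def has_quorum(ensemble_size, live_nodes):
    """True if the remaining live nodes still form a majority."""
    return live_nodes >= quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, "nodes: quorum", quorum_size(n),
              "tolerates", tolerated_failures(n), "failure(s)")
```

Under this assumption a 3-node ensemble survives one failure, and adding a fourth node does not help: it still tolerates only one, which is why odd ensemble sizes are the usual recommendation.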

I lost count of the +1s. That issue must have around +180. :)
Did you try/consider Sphinx? It's simple and it's quite fast. I'm using that and I'm pretty happy with it, but I might investigate ES at some point to see if I can squeeze a bit more speed out of it.
You might also take a look at the search functionality in Riak. I've run both Solr and ES, the latter at significant scale, and I'm leaning more towards Riak going forward. The difference is mainly convenience, so not a reason to switch off something that's working already.
Hadn't considered Riak, but I can see that it has some full-text search capabilities. Any idea about its features and how it compares in performance, as a raw search index?
Riak 2.x uses Solr to index values from K/V with AAE. If you're interested in how using it looks, I wrote a post using geospatial data here[1].

[1]: http://www.christopherbiscardi.com/2014/02/07/geospatial-ind...

If it's just Solr underneath, then why is the pseudo-Solr API implementation not a complete implementation? Something to do with each node being an isolated Solr instance maybe?
There is backward compatibility with the old Riak Search (which wasn't Solr based) intended to not break old applications, but you can query with any currently implemented Solr client afaik.
I don't know of any publicly available relative raw performance benchmarks, and haven't done any myself. My guess is that the compelling features would be more in the realm of node operations and recovery from node failures.

Edit: Apparently my Riak knowledge is dated now anyway. It looks like I have some research to do myself, but it's pretty exciting stuff.

As far as I can tell, Sphinx has a more involved setup process. Also our search runs against JSON documents, which seems to suit Elasticsearch better than Sphinx. I might be wrong on both counts though, we really didn't look into Sphinx enough to give it a fair appraisal.
Sphinx is a bit too 1:1 - it only works as a single server, not a cluster.
Well, you could simply have multiple instances running on different nodes. It's manual work, but by no means impossible. In my setup, I have a Sphinx server running on the same node as my web server (which is the consumer of the search), so they scale with each other. For more advanced uses, it's probably not adequate, but it's not a big concern of mine.