
by rdtsc 4071 days ago

Mandatory reading -- Last year's Call Me Maybe: Elasticsearch

https://aphyr.com/posts/317-call-me-maybe-elasticsearch

I've been hearing a lot of people talk about Elasticsearch lately. I get the same gut feeling I was getting about MongoDB back during the "Webscale" days.

3 comments

bkeroack 4071 days ago

In my experience, Elasticsearch is the single most common source of infrastructure downtime and service failure. It's basically my arch-nemesis.

willejs 4071 days ago

I am interested to hear a bit more about this, as I find it hard to believe. I have only run it at pretty small scale: 8 servers, around 300 million documents indexed a day, peak index rate 30k docs/sec. I found that you have to monitor it correctly, tune the JVM slightly (mostly GC), give it fast disks and lots of RAM, and use the correct architecture (search, index & data nodes) to get the most out of it. Once I did that it was one of the most reliable components of my infrastructure, and still is. I would recommend chatting to people on the Elasticsearch IRC channel or mailing list; everyone there was a great help to me.

bkeroack 4071 days ago

The full explanation deserves a blog post, but in a nutshell it revolves around the issue that ES contains a huge amount of complexity around a feature that is actually fairly useless (the "elastic" part) or at least difficult to use correctly. I've found that you need to be a deep expert in ES to architect and run it properly (or have access to such expertise) and even then it requires regular care and feeding to maintain uptime. In a short-deadline startup world you probably won't have time for any of that--once it's working it will lull you into a false sense of security and then completely blow up a few weeks/months later.

riceo100 4071 days ago

Same here. A single node failure has led to the whole cluster crashing down around me on more than one occasion.

AnkhMorporkian 4071 days ago

Really? Perhaps I was never running it at a large enough scale, but even pre-v1.0 I've basically never had any troubles with it (outside of operational concerns like occasionally confusing query syntax). Then again, I never had more than 11 servers in the cluster, so I may just have never run into problems at scale.

flippyhead 4071 days ago

While I don't necessarily disagree, I do find that this depends entirely on how ES is used. All too often people dive headfirst into using Elasticsearch in ways it really should not be used.

lobster_johnson 4071 days ago

It can't be worse than RabbitMQ... can it?

thejosh 4071 days ago

I use ES only for search (indexes from a DB), so losing data isn't a massive drama; it's great for my use case.

rdtsc 4071 days ago

That sounds like the intended use. I should qualify my comment: I've heard it advocated as a primary data store.

hobofan 4071 days ago

I've only heard of very few cases where people were using ES as primary storage, and even there they acknowledged that they were probably crazy for doing so.

digitalzombie 4071 days ago

Yeah, I had an argument about that over at Reddit, where someone advocated ES as an alternative to Cassandra. >___< I did, hopefully, convince the user otherwise.

digitalzombie 4071 days ago

Elasticsearch is just a text search engine based on Lucene. You either use ES, Solr, or the Lucene library if you want fuzzy search and such.

You really want to use it in tandem with a storage DB (PostgreSQL, Cassandra, MongoDB), where ES or any Lucene-based indexer/DB would be used for text searching.

I personally like PostgreSQL and Cassandra, and would use them in tandem with ES. Solr, last I checked, was a bit complicated to cluster.
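
A minimal sketch of that split, assuming an in-memory dict as a stand-in for the storage DB and a toy inverted index as a stand-in for the search engine. The names `SearchIndex` and `rebuild_index` are made up for illustration and are not any ES, Solr, or Lucene API. The point it tries to show is that the index is derived data: if it is lost, it can be rebuilt from the primary store.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


class SearchIndex:
    """Toy inverted index (hypothetical stand-in for a search engine)."""

    def __init__(self):
        # token -> set of document ids containing that token
        self.postings = defaultdict(set)

    def index(self, doc_id, text):
        for token in tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, query):
        """Return ids of documents containing every token in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self.postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


def rebuild_index(primary_store):
    """Derive a fresh index from the source of truth."""
    index = SearchIndex()
    for doc_id, text in primary_store.items():
        index.index(doc_id, text)
    return index


# The primary store is the source of truth (a dict here, a real DB in practice).
primary_store = {
    1: "Elasticsearch is built on Lucene",
    2: "Cassandra is a distributed database",
    3: "Solr is also built on Lucene",
}

index = rebuild_index(primary_store)
hits = index.search("built lucene")

# Simulate losing the index entirely: nothing is lost, just rebuild it.
index = rebuild_index(primary_store)
hits_after_rebuild = index.search("built lucene")
```

Search results are only ids; the documents themselves would still be fetched from the primary store.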

threeseed 4071 days ago

Agreed. Cassandra is especially nice if you have the DataStax Enterprise version which allows for seamless integration between the two.

m-i-l 4071 days ago

> Solr, last I checked, was a bit complicated to cluster

SolrCloud, with ZooKeeper, is relatively new and not too difficult to set up.

capkutay 4071 days ago

Does it still have the issue where you have to take the cluster down to create a new index or modify existing ones?

thecage411 4071 days ago

No, search for MergeIndexes or --go-live.

PhilipA 4071 days ago

What about storing data for analytics? Wouldn't it be better to use ES than Postgres for that?
