| HN Mirror



	by bkeroack 4065 days ago
	In my experience, Elasticsearch is the single most common source of infrastructure downtime and service failure. It's basically my arch nemesis.

5 comments

willejs 4064 days ago

I am interested to hear a bit more about this, as I find it hard to believe. I have only run it at pretty small scale: 8 servers, around 300 million documents indexed a day, peak index rate 30k docs/sec. I found that you have to monitor it correctly, tune the JVM slightly (mostly GC), give it fast disks and lots of RAM, and use the correct architecture (separate search, index and data nodes) to get the most out of it. Once I did that it was one of the most reliable components of my infrastructure, and still is. I would recommend chatting to people on the Elasticsearch IRC channel or mailing list; everyone there was a great help to me.
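As a rough illustration of the "monitor it correctly" point, here is a minimal sketch that polls the `_cluster/health` REST endpoint and turns the response into warnings. The base URL, the expected node count, and the helper names `fetch_health` and `health_warnings` are assumptions made for this example; it reads only the `status`, `number_of_nodes` and `unassigned_shards` fields of the response and is not a complete monitoring setup.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url="http://localhost:9200"):
    """Fetch the cluster health document. base_url is an assumed address."""
    with urlopen(f"{base_url}/_cluster/health", timeout=5) as resp:
        return json.load(resp)


def health_warnings(health, expected_nodes):
    """Return a list of warnings for a _cluster/health response (hypothetical helper)."""
    warnings = []

    # "green" means all primary and replica shards are allocated.
    status = health.get("status")
    if status != "green":
        warnings.append(f"cluster status is {status}")

    # A missing node is often the first visible sign of trouble.
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        warnings.append(f"only {nodes} of {expected_nodes} nodes present")

    # Unassigned shards mean reduced redundancy or unavailable data.
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        warnings.append(f"{unassigned} unassigned shards")

    return warnings
```

Something like `health_warnings(fetch_health(), expected_nodes=8)` could then be run on a schedule and fed into whatever alerting is already in place.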


bkeroack 4064 days ago

The full explanation deserves a blog post, but in a nutshell it revolves around the issue that ES contains a huge amount of complexity around a feature that is actually fairly useless (the "elastic" part) or at least difficult to use correctly. I've found that you need to be a deep expert in ES to architect and run it properly (or have access to such expertise) and even then it requires regular care and feeding to maintain uptime. In a short-deadline startup world you probably won't have time for any of that--once it's working it will lull you into a false sense of security and then completely blow up a few weeks/months later.


riceo100 4064 days ago

Same here. A single node failure has led to the whole cluster crashing down around me on more than one occasion.


AnkhMorporkian 4065 days ago

Really? Perhaps I was never running it at a large enough scale, but even pre-v1.0 I basically never had any trouble with it (outside of operational concerns like the occasionally confusing query syntax). Then again, I never had more than 11 servers in the cluster, so I may just never have run into the problems that show up at scale.


flippyhead 4064 days ago

While I don't necessarily disagree, I do find that this depends entirely on how ES is used. All too often people dive headfirst into using Elasticsearch in ways it really should not be used.


lobster_johnson 4064 days ago

It can't be worse than RabbitMQ... can it?
