by jstarfish 3014 days ago
> tell me what I (we) did wrong: at my job we tried using the ELK stack, and it's probably still running, but it is such a resource hog.

Part of the problem is what it's promoted for. It's a great drop-in, horizontally-scalable, full-text search engine, that's inexplicably become popular for log ingestion and analysis.

To those ends, I hate it, I hate every bit of it, from the atrocious JSON-based query DSL (seriously thought it was a joke at first) to its unpredictable timeouts, shard storms, mapping conflicts and other problems at scale. Elementary SQL concepts aren't possible ('select someone_elses_poorly_named_key as first_name', nope, you gotta reindex). High-cardinality aggregations fail in spectacular ways. High-anything aggregations fail in spectacular ways. The scroll API returns results unordered. There's no way to properly spec your cluster; the docs explicitly take a trial-and-error approach to design.
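For anyone who hasn't seen the DSL, here's a rough sketch of what a simple "count errors per host over the last hour" request body looks like, built as a Python dict. The field names (`status`, `@timestamp`, `host.keyword`) and the helper name are made up for illustration and assume a typical log mapping; the SQL in the comment is just what I'd write against a relational store, not something Elasticsearch accepts here.

```python
import json

# Roughly equivalent SQL, for comparison (table/column names are hypothetical):
#   SELECT host, COUNT(*) FROM logs
#   WHERE status = 500 AND ts >= NOW() - INTERVAL '1 hour'
#   GROUP BY host ORDER BY COUNT(*) DESC LIMIT 10;


def errors_per_host_query(status: int = 500, window: str = "now-1h", top_n: int = 10) -> dict:
    """Build a search request body: filter documents, then bucket them by host."""
    return {
        "size": 0,  # return no hits, only the aggregation buckets
        "query": {
            "bool": {
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"@timestamp": {"gte": window}}},
                ]
            }
        },
        "aggs": {
            # terms aggregation: one bucket per distinct host, largest first
            "by_host": {"terms": {"field": "host.keyword", "size": top_n}}
        },
    }


if __name__ == "__main__":
    print(json.dumps(errors_per_host_query(), indent=2))
```

One line of SQL turns into five levels of nesting, which is about the ratio I'd expect for anything non-trivial.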

It's not just you. Elasticsearch does me no favors with the task of log analysis. I'd sooner normalize and grep a pile of gzipped log files than keep dealing with this mess, but this is the second job I've been at that's built their logging infrastructure on top of it.
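To be concrete about the "grep a pile of gzipped log files" alternative, here's a minimal sketch using only the Python standard library. The function name and the `logs/` directory in the usage note are assumptions for illustration; it skips the normalization step and just streams each file and yields the matching lines.

```python
import gzip
import re
from pathlib import Path
from typing import Iterable, Iterator, Tuple


def grep_gzipped(paths: Iterable[Path], pattern: str) -> Iterator[Tuple[str, int, str]]:
    """Yield (file name, line number, line) for every line matching the regex."""
    regex = re.compile(pattern)
    for path in paths:
        # "rt" decompresses and decodes on the fly, so the file is never
        # fully loaded into memory; undecodable bytes are replaced, not fatal.
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if regex.search(line):
                    yield (path.name, lineno, line.rstrip("\n"))
```

Usage would be something like `for hit in grep_gzipped(sorted(Path("logs").glob("*.gz")), r"ERROR"): print(hit)`. It's a linear scan with no index, so it won't be fast on large archives, but memory use stays flat.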

> I do not understand why they built Elasticsearch.

"You know, for search."

It's great for searching proper text. Documents, comments, blog posts, etc.

> I've read in a couple places you need like 32GB of RAM[0] just to run this thing to do queries, and having crashed Kibana / Elasticsearch a dozen times I believe it's designed poorly.

You don't necessarily need 32GB to do anything. The required heap size scales with your intended workload, but it's not like you can do a back-of-napkin calculation to figure out the relation. I run a 1GB instance for development.

It uses the memory to do a lot of caching so the queries you throw at it are lightning fast. Mongo does something similar (but crappier IMO) with its concept of working sets.

2 comments

> atrocious JSON-based query DSL (seriously thought it was a joke at first)

I could not agree more. I found this in my codebase: https://i.imgur.com/44stiHv.png

I've never seen an odder choice of data structure.

I've deployed it a few times. In some cases it randomly spikes to 100% CPU usage and stays there until you restart it.

Is there a way to know how much RAM you are going to need for your dataset? I think I was using it for looking up restaurant names from a database of 100,000 and I wanted to factor in misspellings and partial matches.

Not really; it depends on the size of the dataset and the complexity of the operations. When did you use it? Sounds like you were on a dodgy build or setup.

I think Elasticsearch's policy on sizing is "pay us or a partner to have a look and give you a guesstimate", which is pretty standard.