by hardwaresofton 3030 days ago
Yeah so has anyone actually tried to get ElasticSearch up and running lately? I just tried and had a terrible time, despite the fact that I was using ElasticSearch + Kibana, and it was dockerized, and it was on Kubernetes (there's more complexity, yes, but all those tools make deployment simpler once you understand them, not harder -- writing a pod resource config to get a thing running means I don't have to run around my system changing settings, I just put all of it in one place). XPack was just another stumbling block while trying to get everything running.

The combination of lack of documentation, inconsistent/changed configuration (ENV vs YAML vs values that just don't exist anymore), breaking changes between versions that rendered Kibana completely useless, and the recent (?) removal of plugins that expose web APIs (so I couldn't use something like elastic-head) wore me down. This is all in Kubernetes btw -- maybe it's just that I wasn't smart enough to get it done, but it's so easy to write functional (if not well-configured) configurations for other databases that I was at a loss for words when nothing I tried worked right.

I got so angry trying to set up ElasticSearch that making a F/OSS competitor is now #2 on my list of projects-to-do-next. I'm sure the thought is naive but I need to find out for myself that there's no easier way.

Imagine if the team behind Prometheus had focused on search instead of metrics? That's the kind of tool I want to use. A tool as focused, easy to start, clearly documented, and straightforward as Prometheus.

12 comments

"making a F/OSS competitor"

So, Solr? Good luck getting SolrCloud set up on Kubernetes. ;-)

More seriously though, my answer to "has anyone actually tried to get ElasticSearch up and running lately?" is yes. I just worked on spinning up a cluster (using docker) at my current job. At my last two jobs I also managed ElasticSearch (without docker). There are plenty of gotchas with ElasticSearch, but I've never found the initial setup to be a challenge. To be fair, I've never touched X-Pack.

Call me insane +/- naive, but I was actually thinking of "just" gossiped/quorumed SQLite+FTS5.

In the end I got ElasticSearch running, but it wouldn't connect to Kibana properly. I exaggerated too much -- much of my frustration was with ES not working properly with Kibana. I kept notes on what went wrong/what I was struggling with but I don't even want to look at them now, they'll be in a blog post someday.

You'd need to handle concurrent writes, so something like a WAL, so why not build on RocksDB?

And okay, quorum, and sure there are a lot of Raft libs out there, but it's a bit harder than "new Cluster(Consistency.QUORUM)" :)

The thing is, I don't want to build search myself -- SQLite has a WAL (of course), runs in memory if you want (of course RocksDB has less holding it back from utilizing memory even more efficiently than SQLite could), comes with all the usual SQLite creature comforts, and lets me lean on SQLite's FTS.
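For anyone wondering what "leaning on SQLite FTS" looks like in practice, here is a minimal sketch using Python's bundled sqlite3 module. It assumes the SQLite build behind your Python was compiled with FTS5 (most are, but that is an assumption). The table name `docs` and the helpers `open_index`, `add` and `search` are made up for illustration.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open a database and create an FTS5 virtual table for documents."""
    conn = sqlite3.connect(path)
    # Raises sqlite3.OperationalError if this SQLite build lacks FTS5.
    conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
    return conn


def add(conn, title, body):
    """Index one document; both columns become full-text searchable."""
    conn.execute("INSERT INTO docs (title, body) VALUES (?, ?)", (title, body))


def search(conn, query, limit=10):
    """Return titles of matching documents, best match first."""
    rows = conn.execute(
        "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return [row[0] for row in rows]


conn = open_index()
add(conn, "es-setup", "Getting ElasticSearch and Kibana talking on Kubernetes")
add(conn, "graylog", "Graylog uses Mongo as a configuration store")
add(conn, "solr", "SolrCloud on Kubernetes")
print(search(conn, "kibana"))
```

This only covers the single-node part. Replication, sharding and membership are not something SQLite gives you.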

All I have to get right is the quorum (I'm actually thinking optimistic gossip with something like swim over a quorum with paxos/raft), and the sharding, and replication -- and that stuff has been worked through by people much smarter than me already.

The formula I think will work is basically SQLite + SWIM/Raft + consistent hashing algo + optimistic replication + optimistic rebalancing. Just about 100% of the things on that list I don't have to think too hard to implement, and should be performant in the happy case (where n/2 nodes are up and healthy and relatively performant)
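To make the "consistent hashing algo" piece of that list concrete, here is a rough sketch of a hash ring with virtual nodes, plus the majority-quorum arithmetic. Everything here is an illustrative assumption rather than a design from this thread: the `HashRing` class, the `vnodes` count and the node names are all made up. Note that a majority quorum needs strictly more than half of the replicas, i.e. `n // 2 + 1`.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Consistent hash ring; each node is placed at `vnodes` positions."""

    def __init__(self, nodes, vnodes=64):
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def owners(self, key, replicas=3):
        """Return up to `replicas` distinct nodes, walking clockwise from the key."""
        want = min(replicas, self._node_count)
        start = bisect.bisect_right(self._points, _hash(key))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == want:
                    break
        return found


def quorum(replicas: int) -> int:
    """Smallest number of replicas that forms a strict majority."""
    return replicas // 2 + 1


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
print(ring.owners("doc-42"), "need", quorum(3), "acks")
```

The nice property is that removing a node only moves the keys that node owned; every other key keeps its primary owner, which is what makes optimistic rebalancing plausible.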

Recently saw a talk from fosdem 2018 (https://fosdem.org/2018/schedule/event/datastore/) about a project called Timbala, learned a bit from it (for example, assumed everyone was just using paxos/raft but SWIM evidently is used by consul)

Here's a quick plug for the project I'm working on at the moment: https://github.com/jetstack/navigator

It's a framework for managing databases on kubernetes, with initial support for elasticsearch and cassandra. It's still early in development, but any feedback would be great.

kube-lego & certmanager are amazing, thank you for the work you put in @ jetstack.

I want to give your tool a try, but a custom api server seems like a lot in the way of complexity (I thought operators were at most beefy controllers?, do custom api servers fit in the "operator" pattern?) -- and I literally only want to run Elastic so I can complete the EFKK stack (and try it out).

FWIW, I've done the same recently (deploy ES + Kibana on K8s) and it pretty much just worked. Statefulset, EBS volume claims, official Docker images.

Didn't use XPack or anything fancy though, haven't updated it and the only addon I'm running is https://github.com/lmenezes/cerebro

yeah it's becoming increasingly clear that I must have tripped myself up/got frustrated too fast.

When you set up cerebro, how did you set up the CORS headers? Did you go with allow "*"?

I totally agree...It's crazy that they don't even offer a docker-compose file of sorts to bring all their own tools together to demo the power of their own tools.

I recently wanted to see ELK in action...and it took me a few hours to set it all up and configure everything together with just basic docker and docker-compose. It really should not be that hard :/

We try to get you most of the way there with https://github.com/elastic/stack-docker/blob/master/docker-c.... It takes care of x-pack as well.
My team created a helm chart for this. EFK soup to nuts with x-pack. I'll see if we can't publish it.
Since you've done it, perhaps you could share :-)
I will soon! (I did it for a work related project so I want to make sure that I go through a few formalities to make the repo public).

It's a pretty cool repo - includes templates for ELK on K8S, ECS and Docker Swarm (Compose).

I'm working on testing some HA on it next week!

Graylog is a good alternative, check it out.

If OpenShift didn’t do the heavy lifting of deploying and securing Elasticsearch I wouldn’t be using it at all, and because of that mess I actually use Graylog in my lab at home because it’s substantially less of a pain in the ass and security isn’t a feature locked behind a proprietary license or writing your own proxy.

Graylog is exactly what I was reaching for (I've had a similar experience and was blown away and delighted by how easy it was to use), but it's a bit heavyweight -- they (suspiciously) don't list minimum requirements anywhere; the only scrap I could find was in the Graylog OpenStack docs, where they suggest you have 4GB of RAM free.

The machine I'm running on isn't small, I have the memory, but it just feels like a slippery slope.

Also, Graylog = Java + Mongo + ES and I'm almost philosophically opposed to using Mongo for personal reasons (this is a personal project so I can afford to have some self-defeating bias).

The prebuilt Graylog virtual machine appliance (OVA) defaults to a pitiful amount of RAM (I think 512MB? 1GB?) and we used it in production successfully for a very long time in this configuration. We bumped it up but just because it seemed like a good idea, not because the memory was giving us any trouble. From our Graylog dashboard currently:

> The JVM is using 637.8MB of 980.1MB heap space and will not attempt to use more than 1.4GB

It also defaulted to a single vCPU, which seemed to be fine. It seems like Graylog can scale down pretty well if needed.

Does this include what you're giving Mongo + ElasticSearch? The graylog process isn't all I'm worried about, it's kind of the combination of the three.

Regardless, I'm probably going to just use Graylog then -- I'm not running a large environment by any means, and while I've been at a company where graylog was used in production (which is where I heard about it), people often complained about it hogging resources. Time has passed, and I'm sure that if it's good enough for you, it's more than good enough for me (especially since I'm not running anything "in production").

I still want to get the EFKK stack up and running though. Right now there are basically two self-hosted choices, ELK/EFK or Graylog, plus the hosted options (splunk, sumologic?, others). I'd like to at least stand up both self-hosted choices once and get a feel for them (and I've done Graylog before).

Splunk’s not a bad piece of software, I just prefer open source options before proprietary solutions where feasible (which is why I don’t use EFK, I refuse to pay money for security and I think it’s bullshit that Elastic has made that part of their business model with the xpack) but for small environments the free version can get you far.
Not in any way affiliated with Elastic but XPack is now included in Elastic by default, so there's that -- of course it does say something that they included it in their enterprise offering first.

Same here on the open-source-first mentality. I also managed to get the EFK stack working so now I don't feel bad actually choosing Graylog in the long run.

Graylog doesn't require tons of memory in my experience, it always benefits from more as your logs grow - but that's just a fact of life when it comes to any kind of database. I've run it on 2GB of RAM before (this is just the smallest amount I ever give a VM because that's what it takes to netinstall CentOS 7 these days) without issue on smaller amounts of logs (10-20MB/day).

I'm not a fan of MongoDB myself, but Graylog uses it as not much more than a distributed configuration store so I just begrudgingly accept it.

Would you mind sharing the resources allotted to mongo + elastic search? I'd consider those under the umbrella of Graylog.
I don’t have statistics as it was used in my lab at home which I have recently torn down and begun rebuilding (new servers, new hypervisor and not enough of a crap given to v2v the VM’s I had instead of reinstalling).

From memory though, MongoDB didn’t use much since it mostly stored configuration for Graylog, the Graylog processes themselves took up a couple hundred MB and elasticsearch ate up everything I allowed it to (typical behavior of a database though).

I didn’t bother tweaking any of the settings and just relied on memory pressure of the VM everything ran on to limit resource usage, if you’re keeping lots of history and need fast access to it then obviously you’d need to give ES more RAM to work with.

Thanks for sharing -- I wasn't aware that Graylog only used mongo for the configuration information -- sounds like they're using it as a synchronization option... Wonder if they're working on any alternatives like etcd or even kubernetes-native synchronization options... After a little looking it looks like the answer is "no" (https://community.graylog.org/t/will-mongodb-ever-be-replace...)
It would be amazing to see a Golang based search project startup with a clean codebase as a starting point. I would work on that project for sure!
I could agree with that. This is probably an option to look at: https://github.com/blevesearch/bleve
>Yeah so has anyone actually tried to get ElasticSearch up and running lately?

Actually, yes. I just finished doing our migration from ELK 1.7 to 6.1.3.

We're using installs direct on VMs (rather than docker), and for that we push the configuration/install using Ansible. Their Ansible role[1] works reasonably well for installing Elastic. The Kibana and Logstash configurations were done using regular RPM install from the repo.

[1] https://github.com/elastic/ansible-elasticsearch

Well clearly I didn't try hard enough -- the ansible roles look perfectly reasonable. A quick look through the notes I took and my biggest problems were with:

- Close versions of ES+Kibana not working together

- maxConcurrentShardRequests not being set on Kibana for some reason (so when I got them talking, a silly query parameter was holding everything up)

- I wasted a ton of time due to some files from a failed installation causing an obtuse error -- I think it was a NoShardAvailableActionException

> Well clearly I didn't try hard enough

Well, I had the advantage in that I already knew I wasn't touching it on Docker with a ten foot pole, and we use Ansible, so that made my google search pretty obvious.

> Close versions of ES+Kibana not working together

Yep, that's a pain in the arse, and a trap for inexperienced players still.

Also of note is that the latest versions available through the package repository are not the same as the latest supported by the Ansible role. The ansible role will install a specific version of Elastic, you'll have to be careful to take note and synchronise that with the versions of Logstash and Kibana you install. (This is why we're on 6.1.3)
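A cheap guard against that trap is to check your version pins before you deploy. The sketch below applies the strictest policy, every component on exactly the same version as Elasticsearch, which is what we ended up doing. The `pins` dict and the function names are hypothetical; how you actually collect the versions (Ansible vars, RPM queries, image tags) is up to you.

```python
def parse_version(text):
    """Turn '6.1.3' into (6, 1, 3) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def mismatched(pins):
    """Return the components whose version differs from the Elasticsearch pin."""
    target = parse_version(pins["elasticsearch"])
    return sorted(
        name for name, version in pins.items() if parse_version(version) != target
    )


pins = {"elasticsearch": "6.1.3", "kibana": "6.1.3", "logstash": "6.2.2"}
print(mismatched(pins))
```

Exact equality is stricter than strictly necessary, but it is the one policy that is easy to reason about.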

> - maxConcurrentShardRequests not being set on Kibana for some reason (so when I got them talking, a silly query parameter was holding everything up)

> - I wasted a ton of time due to some files from a failed installation causing an obtuse error -- I think it was a NoShardAvailableActionException

yeah, can't really help with either of these two - I already had a working ELK1.7 install, so for us it was pretty much a case of stand things up, and perform some modifications to templates/queries/etc, and off we went.

> Well, I had the advantage in that I already knew I wasn't touching it on Docker with a ten foot pole, and we use Ansible, so that made my google search pretty obvious.

But the thing is, docker shouldn't actually make things that much harder -- it's just the same old process + namespaces + cgroups. In theory not that much is different, I'm not sure why reality so often doesn't match up.

> Yep, that's a pain in the arse, and a trap for inexperienced players still.

Yeah I got mega trapped. At one point I started walking back versions, trying them in lockstep (to get away from the maxConcurrentShardRequests and the NoShardAvailableActionException issue, before I realized that the latter issue was due to stale data on disk). I started bouncing between docker repos for this stuff -- elastic stopped publishing to dockerhub, but there are images like blacktop/kibana and bitnami/kibana that still exist. Once I try again with a clear head I'm sure it will be easier.

Yeah I actually filed a ticket on the maxConcurrentShardRequests thing -- it seems like a real bug and it's waiting for triage.

I just want to note that I was likely still exasperated from the defeat of not being able to install ES properly (which was likely my own fault as many others have been able to install it just fine), and this post should be taken with a grain of salt.

People are hard at work on ES and they're sharing their progress with the OSS community (the background behind X-Pack aside), and maintaining an OSS version and I'm grateful to them for that.

We rely on it - Elastic 6.2.2, logstash latest - we forgo kibana. But to be fair, we completely repackage this into our own dockers, to make life better.
How do you watch your logs? I couldn't for the life of me find an alternative to Kibana that interoperates well with ES.

Grafana should be possible, but it just seems like no one uses grafana for just plain log watching.

We have our own closed source product - it's a key part of what we're building.
I've spent at least 2 weeks this year trying to get Kubernetes and logstash/elasticsearch to work together with an endpoint. One week on getting the golang client to deal properly with the changing ip on a restarted elastic pod (solved), and one week on logstash doing it (unsolved), with x-pack mucking up things royally.

I wish I was angry, but I'm just defeated and annoyed.

I'm sure that you've seen this already but just in case you haven't:

https://github.com/kubernetes/kubernetes/tree/master/cluster...

I didn't follow it to the letter because I'm stubborn but

I have seen it, it doesn't address the problems with logstash, or the golang elasticsearch client.
Elasticsearch is a mess. It's so full of historical warts.

One major problem is that none of their documentation is actually reference documentation -- if you look for the formal schema (for things like mappings and the query DSL), the list of endpoints and their allowed parameters, the full list of settings etc., you won't find them listed anywhere. For example, do "keyword" mappings support the "enabled" property? What does the "index_options" setting actually do when combined with the "index" setting? Hard to tell any of this without trying them out. Turns out "dynamic_templates" mappings support any combination of the above, and will never complain about invalid combinations, whereas property mappings do. The whole environment variable vs Java property mess that you mention also exists.

They do deserve credit for trying to clean it up. The last few releases have been pretty brutal in how they've been deprecating (and later removing) legacy features and tightening the semantics, the newest and most dramatic of which is the deprecation of multiple type mappings per index. And they've been pretty good at explaining what's going to happen. So the warts are getting fewer. On the flip side, you have to follow the release notes religiously if you want to keep up to speed, since each release now tends to remove a bunch of features or add strict validation where there previously was none, and it becomes harder to upgrade. (If you want an important bug fix that hasn't been backported, things could get expensive.)

It's interesting how the Elasticsearch team let their focus be derailed by this new industry obsession with analytics and logs. It's not something ES was originally built for, and it turned out to be good at it mostly by accident. It's not terrible at it, but Elasticsearch shines the most for its original purpose, as a content index with rich full-text search capabilities. (Areas where it works less well include scaling edge cases such as high-cardinality aggregation buckets and high numbers of unique field names.) I wish they'd rather worked on things like joins and fixing the need for the "nested" object type, which is a ridiculous hack, but since those things aren't needed for analytics/logs, they haven't happened.

(Pet peeve time: One problem that rarely gets mentioned is that Elasticsearch's "eventually consistent" model has two parts. There's the part where replicas may be out of sync with primaries, but there's also the problem that on each individual node, index operations don't become visible to queries right away, not until the next segment "refresh", which by default happens every second. There's no API to ask about the refresh state, so right now the only way for a write followed by a read to be consistent is to ask the write to wait for refresh (or force a refresh), which is the opposite of what you want; the wait should be on read, not write. Given that ES now has a sequence number associated with shards, I'm surprised they haven't tied those numbers together with refreshes so you can ask about which sequence number the index is currently "at".)
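To illustrate the "wait on read" idea, here is a toy in-memory model. It is not Elasticsearch code and not its API. `ToyShard`, `refreshed_seq_no` and the `min_seq_no` argument are all hypothetical names for the behaviour wished for above: a write returns its sequence number immediately, and a read can insist that the searchable view has caught up to that number. For simplicity, the toy forces a refresh where a real system would block until the periodic refresh had happened.

```python
class ToyShard:
    """Toy model: writes get a sequence number but are searchable only after refresh()."""

    def __init__(self):
        self._next_seq_no = 0
        self._buffer = []           # indexed, not yet visible to search
        self._visible = {}          # doc_id -> body, what search can see
        self.refreshed_seq_no = -1  # highest seq_no visible to search

    def index(self, doc_id, body):
        """Accept a write without waiting for a refresh; return its seq_no."""
        seq_no = self._next_seq_no
        self._next_seq_no += 1
        self._buffer.append((seq_no, doc_id, body))
        return seq_no

    def refresh(self):
        """Make all buffered writes visible, like the periodic segment refresh."""
        for seq_no, doc_id, body in self._buffer:
            self._visible[doc_id] = body
            self.refreshed_seq_no = seq_no
        self._buffer.clear()

    def search(self, term, min_seq_no=None):
        """Search visible docs; optionally require visibility up to min_seq_no."""
        if min_seq_no is not None and self.refreshed_seq_no < min_seq_no:
            self.refresh()  # stand-in for "block until the refresh catches up"
        return sorted(
            doc_id for doc_id, body in self._visible.items() if term in body.split()
        )


shard = ToyShard()
seq = shard.index("1", "kibana is down")
print(shard.search("kibana"))                  # plain read may miss the write
print(shard.search("kibana", min_seq_no=seq))  # read that waits sees it
```

The point of the design is that writers never pay for the refresh; only the readers that actually need read-your-writes do.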

So I think Elasticsearch is definitely ripe for disruption. I don't know of anything else that is able to compete at the moment, at least not in a single package; Solr isn't really in the same league.

One of my primary grievances with ES is that all security is a (paid) add-on. TLS, even most basic authentication, doesn’t come out of the box. I really expect that from a modern product. (Yeah, I know search-guard exists, it’s still an add-on)
It’s surprising when someone expects consistent ops from elastic when it’s built on something that has none of it (lucene).

At least solr doesn’t pretend to be something it isn’t (database).

>At least solr doesn’t pretend to be something it isn’t (database).

I worked on implementing ElasticSearch at my company and one of the things they mention clearly is ElasticSearch should NEVER be used as source of truth (primary database).

I think I understand where the OP was coming from.

ElasticSearch + Kibana often gets positioned and used as an open source alternative to Splunk. In that context it is in all respects operating as a primary database since often the source logs are transient.

Performance is also a black box. Super fast on small datasets but at scale.. better hope you can pay for that platinum support contract and be prepared to not use all the fancy features like collapse.
But scale is literally why people use ES right? It expects a cluster almost out of the gate, I felt like I was using it wrong even trying to run only one instance on one machine.
Well, as a NoSQL search engine. But it implements sexy features with heavy performance penalties.
solr is pretty amazing and, for me, more accessible and easier to use
Yeah I copied and pasted the RPM snippets on their install package, installed them and had it up in no time.

This was a KVM VM though, not Docker, but I found the documentation was fine.