by RyanZAG
4515 days ago
Complexity. Having two copies of the data means more dev time, more resources to shift the data around, etc. Having just one data store that can also handle all your searching is like the holy grail. As you say, not sure if Solr/ES/Lucene are there yet, but they're definitely very close. There is no theoretical barrier either: it just comes down to closing bugs, and the ES/Lucene team are very good at closing bugs. EDIT: I don't think MongoDB is there yet either. Postgres and ES each have definite benefits and drawbacks, with the balance tipping heavily towards Postgres for structured, write-heavy data. But ES versus MongoDB? I think MongoDB falls a bit short there.
|
For example, Postgres lets you reason about integrity, atomicity and transactional boundaries, and about whether things are really safely stored with synchronous replication. If Postgres returns after a commit, I trust it. However, synchronous replication requires me to have two servers working, which makes the setup harder to keep highly available.
ZooKeeper, on the other hand, I can rely on being available. But that's not really something you want to be putting lots of load on, nor try to do anything but trivial "queries". And the more servers you add, the slower writes get.
I don't trust Elasticsearch enough for those tasks, yet I wouldn't want to do searches in Postgres even though it can do them (yep, I'm familiar with tsearch). Elasticsearch is simple to scale out and awesome for searching.
Logs and metrics we shove straight into Elasticsearch, however. Other things go from ZooKeeper to Postgres and then to Elasticsearch, or from just Postgres to Elasticsearch.
Separate tools for separate jobs. I'm one of the co-founders of www.found.no, one of the hosted Elasticsearch providers. We absolutely love Elasticsearch and find new use cases for it all the time, but it's not going to be the one store to rule them all, at least not very soon.
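
To make the "Postgres first, then Elasticsearch" flow concrete, here is a minimal sketch of the pattern, not anyone's actual setup. `PrimaryStore` and `SearchIndex` are hypothetical in-memory stand-ins for the system of record and the search engine; no real Postgres or Elasticsearch client calls are shown. The idea is that the write is committed to the trusted store first, the search index is treated as a derived copy, and the index can be rebuilt from the store at any time.

```python
from collections import defaultdict


class PrimaryStore:
    """Hypothetical stand-in for the system of record (Postgres in the comment)."""

    def __init__(self):
        self.rows = {}

    def commit(self, doc_id, text):
        # In a real setup this would be a transaction against the database.
        self.rows[doc_id] = text
        return doc_id


class SearchIndex:
    """Hypothetical stand-in for the search engine: a toy inverted index."""

    def __init__(self):
        self.postings = defaultdict(set)

    def index(self, doc_id, text):
        # Drop stale postings for this document before re-indexing it.
        for ids in self.postings.values():
            ids.discard(doc_id)
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()))


def save(primary, index, doc_id, text):
    """Write to the trusted store first, then update the derived index."""
    primary.commit(doc_id, text)
    index.index(doc_id, text)


def rebuild(primary):
    """Recreate the index from the system of record, e.g. after losing it."""
    index = SearchIndex()
    for doc_id, text in primary.rows.items():
        index.index(doc_id, text)
    return index
```

Because the index holds nothing that cannot be regenerated from the primary store, losing or corrupting it costs a rebuild rather than data. That is the trade being made against the extra complexity of keeping two copies.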