by gibrown 4515 days ago
There are a lot of features thoughtfully combined that make ES great. Top of my list would be:

1. It handles human written language. Any language. The same technology that lets it handle strings written in human language provides a lot of flexibility in handling strings in other applications, particularly when handling logs.

2. It also handles non-string data (numbers, dates, geo) quickly and cleanly.

3. Lucene has an inverted index that has been optimized over many years. ES scales that pretty seamlessly across many servers. All decisions in the project seem to be made around whether a feature can scale to 100s of nodes.
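For anyone unfamiliar with the term, here is a toy sketch of what an inverted index does: it maps each term to the documents that contain it, so a query only touches the postings for its own terms. This is not how Lucene implements it; the function names (`tokenize`, `build_inverted_index`, `search`) and the sample log lines are made up for illustration.

```python
from collections import defaultdict


def tokenize(text):
    # Lowercase and split on whitespace; real analyzers do far more than this.
    return text.lower().split()


def build_inverted_index(docs):
    # Map each term to the sorted list of document ids that contain it.
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search(index, query):
    # AND query: intersect the postings lists of every query term.
    terms = tokenize(query)
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical log lines, keyed by document id.
docs = {
    1: "error connecting to database",
    2: "database backup complete",
    3: "error writing backup",
}
index = build_inverted_index(docs)
print(search(index, "error backup"))
```

Only document 3 contains both "error" and "backup", so that is the only id the query returns. Roughly speaking, scaling this out means splitting the documents into shards, each with its own index, and merging the per-shard results.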

The devs have also been really smart to focus on the "out of the box" experience, with very well thought out defaults.

More on our experience with ES at scale: http://gibrown.wordpress.com/2014/01/09/scaling-elasticsearc...

1 comments

Is this accurate for Elasticsearch, since it is built on Lucene?

https://lucene.apache.org/core/

"index size roughly 20-30% the size of text indexed"

That seems excessive for an index.

Not sure how that's calculated. I assume it is accurate, but the index size is going to depend a lot on what kind of text you have and how it is separated into individual terms (or n-grams or all the other ways you can tokenize and filter to create individual terms).
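A rough illustration of why tokenization matters so much for index size: the same text produces very different numbers of terms depending on whether you index whole words or character n-grams. This is only a sketch with made-up helper names (`word_tokens`, `char_ngrams`), not how Lucene or ES analyzers are implemented, and it counts terms rather than bytes on disk.

```python
def word_tokens(text):
    # One term per whitespace-separated word.
    return text.lower().split()


def char_ngrams(text, n=3):
    # Every run of n consecutive characters inside each word becomes a term.
    terms = []
    for word in text.lower().split():
        for i in range(len(word) - n + 1):
            terms.append(word[i:i + n])
    return terms


text = "searching search searched"
words = word_tokens(text)
grams = char_ngrams(text)

print("word terms:", len(words), "unique:", len(set(words)))
print("trigram terms:", len(grams), "unique:", len(set(grams)))
```

On this sample, trigram tokenization emits several times as many terms as word tokenization, so there are more postings to store. That is the kind of trade-off behind spending extra disk space to get partial-word matching.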

Personally, I think of disk space as cheap, and am far more concerned with having options to improve speed and quality of search results.