Hacker News
by alecco 4515 days ago
Why is it awesome? Why does "it just work"? Is it just a MongoDB-style document store over Hadoop+Lucene?

What makes it so special to have hundreds of votes and tweets all around within 2 hours?

I don't understand. (I'm a DB engine engineer.)

2 comments

There are a lot of features thoughtfully combined that make ES great. Top of my list would be:

1. It handles human-written language. Any language. The same technology that lets it handle strings written in human language provides a lot of flexibility in handling strings in other applications, particularly when handling logs.

2. It also handles non-string data (numbers, dates, geo) quickly and cleanly.

3. Lucene has an inverted index that has been optimized over many years. ES scales that pretty seamlessly across many servers. All decisions in the project seem to be made around whether a feature can scale to 100s of nodes.
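For anyone unfamiliar with the term, here is a toy sketch of what an inverted index does: it maps each term to the documents containing it, so a query becomes a set intersection instead of a scan. This is only an illustration in plain Python, not how Lucene or ES implement it; the function names and sample documents are made up for the example.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased term to the sorted list of doc ids containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def search_all(index, query):
    """Return ids of docs that contain every term in the query (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical log lines, keyed by document id.
docs = {
    1: "error connecting to database",
    2: "database backup finished",
    3: "error reading backup file",
}
index = build_inverted_index(docs)
print(search_all(index, "error backup"))  # [3]
```

A real engine adds analysis, scoring, compression and sharding on top, but the term-to-documents lookup is the core idea.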

The devs have also been really smart to focus on the "out-of-the-box experience". The defaults are very well thought out.

More on our experience with ES at scale: http://gibrown.wordpress.com/2014/01/09/scaling-elasticsearc...

Is this accurate for Elasticsearch, since it is built on Lucene?

https://lucene.apache.org/core/

"index size roughly 20-30% the size of text indexed"

That seems excessive for an index.

Not sure how that's calculated. I assume it is accurate, but the index size is going to depend a lot on what kind of text you have and how it is separated into individual terms (or n-grams or all the other ways you can tokenize and filter to create individual terms).
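To make the tokenization point concrete, here is a rough sketch comparing how many terms the same text produces under plain word splitting versus character trigrams. It is a toy in plain Python with made-up helper names, not Lucene's analyzers, and term count is only a loose proxy for on-disk index size.

```python
def word_terms(text):
    """Split on whitespace and lowercase: one term per word."""
    return text.lower().split()


def char_ngrams(text, n=3):
    """All character n-grams of each word (words no longer than n are kept whole)."""
    grams = []
    for word in text.lower().split():
        if len(word) <= n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


text = "search engines index searchable text"
words = word_terms(text)
grams = char_ngrams(text, 3)

# More distinct terms and more postings generally mean a larger index.
print("word terms:", len(words), "distinct:", len(set(words)))
print("trigram terms:", len(grams), "distinct:", len(set(grams)))
```

The same input yields several times more terms once it is cut into n-grams, which is why a single size ratio is hard to quote without knowing the analysis chain.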

Personally, I think of disk space as cheap, and am far more concerned with having options to improve speed and quality of search results.

Distributed; full-text search (many, many options); highlighter; compression; geo queries; searching across multiple indexes (databases) and types (tables); distributed aggregation; distributed faceting; a very fast in-memory suggester; inverse query (the percolator), where you register queries (like rows) and then test whether documents match those queries;

and many other things.
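The inverse-query idea is easier to see in code. Below is a minimal sketch, assuming a query is just a set of terms that must all appear in the document; the `ToyPercolator` class and its method names are hypothetical and are not the ES percolator API.

```python
class ToyPercolator:
    """Stores queries up front, then reports which ones a new document matches."""

    def __init__(self):
        self.queries = {}  # query id -> set of required terms

    def register(self, query_id, query_text):
        """Save a query; here a query is just a set of required terms."""
        self.queries[query_id] = set(query_text.lower().split())

    def percolate(self, document_text):
        """Return ids of registered queries whose terms all occur in the document."""
        doc_terms = set(document_text.lower().split())
        return sorted(
            qid for qid, terms in self.queries.items() if terms <= doc_terms
        )


p = ToyPercolator()
p.register("disk-alerts", "disk full")
p.register("login-failures", "login failed")
print(p.percolate("warning disk almost full on node7"))  # ['disk-alerts']
```

The roles are flipped compared with normal search: the queries are the stored data and the document is the input, which is what makes it useful for alerting.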