by buckbova 4515 days ago
Is this accurate for Elasticsearch, since it is built on Lucene?

https://lucene.apache.org/core/

"index size roughly 20-30% the size of text indexed"

That seems excessive for an index.

1 comment

Not sure how that's calculated. I assume it is accurate, but the index size is going to depend a lot on what kind of text you have and how it is separated into individual terms (words, n-grams, or whatever else your tokenizers and filters produce).
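
To make the tokenization point concrete, here is a rough sketch in Python. It is a toy inverted index, not Lucene's actual on-disk format, and the helper names (word_tokens, char_ngrams, build_index, index_stats) and the sample documents are made up for illustration. It only counts unique terms and postings as a crude proxy for index size.

```python
from collections import defaultdict


def word_tokens(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def char_ngrams(text, n=3):
    """Return overlapping character n-grams for each word.

    Words shorter than n are kept whole (an assumption of this sketch).
    """
    grams = []
    for word in text.lower().split():
        if len(word) < n:
            grams.append(word)
        else:
            grams.extend(word[i:i + n] for i in range(len(word) - n + 1))
    return grams


def build_index(docs, tokenize):
    """Map each term to the sorted list of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in tokenize(text):
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def index_stats(index):
    """Return (unique terms, total postings) for a toy index."""
    return len(index), sum(len(ids) for ids in index.values())


# Hypothetical sample documents.
docs = [
    "the quick brown fox",
    "the lazy brown dog",
    "quick search results",
]

word_index = build_index(docs, word_tokens)
gram_index = build_index(docs, char_ngrams)

print("words:   ", index_stats(word_index))
print("trigrams:", index_stats(gram_index))
```

On this tiny sample the trigram index ends up with more than twice as many terms and postings as the word index for the same text. A real engine also compresses postings and may store positions, norms, and the original fields, so treat this as an illustration of the trend and not as a size estimate.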

Personally, I think of disk space as cheap, and am far more concerned with having options to improve speed and quality of search results.