by throw_me_up 1567 days ago
Theoretically ES can scale that big, but it isn't easy and takes a large team to manage, just like at Google.

ES is a toolbox, not something you can just use off the shelf. I think Elastic likes to make people think it is an off-the-shelf solution, but the defaults aren't great for many use cases. ES can include PageRank, ML, BM25, etc. in the ranking calculation, but it takes search relevance expertise to make it all work together for any particular use case. And different use cases will need different ranking equations.

1 comment

Yes, of course. What I meant is: when I query a phrase, it may appear in a million webpages, yet I get back a handful of them sorted by relevance. Surely that is a combination of two things: deep crawling that gathers data from most websites, and a good algorithm that ranks the results by relevance using a variety of signals. ES has nothing to do with crawling; that part is up to whoever is using ES. But for the content fed to ES, how much does it allow customizing signals and combining them into custom relevance logic, and how much does it allow modifying the indexing logic so that, say, I can use a combination of BM25 and PageRank?

It is very customizable, but signals like PageRank are best calculated outside ES and included as a field in your document.
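
To make that concrete, here is a rough sketch of the "compute it outside, store it as a field" approach. It is only an illustration under assumptions: the link graph, the `body` and `pagerank` field names, and the `factor` value are all made up for the example, and the PageRank here is a plain power iteration, not whatever a production crawler would use. The query body uses Elasticsearch's `function_score` query with `field_value_factor`, built as a plain dict so nothing here needs a running cluster.

```python
import json


def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no outgoing links): spread its rank evenly.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


def build_query(text):
    """Query body that adds a dampened pagerank term to the text score."""
    return {
        "query": {
            "function_score": {
                # Text relevance; BM25 is the default similarity in ES.
                "query": {"match": {"body": text}},
                # Precomputed signal read from the document itself.
                "field_value_factor": {
                    "field": "pagerank",   # hypothetical field name
                    "modifier": "log1p",
                    "factor": 10,          # arbitrary, needs tuning
                    "missing": 0,
                },
                "boost_mode": "sum",
            }
        }
    }


# Hypothetical link graph standing in for real crawl data.
links = {
    "a.example": ["b.example", "c.example"],
    "b.example": ["c.example"],
    "c.example": ["a.example"],
    "d.example": ["c.example"],
}
ranks = pagerank(links)

# Documents as they would be indexed, with the score stored as a field.
docs = [{"url": url, "body": "placeholder text", "pagerank": ranks[url]} for url in ranks]

query = build_query("search relevance")
print(json.dumps(query, indent=2))
```

Whether to add or multiply the signal (`boost_mode`), and how hard to dampen it, is exactly the kind of tuning that depends on the use case.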