Hacker News
by ganeshkrishnan 3569 days ago
Neither indexing nor querying does the ranking. Ranking is done after indexing and can use tf-idf, PageRank, or a combination of the two. Once each document's similarity to the query has been calculated, for example with a vector space model, the documents are ranked by PageRank.

What OP is saying is that instead of PageRank we could use other ranking methods, which is surely plausible.
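
To make the two stages concrete, here is a rough sketch, not how any particular engine does it. It assumes documents are already tokenized, that `pagerank` is a precomputed list of link scores in [0, 1], and that the two signals are blended linearly with a made-up weight `alpha`. All the function names are my own.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a tf-idf vector (term -> weight) for each tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]
    return vectors, idf

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rank(query, docs, pagerank, alpha=0.7):
    """Return document indices, best first.

    alpha is a hypothetical blend weight: 1.0 means similarity only,
    0.0 means the precomputed link score only.
    """
    vectors, idf = tfidf_vectors(docs)
    q = {t: tf * idf.get(t, 0.0) for t, tf in Counter(query).items()}
    scores = [alpha * cosine(q, v) + (1 - alpha) * pagerank[i]
              for i, v in enumerate(vectors)]
    return sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
```

Swapping PageRank for some other method just means passing a different list of per-document scores, which is the plausible part of OP's suggestion.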

1 comment

Sure, but my point was: what good is a new ranking method when all you have at your disposal is the same set of metrics as the method you are trying to replace? A new ranking method will quite often mean adding new metrics. For example, when Lucene went from tf-idf to BM25, they added lots of new metrics to cater for the new algorithm.
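
As an illustration of why a new scoring function can need extra statistics, here is a minimal sketch of the textbook Okapi BM25 formula. It is not Lucene's code. It assumes a non-empty list of tokenized documents and uses the commonly quoted defaults `k1=1.2` and `b=0.75`. Note that it needs each document's length and the collection's average length in addition to term and document frequencies.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every tokenized document against the query terms with BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n      # average document length
    df = Counter(t for d in docs for t in set(d))  # document frequency per term
    scores = []
    for d in docs:
        tf = Counter(d)
        s = 0.0
        for t in query:
            if t not in tf:
                continue
            # Smoothed idf; the "1 +" keeps it positive for common terms.
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            # Length normalization: longer-than-average docs are damped.
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            s += idf * tf[t] * (k1 + 1) / norm
        scores.append(s)
    return scores
```

With the same term count, a short document scores higher than a long one, which is exactly the part that depends on the length statistics.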
Did Lucene go from tf-idf to Okapi BM25? Surprising. I need to read up on it.

We use tf-idf too, but augment it with PageRank and clustering. That gets us more relevant docs.