by softwaredoug
1670 days ago
As the author of Relevant Search and a contributor to AI-Powered Search, I endorse this :) Relevance is really subjective and domain-specific, and it requires an intense amount of measurement and testing plus many different ranking signals. Lucene is a toolbox for crafting many of these signals.
For example, he says:
> However, as you can see, this vector space model does not explicitly require a higher ranking document to contain more query terms than a lower ranking one.
Well, the way you get a higher similarity in a vector-space model is by matching more terms. The caveat is that IDF and field length also make you consider a term's specificity. So if you search for 'luke skywalker', you care more about the 'skywalker' match than the 'luke' match. But a match on BOTH 'luke' and 'skywalker' would score higher (field lengths being constant).
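
To make that concrete, here is a rough toy sketch in Python. It is not Lucene's actual scoring formula: the corpus, the document names, and the `idf` and `score` helpers are all made up for illustration, the IDF is a simple smoothed log, and every document has three tokens so field length stays constant.

```python
import math

# Hypothetical toy corpus: every "field" has exactly three tokens,
# so field length is constant across documents.
DOCS = {
    "both": ["luke", "skywalker", "jedi"],
    "skywalker_only": ["anakin", "skywalker", "jedi"],
    "luke_only": ["luke", "cage", "hero"],
    "luke_2": ["luke", "perry", "actor"],
    "neither": ["han", "solo", "pilot"],
}


def idf(term, docs):
    """Smoothed inverse document frequency: rarer terms get a larger weight."""
    df = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(1 + len(docs) / df) if df else 0.0


def score(query, tokens, docs):
    """Sum of term frequency times IDF over the query terms (a bare TF-IDF dot product)."""
    return sum(tokens.count(term) * idf(term, docs) for term in query)


QUERY = ["luke", "skywalker"]
SCORES = {name: score(QUERY, tokens, DOCS) for name, tokens in DOCS.items()}
RANKING = sorted(SCORES, key=SCORES.get, reverse=True)

for name in RANKING:
    print(f"{name:15s} {SCORES[name]:.3f}")
```

In this toy corpus 'luke' appears in three documents and 'skywalker' in two, so 'skywalker' carries the larger IDF weight. The document matching only 'skywalker' outranks the ones matching only 'luke', and the document matching both terms ranks first.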