by anon373839
818 days ago
This is interesting. I recently built a search tool that needed to locate documents by keyword or by semantics, so I implemented a hybrid search straight away: BM25 + embeddings (from `gte-base`), with a cross-encoder for reranking. I found that the lexical search was adding nothing; the embeddings alone produced almost identical results for keyword queries. (The re-ranker, however, made a big difference.) Is this unusual?
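For anyone wanting to check the same thing on their own data, here is a minimal sketch of one way to measure whether the lexical leg contributes anything. It assumes you already have two ranked lists of document ids per query (one from BM25, one from the embedding search). Reciprocal rank fusion is used here only as a common way to merge the two lists; the comment above does not say which fusion method was used, and the ids, function names, and `k=60` constant are illustrative assumptions.

```python
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of doc ids; each list adds 1 / (k + rank) per doc."""
    scores: Dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


def overlap_at_k(a: Sequence[str], b: Sequence[str], k: int) -> float:
    """Jaccard overlap between the top-k ids of two rankings (1.0 = identical sets)."""
    top_a, top_b = set(a[:k]), set(b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 1.0


# Hypothetical rankings for a single query (made-up ids, not real results).
bm25_ranking = ["d1", "d2", "d3", "d4"]
dense_ranking = ["d2", "d1", "d3", "d5"]

fused = reciprocal_rank_fusion([bm25_ranking, dense_ranking])
print(fused)
print(overlap_at_k(bm25_ranking, dense_ranking, k=3))
```

If the overlap stays close to 1.0 across a representative set of queries, the two retrievers are surfacing the same candidates and fusion has little to add before the re-ranker runs. Splitting the queries by type before averaging would show whether that holds everywhere or only for some kinds of query.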
|
If you look at the table in section "3. Hybrid Retrieval brings out the best of Keyword and Vector Search" of that article, we show there that the metrics vary significantly by query type.