by spencerchubb
747 days ago
I'm curious, in your benchmark, what's the difference between BM25+Embedding and Embedding+BM25? And what do you use to make the embedding? If you make the embedding with an LLM, it should work for any language the LLM is trained on.
For my tests, I used Ada-002 for the embeddings. The data was short news articles, with no chunking and no preprocessing. The query is embedded directly and compared against the article embeddings.
Of course, both approaches can be improved. The benchmark is only meant to illustrate what you might expect from hybrid search.
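For readers who want to try something similar, here is a rough sketch of hybrid search. It is not the code used for the tests above. It assumes reciprocal rank fusion to combine the BM25 and embedding rankings, which is an assumption, since the thread does not say how the two were combined. The vectors are made-up toy values standing in for Ada-002 embeddings, and all function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document against the query with plain Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def ranking(scores):
    """Document indices ordered from best to worst score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def rrf(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings."""
    fused = Counter()
    for r in rankings:
        for rank, doc in enumerate(r, start=1):
            fused[doc] += 1.0 / (k + rank)
    return sorted(fused, key=lambda doc: fused[doc], reverse=True)


def hybrid_search(query, query_vec, docs, doc_vecs):
    """Fuse the BM25 ranking with the embedding (cosine) ranking."""
    lexical = ranking(bm25_scores(query, docs))
    semantic = ranking([cosine(query_vec, v) for v in doc_vecs])
    return rrf([lexical, semantic])


# Whole articles, no chunking and no preprocessing beyond lowercasing.
docs = [
    "central bank raises interest rates again",
    "local team wins the football championship",
    "new vaccine approved by health agency",
]
# Toy vectors (assumption): real ones would come from an embedding model.
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]
query = "interest rates"
query_vec = [0.8, 0.2, 0.1]

print(hybrid_search(query, query_vec, docs, doc_vecs))
```

Rank fusion avoids having to normalize BM25 scores and cosine similarities onto a common scale. A weighted sum of normalized scores is another common choice.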