Hacker News
by vectorrain 705 days ago
What are the advantages and potential challenges of combining dense vectors, sparse vectors, and full-text search in a hybrid retrieval method, as implemented in Infinity v0.2, and how does this approach compare to traditional vector search or other retrieval methods?
1 comment

A notable work demonstrating the effectiveness of hybrid search is Blended RAG from IBM Research (https://arxiv.org/abs/2404.07220), which showed that 3-way hybrid search can achieve SOTA results on multiple evaluation datasets. We've also reproduced the Blended RAG results, as shown in this article. Additionally, Blended RAG combined with a ColBERT-based reranker can give much better results.

The major challenge is how to implement and manage so many indices within a single database. That's why we built this database from scratch. Infinity is essentially an "indexing" database built on top of a columnar store. The executor also needs careful design to fuse these hybrid search approaches effectively.
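
To make the fusion step concrete, here is a minimal sketch of reciprocal rank fusion (RRF), one common way to merge ranked lists from dense, sparse, and full-text retrieval. This is an illustration, not a description of Infinity's executor: the function name, the document IDs, and the constant k=60 (a commonly used default) are all assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc IDs into one ranking using RRF.

    Hypothetical helper for illustration; not Infinity's API.
    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up top-3 results from each of the three retrievers.
dense = ["d3", "d1", "d2"]
sparse = ["d1", "d4", "d3"]
fulltext = ["d1", "d3", "d5"]

fused = reciprocal_rank_fusion([dense, sparse, fulltext])
print(fused)
```

RRF only looks at ranks, so it avoids having to normalize scores that live on different scales (cosine similarity vs. BM25, for example). A weighted score sum or a reranker over the fused candidates are alternatives with different tradeoffs.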