Hybrid search (BM25/vectors/RRF) barely improved over pure semantic
1 point
by pjmalandrino
63 days ago
My setup: ~600 technical docs (50 pages on average, lots of schemas/diagrams), chunked and embedded with BGE-M3, with pgvector as the vector DB. Semantic retrieval was OK but not great on our technical docs. I read everywhere that hybrid search with RRF was supposed to be the next level.
I implemented it: BM25 + vector + RRF fusion. Result: almost no improvement. Like, negligible. Am I missing something obvious? Is hybrid overhyped on technical docs with lots of schemas/tables, or is my setup just broken?
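For reference, here is a minimal sketch of what the fusion step computes, assuming the standard RRF formula (score = sum over lists of 1 / (k + rank), with the commonly used k = 60). The function names, chunk ids, and rankings are hypothetical and are not taken from the actual pipeline. The overlap helper is one possible sanity check: if the BM25 and vector top-k lists are nearly identical, RRF has little room to change the final order.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of ids with Reciprocal Rank Fusion.

    rankings: list of lists, each ordered best-first.
    k: smoothing constant (60 is a common default, assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


def overlap_at_k(a, b, k):
    """Fraction of the top-k ids shared by two rankings."""
    return len(set(a[:k]) & set(b[:k])) / k


# Hypothetical chunk ids, for illustration only.
bm25_top = ["c7", "c2", "c9"]
vector_top = ["c2", "c4", "c7"]

fused = rrf_fuse([bm25_top, vector_top])
shared = overlap_at_k(bm25_top, vector_top, k=3)
print(fused, shared)
```

If the overlap is close to 1.0 on real queries, the two retrievers are surfacing the same chunks and fusion cannot add much. If it is low but results still do not improve, the BM25 side (tokenization of schema and table text, for example) may be worth inspecting on its own.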