by mark_l_watson
124 days ago
I have been using vector-based RAG for about two years now, and I am not knocking the tech. But last year I started experimenting with going way back in time and, in parallel, trying BM25 search (or hybrid BM25 plus vector search). So: not even a very good example use case of LLMs; the tech is not always applicable. EDIT: I am on a mobile device and don't have a reference handy, but there have been good papers on RAG scaling issues. Basically, the embedding space gets saturated (too many document chunks cluster in small areas of the embedding space), if my memory is correct.
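
For anyone wondering what "hybrid BM25 and vector" might look like in practice, here is a minimal sketch, not what I actually run. It assumes whitespace tokenization, the common Okapi BM25 defaults (`k1=1.5`, `b=0.75`), and reciprocal rank fusion (RRF) as the way to merge the two rankings; RRF is my choice for illustration, and other fusion schemes exist. The names `bm25_scores` and `rrf` are made up for this example, and the vector ranking is assumed to come from whatever embedding search you already have.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase + whitespace split; real systems use a proper analyzer.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one Okapi BM25 score per document (same order as docs)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n  # average document length

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for toks in tokenized:
        df.update(set(toks))

    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            length_norm = k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


def rrf(rankings, k=60):
    """Reciprocal rank fusion over several rankings (lists of doc ids, best first)."""
    fused = Counter()
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return [d for d, _ in sorted(fused.items(), key=lambda x: (-x[1], x[0]))]
```

Usage would be roughly: compute `bm25_scores(query, docs)`, sort document indices by score to get a keyword ranking, then call `rrf([keyword_ranking, vector_ranking])`. Since RRF only looks at ranks, you never have to put BM25 scores and cosine similarities on the same scale.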