by xvinci 789 days ago
I evaluated a few vector databases for our RAG PoCs, with a significant amount of permission-handling metadata on both the vectors and the queries, and execution time was in the range of milliseconds as far as I remember. The performance hit from RAG pales in comparison to the compute time that larger LLMs need, so I am not sure you are on the right track here.
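
To make "metadata for permission handling" concrete, here is a toy sketch of the idea, not what any particular vector database does internally. The `DOCS` index, the `allowed` field, and the `search` function are all hypothetical names made up for illustration: each stored vector carries the set of groups that may read it, the query carries the caller's groups, and anything the caller cannot see is filtered out before ranking by cosine similarity.

```python
import math

# Hypothetical in-memory index: each entry has an embedding plus
# permission metadata (the groups allowed to read it).
DOCS = [
    {"id": "a", "vec": [1.0, 0.0], "allowed": {"eng", "ops"}},
    {"id": "b", "vec": [0.9, 0.1], "allowed": {"hr"}},
    {"id": "c", "vec": [0.0, 1.0], "allowed": {"eng"}},
]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def search(query_vec, user_groups, k=2):
    """Return ids of the top-k documents the caller is allowed to see."""
    # Pre-filter on permission metadata: keep entries sharing a group with the caller.
    visible = [d for d in DOCS if d["allowed"] & user_groups]
    # Rank only the visible entries by similarity to the query.
    ranked = sorted(visible, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["id"] for d in ranked[:k]]
```

Calling `search([1.0, 0.0], {"eng"})` only ever considers documents `a` and `c`; document `b` is never ranked because the caller is not in `hr`. A real system would use an approximate nearest-neighbour index rather than a linear scan, so this says nothing about latency.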