|
I'm skeptical about some vector databases these days, but your article misses a few important points when it comes to LLMs. 1. To use LLMs effectively, you often need to generate and store more than one vector per document. 10 million vectors may only be 100,000 documents. This may still be enough for a lot of small problems.
2. pgvector currently has significant recall/latency limitations because its underlying ANN index uses IVF (I'm currently working on adding HNSW-IVF and HNSW support to pgvector). In some cases, even Elasticsearch can have issues at scale (the problem comes from the constraint of one ANN index per index segment, and from segment immutability).
3. "Pre-calculate" seems like the wrong word to describe HNSW graph construction. I think an important point you miss for LLM + vector DB use cases is that so much of their complexity cannot be captured by the vector DB alone (e.g. Pinecone, Chroma, Qdrant, etc.). I think there are some more end-to-end systems, at least in search, attempting to solve this (e.g. Marqo, maybe Weaviate). Overall, I like the article. It makes a worthwhile claim and counterpoint to all the vector DB hype. |
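To make point 1 concrete, here is a rough back-of-the-envelope sketch of how chunking multiplies the vector count. The chunk size, overlap, and document length are made-up numbers chosen only to reproduce the 100-vectors-per-document ratio implied above, and both function names are hypothetical helpers, not part of any library.

```python
import math


def chunks_per_document(doc_tokens: int, chunk_tokens: int, overlap_tokens: int) -> int:
    """Count the overlapping fixed-size chunks needed to cover one document.

    Assumes a simple sliding window: each chunk holds `chunk_tokens` tokens and
    consecutive chunks share `overlap_tokens` tokens. One vector per chunk.
    """
    if overlap_tokens >= chunk_tokens:
        raise ValueError("overlap must be smaller than the chunk size")
    if doc_tokens <= chunk_tokens:
        return 1
    stride = chunk_tokens - overlap_tokens
    # One chunk covers the first `chunk_tokens`; each further chunk adds `stride`.
    return 1 + math.ceil((doc_tokens - chunk_tokens) / stride)


def documents_for_budget(vector_budget: int, vectors_per_doc: int) -> int:
    """How many whole documents fit in a fixed vector budget."""
    return vector_budget // vectors_per_doc


# Hypothetical workload: 20,000-token documents, 256-token chunks, 56-token overlap.
vectors_per_doc = chunks_per_document(20_000, 256, 56)
docs = documents_for_budget(10_000_000, vectors_per_doc)
print(vectors_per_doc, docs)
```

With these assumed numbers, each document turns into 100 vectors, so a 10 million vector index holds only 100,000 documents. Real ratios depend heavily on document length and on the chunking strategy.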
Disclaimer: I work for Qdrant, and we believe a database should be just a database. I remember attempting to move logic into the database layer, and coupling neural encoders into the vector database sounds like the same thing.