If you are using OpenAI, the new Assistants API looks like itnwill handle internally what you used to handle externally with a vector DB for RAG (and for some things, GPT-4-Turbo’s 128k context window will make it unnecessary entirely.) There are some other uses for Vector DBs than RAG for LLMs, and there are reasons people might use non-OpenAI LLMs with RAG, so there is still a role for VectorDBs, but it shrunk a lot with this.
It’s more reliable than chatpdfs that relies on vector search. With vector db all you are doing is doing a fuzzy search and then sending in that relevant portion near that text and send it to a LLM model as part of a prompt. It misses info.
OpenAI charges for all those input tokens. If an app requires squeezing 350 pages of content in every request is going to cost more. Vector DB still relevant for cost and speed.