| The sibling comments seem to be correct in their technical explanations, but miss the meaning I'm getting from your question. My understanding is you want to know "are vector DBs compatible with specific LLMs, or are we stuck with a specific LLM if we want to do RAG once we've adopted a specific vector store?" And the answer to that is that the LLM never sees the vectors from your DB. Your LLM only sees what you submit as context (ie the "system" and "user" prompts in chat-based models). The way RAG works is: 1 - end-user submits a query 2 - this query is embedded (with the same model that was used to compile the vector store) and compared (in the vector store) with existing content, to retrieve relevant chunks of data 3 - and then this data (in the form of text segments) is passed to the LLM along with the initial query. So, in a sense you're "locked in" in the sense that you need to use the same embedding model for storage and for retrieval. But you can definitely swap out the LLM for any other LLM without reindexing. An easy way to try this behavior out as a layperson is to use AnythingLLM which is an open-source desktop client, that allows you to embed your own documents and use RAG locally with open-weight models or swap out any of the LLM APIs. |