| Sorry for my ignorance, but "memory" refers to the process of using embeddings for QA, right? The process is roughly:

Ingestion:
- Compute embeddings for your documents (from text to an array of numbers)
- Store your documents in a vector DB

Query time:
- Compute the embedding for the query
- Find documents similar to the query using its distance from the other docs in the vector DB
- Construct a prompt with this format:

"""
Answer question using this context:
{DOCUMENTS RETRIEVED}
Question: {question}
Answer:
"""

Is that correct? Now, my question is: can the models be swapped easily, or does that require a complete recalculation of the embeddings (and a new ingestion)? |