
Sorry for my ignorance, but does "memory" here refer to the process of using embeddings for QA?

The process roughly is:

Ingestion:

- Compute embeddings for your documents (from text to an array of numbers)

- Store your documents and their embeddings in a vector DB

Query time:

- Compute the embedding for the query

- Find the documents most similar to the query by measuring the distance between its embedding and the document embeddings in the vector DB

- Construct a prompt with this format:

""" Answer question using this context: {DOCUMENTS RETRIEVED}

Question: {question}
Answer: """
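For concreteness, the steps above could be sketched roughly like this in Python. It is only a toy under stated assumptions: `embed` is a hashed bag-of-words stand-in for a real embedding model, `VectorStore` is a hypothetical in-memory list rather than an actual vector DB, and `build_prompt` simply fills in the template. None of these names come from a real library.

```python
import hashlib
import math

DIM = 4096  # assumed vector size for the toy embedder


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalized."""
    vec = [0.0] * DIM
    for word in text.lower().split():
        # Map each word to a fixed bucket so the same word always lands in the same slot.
        idx = int(hashlib.sha256(word.encode("utf-8")).hexdigest(), 16) % DIM
        vec[idx] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; inputs are already unit length, so this is a dot product."""
    return sum(x * y for x, y in zip(a, b))


class VectorStore:
    """Hypothetical in-memory 'vector DB': a list of (embedding, document) pairs."""

    def __init__(self) -> None:
        self.rows: list[tuple[list[float], str]] = []

    def add(self, doc: str) -> None:
        # Ingestion: embed the document and store both the vector and the text.
        self.rows.append((embed(doc), doc))

    def search(self, query: str, k: int = 2) -> list[str]:
        # Query time: embed the query, then rank stored documents by similarity.
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [doc for _, doc in ranked[:k]]


def build_prompt(question: str, docs: list[str]) -> str:
    """Fill the prompt template with the retrieved documents and the question."""
    context = "\n".join(docs)
    return (
        f"Answer question using this context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    store = VectorStore()
    store.add("paris is the capital of france")
    store.add("python is a programming language")
    question = "what is the capital of france"
    print(build_prompt(question, store.search(question, k=1)))
```

In this sketch the query goes through the same `embed` function that was used at ingestion, because the distance comparison only makes sense when both vectors come from the same embedding space.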

Is that correct? Now, my question is: can the models be swapped easily, or does that require a complete recalculation of the embeddings (and a new ingestion)?