by raunakchowdhuri
1015 days ago
For sure! When you send an OpenAI request, a secondary GPT-3.5 call is made after a delay (to make sure the user isn't still chatting in the same session) to "autosave" the result. This call gets the information from the current chat session as well as similar existing entries in the vector database. Its structured output is used to either insert a new memory or update an existing one in the vector database. At query time, we search the vector database and quickly insert relevant context into the system prompt. I like to consider this approach "dynamic" retrieval-augmented generation, as the vector database is constantly changing as conversations occur.
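To make the flow concrete, here is a rough Python sketch of the insert-or-update loop and the query-time lookup. It is not the actual implementation: `MemoryStore`, `autosave_decision`, `autosave`, and `build_system_prompt` are hypothetical names, the bag-of-words `embed` stands in for a real embedding model, and `autosave_decision` is a simple similarity threshold standing in for the structured output of the secondary GPT-3.5 call. The delay before autosaving is left out.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption); a real system would call an embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


@dataclass
class MemoryStore:
    """Hypothetical in-memory stand-in for the vector database."""
    memories: list[str] = field(default_factory=list)

    def search(self, query: str, k: int = 3) -> list[tuple[int, float]]:
        """Return up to k (index, score) pairs with non-zero similarity, best first."""
        q = embed(query)
        scored = [(i, cosine(q, embed(m))) for i, m in enumerate(self.memories)]
        scored = [s for s in scored if s[1] > 0]
        return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

    def apply(self, decision: dict) -> None:
        """Apply a structured insert/update decision to the store."""
        if decision["action"] == "update":
            self.memories[decision["index"]] = decision["text"]
        else:
            self.memories.append(decision["text"])


def autosave_decision(summary: str, similar: list[tuple[int, float]], threshold: float = 0.5) -> dict:
    """Stand-in for the secondary LLM call: the real call would see the session
    plus similar entries and return structured output; here a threshold decides."""
    if similar and similar[0][1] >= threshold:
        return {"action": "update", "index": similar[0][0], "text": summary}
    return {"action": "insert", "text": summary}


def autosave(store: MemoryStore, summary: str) -> dict:
    """Run after the session goes quiet: look up similar memories, then insert or update."""
    decision = autosave_decision(summary, store.search(summary))
    store.apply(decision)
    return decision


def build_system_prompt(store: MemoryStore, user_message: str,
                        base: str = "You are a helpful assistant.") -> str:
    """Query time: search the store and splice relevant memories into the system prompt."""
    hits = store.search(user_message)
    if not hits:
        return base
    context = "\n".join(f"- {store.memories[i]}" for i, _ in hits)
    return f"{base}\nRelevant memories:\n{context}"
```

For example, calling `autosave` first with "user likes hiking in the alps" and then with "user likes hiking in the alps and skiing" would update the single existing memory rather than add a second one, and `build_system_prompt(store, "recommend a hiking trip")` would then pull that memory into the prompt.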