Can you help me understand this? The research appears to be from a few years ago. Can this be used with Claude (for example)? How is it different to the approach many people are taking with vector stores and embeddings?
Other people seem to be suggesting that the user would first retrieve the relevant parts of the book from a vector DB, and then feed those sections along with the question as the prompt. Conceptually that is very similar to RAG (which also uses a vector database), but with RAG the retrieval would happen as part of the inference pipeline and would therefore achieve better performance than the end user emulating it.
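To make the "retrieve first, then prompt" approach concrete, here is a rough sketch of what I understand people to be doing by hand. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and the names `retrieve`, `build_prompt` and `book_sections` are hypothetical. The resulting prompt string would then be pasted or sent to whichever model you use (Claude, for example).

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): word counts instead of a real embedding model."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, sections, k=2):
    """Return the k sections most similar to the question (the 'vector DB' lookup)."""
    q_vec = embed(question)
    ranked = sorted(sections, key=lambda s: cosine(q_vec, embed(s)), reverse=True)
    return ranked[:k]


def build_prompt(question, sections, k=2):
    """Put the retrieved sections and the question together into one prompt."""
    context = "\n\n".join(retrieve(question, sections, k))
    return (
        "Use the excerpts below to answer the question.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical book excerpts, standing in for chunks stored in a vector DB.
book_sections = [
    "The whale was first sighted off the coast at dawn.",
    "Chapter two describes the history of the harbour town.",
    "The captain swore revenge on the whale that took his leg.",
]

prompt = build_prompt("Why does the captain hate the whale?", book_sections, k=2)
print(prompt)
```

In this manual version the model only ever sees the final prompt text; the retrieval step happens entirely outside it, which is the part that RAG, as I understand it, moves inside the pipeline.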