Hacker News
by pyth0 1131 days ago
Other people seem to be suggesting that the user would first retrieve the relevant parts of the book from a vector DB, and then feed those sections along with the question as the prompt. Conceptually that is very similar (and it too uses a vector database), but with RAG the retrieval would happen as part of the inference pipeline and would therefore achieve better performance than the end user emulating it.
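For anyone who wants to see the "user does the retrieval by hand" version concretely, here is a rough sketch. Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector DB, and `retrieve` / `build_prompt` are hypothetical helper names, not any library's API.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question, passages, k=2):
    # Rank every passage against the question and keep the top k.
    q = embed(question)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]

def build_prompt(question, passages, k=2):
    # Paste the retrieved passages above the question, as the end user would.
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

# Hypothetical "book" split into passages.
book = [
    "The whale was first sighted near the equator.",
    "Ishmael signs aboard the Pequod in Nantucket.",
    "Chowder is served at the Try Pots inn.",
]
question = "Where does Ishmael sign aboard the Pequod?"
prompt = build_prompt(question, book, k=1)
print(prompt)
```

The resulting `prompt` string is what you would paste into the model. A RAG setup does the same lookup, just inside the pipeline instead of in the user's hands.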
1 comment

Yep, but your retrieval from the vector DB becomes your relevancy bottleneck.
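One way to put a number on that bottleneck, as a sketch only: measure how many of the passages that actually answer the question make it into the top k the retriever returns. The passage ids and the ranking below are made up for illustration, and `recall_at_k` is a hypothetical helper, not something from a particular library.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    # Fraction of the relevant passages that appear in the top k results.
    # Assumes relevant_ids is non-empty.
    top = set(ranked_ids[:k])
    return len(top & set(relevant_ids)) / len(relevant_ids)

# Hypothetical retriever output, best match first.
ranked = ["p7", "p2", "p9", "p4", "p1"]
# Hypothetical ground truth: the passages that really contain the answer.
relevant = ["p4", "p1"]

print(recall_at_k(ranked, relevant, 3))  # 0.0
print(recall_at_k(ranked, relevant, 5))  # 1.0
```

With k=3 the model never sees either relevant passage, so no amount of prompt quality downstream can recover the answer.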