by rnd0 902 days ago
>I wouldn't use either of them for RAG

What's RAG?

3 comments

Since the models have a limited context size, you pre-process a bunch of data that might be related to the task (documentation, say) and generate a semantic vector for each piece. Then, when you ask a question, you look up just the few pieces that are semantically most similar to it and load them into the context along with the question. The LLM can then generate an answer grounded in the most relevant pieces of data.
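If it helps, here is a rough sketch of that lookup step in Python. It is only an illustration under some loud assumptions: `embed` is a toy bag-of-words counter standing in for a real embedding model, the three documents are made up, and `retrieve` and `build_prompt` are hypothetical helper names, not part of any library. A real setup would swap in a proper embedding model and a vector index, then send the prompt to an LLM.

```python
import math
import re
from collections import Counter


def embed(text):
    # Toy stand-in for a semantic embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, index, k=2):
    # Rank the pre-processed pieces by similarity to the question, keep the top k.
    q = embed(question)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(question, pieces):
    # Load the retrieved pieces into the context along with the question.
    context = "\n".join("- " + p for p in pieces)
    return "Answer using only this context:\n" + context + "\n\nQuestion: " + question


# Made-up documents, pre-processed once into (text, vector) pairs.
docs = [
    "To reset your password, open Settings and choose Reset Password.",
    "Invoices are emailed on the first day of each month.",
    "The API rate limit is 100 requests per minute per key.",
]
index = [(d, embed(d)) for d in docs]

question = "How do I reset my password?"
top = retrieve(question, index, k=1)
prompt = build_prompt(question, top)  # this string is what you would send to the LLM
print(prompt)
```

Word-count vectors only match on shared words, so this misses paraphrases that a real semantic embedding would catch; the shape of the pipeline (index once, retrieve per question, stuff the context) is the part that carries over.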
Retrieval-augmented generation: basically, giving the model some text passages and asking it questions about that text.
If you want more on RAG with a concrete example: https://neuml.hashnode.dev/build-rag-pipelines-with-txtai