| To make an LLM relevant to you, your intuition might be to fine-tune it with your data, but: 1. Training an LLM is expensive. 2. Because training is costly, it’s hard to keep an LLM updated with the latest information. 3. Observability is lacking. When you ask an LLM a question, it’s not obvious how it arrived at its answer. There’s a different approach: Retrieval-Augmented Generation (RAG). Instead of asking the LLM to generate an answer immediately, frameworks like LlamaIndex: 1. retrieve information from your data sources first, 2. add it to your question as context, and 3. ask the LLM to answer based on the enriched prompt. RAG overcomes all three weaknesses of the fine-tuning approach: 1. There’s no training involved, so it’s cheap. 2. Data is fetched only when you ask for it, so it’s always up to date. 3. The framework can show you the retrieved documents, so it’s more trustworthy. (https://lmy.medium.com/why-rag-is-big-aa60282693dc) |
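
The three RAG steps above (retrieve, add context, ask) can be sketched in a few lines of plain Python. This is a minimal illustration, not LlamaIndex's API: the function names `retrieve`, `build_prompt`, and `answer` are hypothetical, the keyword-overlap scoring is a toy stand-in for the embedding-based search a real framework would likely use, and `llm` is assumed to be any callable that takes a prompt string and returns a string.

```python
import re
from typing import Callable, List, Tuple


def tokenize(text: str) -> set:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Step 1: rank documents by how many words they share with the question.

    Toy scoring (assumption): real systems typically compare embeddings.
    """
    question_words = tokenize(question)
    scored = [(len(question_words & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap, then sort best-first (the sort is stable).
    relevant = [pair for pair in scored if pair[0] > 0]
    relevant.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in relevant[:top_k]]


def build_prompt(question: str, context: List[str]) -> str:
    """Step 2: attach the retrieved documents to the question as context."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return f"Answer using only this context:\n{context_block}\n\nQuestion: {question}"


def answer(
    question: str, documents: List[str], llm: Callable[[str], str]
) -> Tuple[str, List[str]]:
    """Step 3: ask the LLM, and return the sources alongside its reply."""
    sources = retrieve(question, documents)
    prompt = build_prompt(question, sources)
    return llm(prompt), sources


if __name__ == "__main__":
    docs = [
        "The office closes at 6 pm on Fridays.",
        "Lunch is served at noon.",
        "Parking is free on weekends.",
    ]
    # Placeholder LLM (assumption): swap in a call to a real model here.
    fake_llm = lambda prompt: f"(model reply to a {len(prompt)}-character prompt)"
    reply, sources = answer("When does the office close on Fridays?", docs, fake_llm)
    print(reply)
    print("Sources:", sources)
```

Returning `sources` together with the reply is what provides the observability described above: the caller can see which documents shaped the answer. Because `documents` is read at question time, updating the data needs no retraining.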