by tslmy 802 days ago

To make an LLM relevant to you, your intuition might be to fine-tune it with your data, but:

1. Training an LLM is expensive.

2. Because training is so costly, it’s hard to keep an LLM updated with the latest information.

3. Observability is lacking. When you ask an LLM a question, it’s not obvious how the LLM arrived at its answer.

There’s a different approach: Retrieval-Augmented Generation (RAG). Instead of asking the LLM to generate an answer immediately, a framework like LlamaIndex:

1. retrieves information from your data sources first,

2. adds it to your question as context, and

3. asks the LLM to answer based on the enriched prompt.
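
As a rough illustration of the three steps above, here is a minimal Python sketch. It does not use LlamaIndex's API: the word-overlap retriever, the `DOCUMENTS` list, and the `fake_llm` stand-in are all assumptions made up for this example. Real frameworks typically retrieve with embeddings and call an actual model.

```python
import re
from collections import Counter

# Hypothetical in-memory "data source"; a real setup would index files or a database.
DOCUMENTS = [
    "The office is closed on public holidays.",
    "Expense reports are due on the fifth day of each month.",
    "The VPN requires two-factor authentication.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, documents, top_k=1):
    """Step 1: rank documents by how many question words they share."""
    question_words = Counter(tokenize(question))
    scored = []
    for doc in documents:
        overlap = sum((question_words & Counter(tokenize(doc))).values())
        scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop documents that share no words with the question.
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question, context_docs):
    """Step 2: add the retrieved text to the question as context."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


def answer(question, documents, llm):
    """Step 3: ask the LLM, and return the sources alongside the reply."""
    sources = retrieve(question, documents)
    prompt = build_prompt(question, sources)
    return llm(prompt), sources


def fake_llm(prompt):
    # Stand-in for a real model call (an assumption); it only reports the prompt size.
    return f"(model reply to a {len(prompt)}-character prompt)"


reply, sources = answer("When are expense reports due?", DOCUMENTS, fake_llm)
print(reply)
print("Retrieved:", sources)
```

Because `answer` returns the retrieved documents together with the reply, the caller can inspect what the model was shown.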

RAG overcomes all three weaknesses of the fine-tuning approach:

1. There’s no training involved, so it’s cheap.

2. Data is fetched only when you ask for it, so it’s as up to date as your data sources.

3. The framework can show you the retrieved documents, so it’s more trustworthy.

(https://lmy.medium.com/why-rag-is-big-aa60282693dc)

1 comment

choilive 802 days ago

This is the state of LLMs today. It is likely that future models will support some form of "online" training, or that new training methods will be far less compute-intensive. Many people are working on these scaling issues. We already have new attention mechanisms that work around the quadratic time and space cost of attention over the input prompt.