| HN Mirror



	by notglossy 86 days ago
	You will still get hallucinations. With RAG you use the vectors to find relevant material, and you typically store the raw text alongside them. In theory, this lets you ground the LLM's output in what the documents actually say. Depending on the implementation, you can also make the LLM cite its sources (filename, chunk, etc.).
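
A minimal sketch of that retrieve-then-cite flow, not a production setup. The `embed` function here is a toy word-count stand-in for a real embedding model, and the `Chunk` class, `CHUNKS` data, and prompt wording are all hypothetical names made up for illustration:

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    filename: str
    chunk_id: int
    text: str  # raw text kept alongside the vector so it can be quoted and cited


def embed(text: str) -> Counter:
    # Assumption: toy bag-of-words vector standing in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 2) -> list[Chunk]:
    # Rank chunks by vector similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, hits: list[Chunk]) -> str:
    # Each chunk is tagged with filename and chunk id so the model can cite it.
    context = "\n".join(f"[{c.filename}#{c.chunk_id}] {c.text}" for c in hits)
    return (
        "Answer using only the sources below and cite them by tag.\n"
        f"{context}\n\nQuestion: {query}"
    )


# Hypothetical document chunks for illustration only.
CHUNKS = [
    Chunk("handbook.txt", 0, "employees accrue vacation days every month"),
    Chunk("handbook.txt", 1, "expense reports are due within thirty days"),
    Chunk("faq.txt", 0, "the office is closed on public holidays"),
]

if __name__ == "__main__":
    question = "when are expense reports due"
    print(build_prompt(question, retrieve(question, CHUNKS, k=1)))
```

The prompt only asks the model to stick to the tagged sources; nothing here forces it to, which is why hallucinations can still get through.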


thisischayan 83 days ago

The approach that has worked for us in production is correction during generation, not after.

The model verifies its output against the rules in the prompt as it generates and corrects itself within the same API call — no retries, no external validator. If there are still failures the model cannot fix at runtime, those are explicitly flagged instead of silently producing wrong output.

This does not mean hallucinations are completely solved. It turns them into a measurable engineering problem. You know your error rate, you know which outputs failed, and you can drive that rate down over time with better rules. The system can also self-learn and self-improve over time to deliver better accuracy.
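
To illustrate the "measurable" part, here is a rough sketch of how flagged outputs could be counted. It assumes, purely hypothetically, that the model is prompted to return JSON with a `status` field (`"ok"` or `"flagged"`) and a `violations` list naming the rules it could not satisfy; that format and the `summarize` helper are inventions for this example, not a description of any particular system:

```python
import json
from collections import Counter


def summarize(raw_responses: list[str]) -> dict:
    """Compute the error rate and per-rule failure counts for a batch of outputs."""
    flagged = []
    by_rule = Counter()
    for i, raw in enumerate(raw_responses):
        try:
            resp = json.loads(raw)
        except json.JSONDecodeError:
            resp = None
        if not isinstance(resp, dict):
            # Output that does not follow the assumed format counts as a failure too.
            resp = {"status": "flagged", "violations": ["unparseable_output"]}
        if resp.get("status") != "ok":
            flagged.append(i)
            by_rule.update(resp.get("violations", []))
    total = len(raw_responses)
    return {
        "error_rate": len(flagged) / total if total else 0.0,
        "flagged": flagged,        # which outputs failed
        "by_rule": dict(by_rule),  # which rules to tighten next
    }


if __name__ == "__main__":
    batch = [
        '{"status": "ok", "answer": "42"}',
        '{"status": "flagged", "violations": ["missing_citation"]}',
        "not json at all",
        '{"status": "ok", "answer": "Paris"}',
    ]
    print(summarize(batch))
```

Tracking `by_rule` over time is one way to see whether a rule change actually moved the error rate.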


tren_hard 86 days ago

I’m still learning the advantages of each and the differences between them. Would there be benefits to combining SFT and RAG? Or does RAG make SFT redundant?


notglossy 86 days ago

I think generally, SFT is like giving the LLM increased intuition in specific areas. If you combine this with RAG, it should improve the performance or accuracy. Sort of like being a lawyer and knowing something is against the law by intuition, but needing the library to cite a specific case or statute as to why.


tren_hard 85 days ago

Thank you, I appreciate the reply. That analogy helps make sense of this.
