by notglossy
86 days ago
You will still get hallucinations. With RAG, you use the vectors to find the passages relevant to a query, and you typically store the raw text alongside them. In theory, this lets you ground the LLM's output in what the documents actually say. Depending on the implementation, you can also make the LLM cite its sources (filename, chunk, etc.).
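To make the retrieve-then-cite idea concrete, here is a rough sketch, not any particular library's API. The bag-of-words `embed` is a toy stand-in for a real embedding model, and `Chunk`, `retrieve`, `build_prompt`, and the sample documents are all made up for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Chunk:
    filename: str  # source file the text came from
    index: int     # position of the chunk within that file
    text: str      # raw text stored alongside the vector


def embed(text):
    # Toy bag-of-words vector (assumption: a real system would call an embedding model).
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    # Rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(query, hits):
    # Tag each chunk with filename#index so the model can cite it.
    sources = "\n".join(f"[{c.filename}#{c.index}] {c.text}" for c in hits)
    return (
        "Answer using only the sources below and cite them by tag.\n"
        f"{sources}\n\nQuestion: {query}"
    )


# Hypothetical sample data.
CHUNKS = [
    Chunk("billing.md", 0, "refunds are issued within 14 days of purchase"),
    Chunk("billing.md", 1, "invoices are sent by email every month"),
    Chunk("setup.md", 0, "install the client and run the login command"),
]

hits = retrieve("when are refunds issued", CHUNKS, k=1)
prompt = build_prompt("when are refunds issued", hits)
```

The prompt string is what you would send to the LLM. Nothing here forces the model to stick to the sources, which is why hallucinations can still happen.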
The model verifies its output against the rules in the prompt as it generates, and corrects itself within the same API call: no retries, no external validator. Any failures the model cannot fix at runtime are explicitly flagged rather than silently passed through as wrong output.
This does not mean hallucinations are completely solved. It turns them into a measurable engineering problem: you know your error rate, you know which outputs failed, and you can drive that rate down over time with better rules. The system can also learn and improve on its own over time to deliver better accuracy.
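As a rough illustration of the "measurable" part, here is a sketch of tallying flagged outputs into an error rate. The response shape (a `status` field and a `failed_rules` list) is purely an assumption for the example, not the actual format of the system described above.

```python
from collections import Counter


def error_report(results):
    # Each result is a dict with a hypothetical "status" ("ok" or "flagged")
    # and, when flagged, a "failed_rules" list naming the rules it broke.
    total = len(results)
    flagged = [r for r in results if r["status"] == "flagged"]
    by_rule = Counter(rule for r in flagged for rule in r.get("failed_rules", []))
    return {
        "total": total,
        "flagged": len(flagged),
        "error_rate": len(flagged) / total if total else 0.0,
        "by_rule": dict(by_rule),
    }


# Made-up sample outputs.
RESULTS = [
    {"status": "ok"},
    {"status": "ok"},
    {"status": "flagged", "failed_rules": ["cite_source"]},
    {"status": "ok"},
]

report = error_report(RESULTS)
```

The per-rule counts show which rules fail most often, which is where you would start when tightening them.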