by bigzyg33k
247 days ago
RAG certainly doesn't reduce hallucinations to zero, but using RAG correctly in this instance would have solved the hallucinations they describe. The system described in this post exists to work around OCR inaccuracies: it's convenient to use LLMs for OCR of PDFs because PDFs have no standard layout, so just using the text strings extracted from the PDF's code results in incorrect paragraph/sentence sequencing. The way they *should* have used RAG is to check that the sub-sentence strings extracted via the LLM appear in the PDF at all, but it appears they were just trusting the output without any automated validation of the OCR.
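
A minimal sketch of that kind of check, assuming the raw text strings have already been pulled out of the PDF by some other tool. The function names and the 4-word window are purely illustrative, not anything from the post:

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse every run of whitespace to a single space."""
    return re.sub(r"\s+", " ", text).strip().lower()


def fragments(text: str, size: int = 4) -> list[str]:
    """Split text into overlapping windows of `size` words (sub-sentence strings)."""
    words = normalize(text).split(" ")
    if len(words) <= size:
        return [" ".join(words)]
    return [" ".join(words[i:i + size]) for i in range(len(words) - size + 1)]


def unsupported_fragments(llm_output: str, pdf_text: str, size: int = 4) -> list[str]:
    """Return the fragments of the LLM output that never occur in the PDF text."""
    haystack = normalize(pdf_text)
    return [frag for frag in fragments(llm_output, size) if frag not in haystack]


# Hypothetical inputs: raw PDF strings vs. an LLM transcription with one changed word.
pdf_text = "Total revenue was\n  4.2 million in 2021.\nCosts fell by 3 percent."
llm_output = "Total revenue was 4.2 million in 2021. Costs rose by 3 percent."

for frag in unsupported_fragments(llm_output, pdf_text):
    print("not found in PDF:", frag)
```

The windows are kept short on purpose: the raw PDF strings can be out of order at the paragraph level, so a long window would get flagged even when the LLM transcribed it correctly. Anything flagged is a candidate for review rather than proof of a hallucination.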
I could be totally wrong here.