by hm-nah 745 days ago
You have to find a good OCR tool that you can run locally on your hardware. RAG depends on your doc processing pipeline.

It’s not local, but the Azure Document Intelligence OCR service has a number of prebuilt models. The “prebuilt-read” model is $1.50/1k pages. Once you OCR your docs, you’ll have a JSON of all the text AND you get breakdowns by page/word/paragraph/tables/figures, all with bounding boxes.
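
A rough sketch of what consuming that JSON might look like. It assumes the result has a top-level `analyzeResult` with a `paragraphs` list, where each paragraph carries `content` and `boundingRegions` (each with `pageNumber` and `polygon`). That shape is an assumption, and the sample below is made up, so check it against your actual response. `paragraphs_by_page` is a hypothetical helper name.

```python
import json
from collections import defaultdict

# Made-up sample mimicking the ASSUMED result shape; not real service output.
SAMPLE = json.loads("""
{
  "analyzeResult": {
    "paragraphs": [
      {"content": "Invoice 42",
       "boundingRegions": [{"pageNumber": 1, "polygon": [1.0, 1.0, 3.0, 1.0, 3.0, 1.5, 1.0, 1.5]}]},
      {"content": "Total due: 1,250 USD",
       "boundingRegions": [{"pageNumber": 1, "polygon": [1.0, 2.0, 4.0, 2.0, 4.0, 2.5, 1.0, 2.5]}]},
      {"content": "Shipping terms are FOB origin.",
       "boundingRegions": [{"pageNumber": 2, "polygon": [1.0, 1.0, 5.0, 1.0, 5.0, 1.5, 1.0, 1.5]}]}
    ]
  }
}
""")


def paragraphs_by_page(result):
    """Group paragraph text by page number, keeping the first bounding polygon."""
    pages = defaultdict(list)
    for para in result.get("analyzeResult", {}).get("paragraphs", []):
        regions = para.get("boundingRegions", [])
        if not regions:
            continue  # skip paragraphs with no location info
        first = regions[0]
        pages[first["pageNumber"]].append(
            {"text": para["content"], "polygon": first.get("polygon", [])}
        )
    return dict(pages)


chunks = paragraphs_by_page(SAMPLE)
for page, paras in sorted(chunks.items()):
    print(page, [p["text"] for p in paras])
```

Keeping the page number and polygon next to each chunk means you can later point back to where an answer came from in the original doc.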

Forget the Lang/Llama/Chain-theory. You can do it all in vanilla Python.
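
To make the "vanilla Python" point concrete, here is a minimal sketch of the retrieval half using only the standard library. It swaps in simple term-frequency/IDF scoring where you would normally use embeddings, the chunk texts are invented, all function names are hypothetical, and the actual LLM call is left out. Treat it as an illustration, not a recommended setup.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(chunks):
    """Return per-chunk term counts and a smoothed IDF for each term."""
    counts = [Counter(tokenize(c)) for c in chunks]
    doc_freq = Counter(term for c in counts for term in c)
    n = len(chunks)
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in doc_freq.items()}
    return counts, idf


def top_k(query, chunks, counts, idf, k=2):
    """Score chunks by summed tf * idf over query terms; drop zero scores."""
    q_terms = tokenize(query)
    scored = []
    for i, c in enumerate(counts):
        score = sum(c[t] * idf.get(t, 0.0) for t in q_terms)
        scored.append((score, i))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [chunks[i] for score, i in scored[:k] if score > 0]


def build_prompt(query, context):
    """Stuff the retrieved chunks into a plain-text prompt for whatever LLM you use."""
    joined = "\n---\n".join(context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


# Invented example chunks, e.g. what you might get after OCR + splitting.
CHUNKS = [
    "Page 1: The invoice total is 1,250 USD, due on March 3.",
    "Page 2: Shipping terms are FOB origin.",
    "Page 3: Contact the supplier for warranty claims.",
]

counts, idf = build_index(CHUNKS)
question = "What is the invoice total?"
hits = top_k(question, CHUNKS, counts, idf)
prompt = build_prompt(question, hits)
print(prompt)
```

Swap the scoring function for embeddings plus cosine similarity when keyword overlap stops being good enough; the rest of the pipeline stays the same.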