by thesandlord 768 days ago
We use GPT-4o for data extraction from documents, and it's really good. I published a small library that does a lot of the document conversion and output parsing: https://npmjs.com/package/llm-document-ocr

For straight OCR, it works really well, but at the end of the day it's still not 100% accurate.

1 comment

Thanks! Looking forward to checking this out as soon as I get home.