by WalterGR 3398 days ago
Fairly frequently, OCR engines are posted here. But almost without exception, they lack layout analysis, which renders them largely useless.

Is this something that could be combined with those OCR engines? (e.g. TesseractOCR...)

3 comments

I would not call these services useless ;) - but I wonder the same... Some APIs, like https://ocr.space, return the coordinates of each converted word. Could that be used as input? (I have not tried it yet.)
Ephesoft seems to use this for classifying documents and extracting data from them.
Some services let you set the layout manually, e.g. Docparser.
The PDF.co offline tool (for Windows) supports OCR and partial OCR for PDF-to-text and PDF-to-CSV conversion with the layout preserved. (Disclaimer: I work on it.)
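
On the question above of whether per-word coordinates could serve as input: here is a minimal sketch of one way it might work, assuming the OCR engine gives you a bounding box for each word. The `Word` class, the `group_into_lines` function, and the `tolerance` parameter are all hypothetical names made up for illustration, not the output format of ocr.space or any other service. It only rebuilds text lines from boxes, which is a first step toward layout analysis, not a full solution.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One OCR'd word with its bounding box (hypothetical format, in pixels)."""
    text: str
    left: float
    top: float
    width: float
    height: float


def group_into_lines(words, tolerance=0.5):
    """Group words into text lines by comparing vertical centers.

    A word joins an existing line if its vertical center lies within
    `tolerance` * (height of that line's first word) of that first
    word's center. Otherwise it starts a new line.
    """
    def center(w):
        return w.top + w.height / 2

    lines = []
    # Process words top to bottom so lines come out in reading order.
    for word in sorted(words, key=center):
        for line in lines:
            ref = line[0]
            if abs(center(word) - center(ref)) <= tolerance * ref.height:
                line.append(word)
                break
        else:
            lines.append([word])

    # Within each line, order words left to right and join them.
    return [
        " ".join(w.text for w in sorted(line, key=lambda w: w.left))
        for line in lines
    ]


words = [
    Word("world", left=60, top=11, width=40, height=10),
    Word("Hello", left=10, top=10, width=40, height=10),
    Word("Second", left=10, top=30, width=50, height=10),
    Word("line", left=70, top=29, width=30, height=10),
]
print(group_into_lines(words))
```

Multi-column pages, tables, and skewed scans would need more than this (for example, splitting on large horizontal gaps before grouping), so treat it as a starting point only.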