by driscoll42 1088 days ago
It depends on your setup and use cases. There are three major considerations:

* What language are you trying to OCR? And only the language, or also things like math symbols?
* Do you have a GPU or not?
* Are you trying to OCR handwriting or typed words?

I explored OCRing English documents from the 1960s that were primarily typed, though with some handwriting. I tried out PaddleOCR, TrOCR, Tesseract, EasyOCR, and kerasOCR for FOSS, and then Google, Amazon, and Microsoft for paid.

To be clear, the paid solutions beat the FOSS ones hands down, no question. Among the FOSS options, I found that TrOCR was the best for both typed and handwritten text. For typed text it was closely followed by Tesseract, but for handwriting TrOCR was by far the best, with all the others being basically worthless. However, TrOCR took ~200x longer, even on GPU, than Tesseract on CPU (Tesseract is fast, even more so if you parallelize it). Tesseract isn't the best, but it's the best all-around, and it's the one the Internet Archive uses.
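For the parallelizing part, here is a rough sketch of one way to do it, not necessarily how I ran my tests. It assumes the `tesseract` CLI is installed and on your PATH, and it launches one Tesseract process per image from a thread pool. The function names (`tesseract_command`, `ocr_one`, `ocr_many`) are made up for illustration, and setting `OMP_THREAD_LIMIT=1` is my assumption about how to stop the parallel processes from fighting over cores.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def tesseract_command(image_path, lang="eng"):
    # `tesseract <image> stdout -l <lang>` writes the recognized text to stdout.
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def ocr_one(image_path, lang="eng", run=subprocess.run):
    # Assumption: cap each Tesseract process at one OpenMP thread so that
    # several processes running side by side do not compete for the same cores.
    env = dict(os.environ, OMP_THREAD_LIMIT="1")
    result = run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout


def ocr_many(image_paths, lang="eng", workers=None, run=subprocess.run):
    # Threads are enough here: the heavy work happens in the child processes,
    # the Python side only waits for them. Results keep the input order.
    with ThreadPoolExecutor(max_workers=workers or os.cpu_count()) as pool:
        return list(pool.map(lambda p: ocr_one(p, lang, run), image_paths))
```

You would call it with something like `ocr_many(sorted(Path("scans").glob("*.png")))`, where `scans` is a hypothetical folder of page images. The `run` parameter only exists so the subprocess call can be swapped out when testing.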

I need to write up a blog post on this. And docTR looks interesting, I'm going to check that out.