| HN Mirror



	by ranger_danger 717 days ago
	There is a whole page on their site dedicated to methods for improving accuracy: https://tesseract-ocr.github.io/tessdoc/ImproveQuality.html. I think most frontends to tesseract employ many of these methods and maybe more, but using tesseract directly can indeed be difficult without preprocessing the image first.

1 comment

oefrha 717 days ago

I know. I tried many things with the photo collection I was working with, including advice from that very page, generally with relatively poor results. (I ended up using Apple’s framework on macOS.) The point is that tesseract is definitely not “smarter” in any way; at best it’s on par with Apple’s OCR when you hand it very clean text.
