| HN Mirror


	by jonatron 1508 days ago
	For “text in the wild” or scene text, the last time I checked, EasyOCR and PaddleOCR were both good.

2 comments

danShumway 1508 days ago

I expected these to still be pretty low quality, but surprisingly, some quick tests show that EasyOCR does relatively decently at pulling text out of smartphone pics of documents.

Thanks for sharing these -- maybe it's just my very bad searching skills, but I had been trying to set some stuff up with Tesseract and had come to the conclusion that I just couldn't use it for document photos. I figured I would either need to abandon that effort and buy a faster scanner, or hook into some proprietary service like Google/Apple.

Both of these look really promising, so now I'm excited again about the potential of setting up a fast Open Source way to digitize my documents.
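
For anyone who wants to repeat that kind of quick test, here is a minimal sketch, assuming EasyOCR is installed via pip and that `readtext` returns its usual (bounding box, text, confidence) tuples. The helper names `keep_confident` and `ocr_photo` and the 0.5 threshold are made up for illustration and are not part of EasyOCR or anything posted in this thread.

```python
from typing import List, Sequence, Tuple

# One EasyOCR detection: (bounding box, recognized text, confidence in [0, 1]).
Detection = Tuple[Sequence, str, float]


def keep_confident(detections: List[Detection], min_conf: float = 0.5) -> List[str]:
    """Return the recognized strings whose confidence is at least min_conf.

    Hypothetical helper; the 0.5 default is an arbitrary starting point.
    """
    return [text for _box, text, conf in detections if conf >= min_conf]


def ocr_photo(path: str, languages: Sequence[str] = ("en",)) -> List[str]:
    """Run EasyOCR on one image file and return the confident text fragments."""
    # Imported lazily: easyocr is a heavy dependency and fetches its
    # models the first time a Reader is created.
    import easyocr

    reader = easyocr.Reader(list(languages))
    return keep_confident(reader.readtext(path))
```

Calling `ocr_photo("photo.jpg")` on a phone picture (the path is a placeholder) should give a list of text fragments. Lowering `min_conf` keeps more of the shaky detections, which can help with blurry shots at the cost of more garbage.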

elpakal 1508 days ago

Just IMHO, Apple's Vision framework has been great too, and very easy to get started with.

dangledangle 1508 days ago

Vision's rectangle detection or document scanner has worked well for us, but it pales in comparison to what Google's MLKit OCR offers. MLKit OCR also does language detection and supports more languages out of the box.

EasyOCR is definitely interesting and something that's worked well for us at a prototyping level.
