

	by lhuser123 1282 days ago
	In my experience, OCRing scanned PDFs results in many small errors. For example, “Alt” could be interpreted as “A|t”. Did you have those problems? How did you fix them? What about other languages?

1 comment

jcuenod 1282 days ago

I didn't build my own OCR models. In the beta I'm using Tesseract, but I'm going to use Google or Amazon when I start charging. There's no way to compete on OCR quality, but I don't see other products automatically fixing doc scans, which is the value add I see my software really giving...
