| HN Mirror


	by ping_pong 1798 days ago
	I used Tesseract almost 10 years ago to scan letters from a Words With Friends board. I was getting over 90% accuracy, but the score values printed on the tiles corrupted the letter shapes and threw off the detection. So I created a new "language" (Tesseract supports custom ones) that treated the score-value corruption as part of each character. That got me to over 98% accuracy, which was about as good as I could get. Overall I thought it was great, and I wonder how well it would perform these days with 10 years of improvements!
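	For anyone wanting to try something similar today, here is a rough sketch of how a custom-trained language could be used on single tiles through the tesseract command line. It is not the setup described above: the language name "wwf", the file names, and both helper functions are hypothetical, and it assumes you have already trained a wwf.traineddata file and cropped each tile into its own image. The flags shown (-l, --psm 10 for single-character mode, --tessdata-dir) are from Tesseract 4 and later; as far as I recall, older 3.x releases spelled the page segmentation flag -psm.

```python
import subprocess


def build_tesseract_cmd(tile_image, lang="wwf", tessdata_dir=None):
    """Build the tesseract CLI call for one cropped tile image.

    "wwf" is a hypothetical custom language name; it assumes a
    wwf.traineddata file already exists in the tessdata directory.
    """
    # "stdout" as the output base makes tesseract print the recognized text.
    # --psm 10 tells tesseract to treat the image as a single character.
    cmd = ["tesseract", str(tile_image), "stdout", "-l", lang, "--psm", "10"]
    if tessdata_dir is not None:
        # Point tesseract at the directory holding the custom traineddata file.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


def read_tile(tile_image, lang="wwf", tessdata_dir=None):
    """Run tesseract on one tile and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(tile_image, lang, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,  # raise if tesseract exits with an error
    )
    return result.stdout.strip()
```

	Calling read_tile("tile_03.png", tessdata_dir="./tessdata") would then return whatever character the custom model recognizes for that tile. Keeping the command builder separate from the subprocess call makes it easy to inspect or log the exact command without having tesseract installed.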