	by vmarius 2174 days ago
	AFAIK Tesseract is trained to recognize characters and runs a series of preprocessing steps to prepare the image for recognition: removing noise, fixing contrast, and resizing. That means it performs poorly when, for example, an image contains both black text and white text on a green background. The built-in preparation steps don't "normalize" that case, so it can't detect the white text on the green background (but you can do that preprocessing yourself).
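
	To make the "do it yourself" part concrete, here is a rough sketch of one possible preprocessing step. It is not what Tesseract does internally. It assumes the text is near-gray (black or white) and the background is a saturated color such as green, and it maps every near-gray pixel to black and everything else to white, so both kinds of text end up as black on white. The function name `normalize_text_on_color`, the `max_spread` threshold of 60, and the sample colors are all made up for illustration; a real pipeline would load pixels with an imaging library and tune the threshold per image.

```python
def normalize_text_on_color(pixels, max_spread=60):
    """Binarize rows of (r, g, b) pixels.

    Near-gray pixels (black or white text) become 0 (black);
    colored pixels (the background) become 255 (white).
    max_spread is an assumed threshold, not a Tesseract setting.
    """
    out = []
    for row in pixels:
        out_row = []
        for r, g, b in row:
            # Gray-ish pixels have nearly equal channels; a saturated
            # background like green has one channel far from the others.
            spread = max(r, g, b) - min(r, g, b)
            out_row.append(0 if spread <= max_spread else 255)
        out.append(out_row)
    return out


# Tiny hypothetical image: black and white "text" pixels on green.
GREEN = (0, 160, 0)
BLACK = (10, 10, 10)
WHITE = (245, 245, 245)
sample = [
    [GREEN, BLACK, GREEN, WHITE],
    [GREEN, GREEN, WHITE, BLACK],
]
binary = normalize_text_on_color(sample)
```

	The result is a single-channel image in which both text colors are black on a white background, which is closer to the input Tesseract tends to handle well. This only works when the text is less saturated than the background, so colored text would need a different rule.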