Hacker News
by reallycurious 4287 days ago
Is this better than the Tesseract OCR?
3 comments

I think that's a sort of apples and pears type of comparison.

Tesseract can be used everywhere, and is used predominantly on open platforms. This is an offering from Microsoft to be used on their platform only.

They may both be good, but they have widely different platform targets.

My guess is he meant better at actually OCR'ing text, not better for implementation.
What are you talking about? It is always about the results. OCR is a tool, and it doesn't matter if it runs on Windows, Linux, OS X, phone, tablet, or watch. If this Microsoft OCR produces better results than Tesseract, then people will simply create a service running on Windows (yes, even on Windows Phone) and some kind of API to talk to it. The question remains the same: does it produce better results than Tesseract?

So far, this Microsoft OCR is just a bunch of words without any proof whatsoever that it actually works. Show me some pictures or videos of results.

That's the big question. Tesseract is pretty good, though quite slow I must say.
It depends on what is being scanned. Say you have a perfectly formatted image, directly taken from a scanner, it's a pretty darn quick process.

But in my experience, what adds to the slowness is pre-processing the image to make it suitable for OCR, especially for Tesseract. I still haven't found the magic combination of filters because every image is different, especially if you source them from users' camera phones.
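
To make "pre-processing" a bit more concrete: one filter that often shows up in these pipelines is binarization, i.e. turning a grayscale photo into pure black and white before handing it to the OCR engine. Below is a minimal sketch of Otsu's thresholding in plain Python. It is only an illustration, not the combination of filters I use, and the function names `otsu_threshold` and `binarize` are made up for this example. It assumes the image is already a list of rows of 8-bit gray values; a real pipeline would more likely use an image library for this.

```python
def otsu_threshold(pixels):
    """Return the Otsu threshold (0-255) for an iterable of 8-bit gray values."""
    # Build a 256-bin histogram of gray levels.
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg = 0    # number of pixels at or below the candidate threshold
    sum_bg = 0  # sum of gray values at or below the candidate threshold

    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # nothing in the dark class yet
        w_fg = total - w_bg
        if w_fg == 0:
            break     # everything is in the dark class, no further splits
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Between-class variance (up to a constant factor); Otsu maximizes this.
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(image):
    """Map each pixel to 0 (dark) or 255 (light) using a global Otsu threshold."""
    t = otsu_threshold(p for row in image for p in row)
    return [[0 if p <= t else 255 for p in row] for row in image]


if __name__ == "__main__":
    # Tiny fake "scan": dark ink values mixed with light paper values.
    page = [[20, 40, 210],
            [230, 35, 220]]
    print(binarize(page))
```

A single global threshold like this tends to work on clean scans but not on camera-phone photos with uneven lighting, which is part of why there is no one magic combination.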