by kaiabwpdjqn 2293 days ago
> It took hundreds of hours of human review time to find a single OCR mistake from this process!

This stands out to me as improbable. Not in that the error rate could be that low, but in that they actually had humans spend hundreds of hours checking the accuracy of difficult character recognition. How did that happen?

2 comments

Put a handful of grad students in a room for a week and you have hundreds of hours right there.
I searched out the article: "Reading Chess", 1990, HS Baird and Ken Thompson. (Yes, that Ken Thompson).

http://doc.cat-v.org/bell_labs/reading_chess/reading_chess.p...

It doesn't actually quantify the human proofreading time. I might have recalled incorrectly; I heard about this in the late 1990s as a war story from another OCR researcher.

It's an embarrassing problem to have a system with "accuracy too high to measure"!