Hacker News
by jfoutz 2669 days ago
You might say that if you can identify and simulate all cases of real-life degradation, your problem is basically solved: just reverse the simulation on your inputs.

I’m not saying OCR isn’t hard. I’m saying that normalizing all those characters basically is the problem.

1 comment

This isn't quite true if, e.g., there are degenerate cases.
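
To make the degenerate-case point concrete, here is a rough Python sketch. The `DEGRADE` table is a made-up toy model (not anything from a real OCR system) in which several distinct characters degrade to the same glyph. Even with a perfect simulation of the degradation, reversing it only gives you a set of candidates, not a single answer.

```python
from collections import defaultdict

# Hypothetical degradation model (an assumption for illustration only):
# a poor scan makes some visually similar characters indistinguishable.
DEGRADE = {"l": "|", "I": "|", "1": "|", "O": "0", "0": "0", "a": "a"}


def degrade(ch: str) -> str:
    """Simulate degradation of a single character."""
    return DEGRADE.get(ch, ch)


def invert(alphabet: str) -> dict:
    """Build the inverse mapping: degraded glyph -> all possible originals."""
    preimages = defaultdict(list)
    for ch in alphabet:
        preimages[degrade(ch)].append(ch)
    return dict(preimages)


preimages = invert("lI1O0a")

# Glyphs with more than one possible original are the degenerate cases:
# the simulation cannot be reversed uniquely for them.
ambiguous = {glyph: chars for glyph, chars in preimages.items() if len(chars) > 1}
print(ambiguous)  # {'|': ['l', 'I', '1'], '0': ['O', '0']}
```

In this toy setup only "a" can be recovered exactly; for "|" and "0" you would need extra context (a dictionary, a language model, surrounding characters) to pick among the candidates.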