by niftich 3596 days ago
Recently, Dropbox wrote about dewarping prior to OCR in their app: https://news.ycombinator.com/item?id=12297944

This code had the same idea, and is open-source!

2 comments

No, what this code does is much more sophisticated. Dropbox do not dewarp (i.e. remove non-linear distortions); they only transform the image to make the document rectilinear, leaving the distortions/deformations intact, which is much simpler.
Dewarping images for OCR isn't a new idea at all. Older systems did only a simple de-skew (rotating the image to undo tilt) for speed reasons, but the need to dewarp the image to improve results has been known for a very long time, and most production-quality OCR engines have done something like it for a long time.

(this isn't to slight the article - it's a great, well-written presentation on how to implement it)
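
To make the distinction above concrete, here is a rough sketch (not taken from the article or from Dropbox's code) of why a rectifying perspective transform cannot dewarp. A projective transform maps straight lines to straight lines, so a text baseline that sags on a curled page stays curved afterwards. The matrix `H` and the sine-shaped baseline are made-up values for illustration only.

```python
import math

def apply_homography(H, pts):
    """Map (x, y) points through a 3x3 projective transform H."""
    out = []
    for x, y in pts:
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        out.append((u / w, v / w))
    return out

def max_deviation_from_chord(pts):
    """Largest perpendicular distance from the line joining the first and last points."""
    (x0, y0), (x1, y1) = pts[0], pts[-1]
    length = math.hypot(x1 - x0, y1 - y0)
    return max(
        abs((x - x0) * (y1 - y0) - (y - y0) * (x1 - x0)) / length
        for x, y in pts
    )

xs = [i / 49 for i in range(50)]
# A flat baseline, and one that sags as it would on a curled page (hypothetical shape).
straight = [(x, 0.5) for x in xs]
curved = [(x, 0.5 + 0.05 * math.sin(math.pi * x)) for x in xs]

# An arbitrary perspective transform, standing in for "make the document rectilinear".
H = [[1.0, 0.2, 0.1],
     [0.1, 1.1, 0.0],
     [0.3, 0.2, 1.0]]

straight_dev = max_deviation_from_chord(apply_homography(H, straight))
curved_dev = max_deviation_from_chord(apply_homography(H, curved))

print(f"straight baseline deviation after transform: {straight_dev:.2e}")
print(f"curved baseline deviation after transform:   {curved_dev:.2e}")
```

The straight baseline stays straight up to floating-point error, while the curved one keeps a visible bend. Removing that bend needs a non-linear model of the page surface, which is the extra work dewarping does.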