Hacker News
by Rygian 1349 days ago
This is almost identical to a problem I'm trying to solve, which is to turn a potato-quality picture of a sheet of paper into a clean scan, so that whatever levels of gray make up the paper background become a uniform #ffffff white. The obvious solutions (equalizing, converting to bitmap, …) don't work because what's white in the top left (say #ccc) is wildly different from what's white in the bottom right (say #888), and the shift is non-uniform due to potato-quality lighting.

Glad I caught this post, I hope the solution can contribute to my problem (although I do not have a way to obtain a fixed ground truth — lighting will change for each picture.)

5 comments

Sounds like local contrast adjustment. There are several different ways of solving it; here are a couple that look like they work pretty well:

https://stackoverflow.com/questions/63251089/how-to-do-a-loc...

https://stackoverflow.com/questions/65666507/local-contrast-...

One possible preprocessing step could be to do a high pass filter on it, if the shadows vary slowly over the image.
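
A rough sketch of that high-pass idea, in plain Python with no imaging library: blur the page heavily to estimate the slowly varying lighting, subtract that estimate, and re-centre the result on white. The function names, the box blur, and the `radius` and `white` parameters are my own assumptions for illustration, not something taken from the linked answers. Nested lists are far too slow for real photos; this only shows the arithmetic.

```python
def box_blur(img, radius):
    """Mean filter over a (2*radius+1) square window, clipped at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - radius), min(h, y + radius + 1)
            x0, x1 = max(0, x - radius), min(w, x + radius + 1)
            total = sum(img[yy][xx] for yy in range(y0, y1) for xx in range(x0, x1))
            out[y][x] = total / ((y1 - y0) * (x1 - x0))
    return out


def high_pass(img, radius, white=255):
    """Subtract the blurred (low-pass) image and shift so paper lands near `white`."""
    low = box_blur(img, radius)
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            value = img[y][x] - low[y][x] + white
            # Clamp to the 8-bit range after rounding.
            out[y][x] = min(255, max(0, int(round(value))))
    return out
```

For example, `high_pass(page, radius=2)` on a small grayscale grid whose paper fades from 204 on the left to 144 on the right brings the paper to within a few levels of 255. On a real photo the radius would need to be much larger than the stroke width of the text, otherwise the blur starts to follow the ink and the letters fade.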

There are also more specialized techniques specifically for removing shadows from documents, like these:

http://civc.ucsb.edu/graphics/Papers/ACCV2016_DocShadow/

https://faculty.iiit.ac.in/~vgandhi/papers/shadow_removal_ca...

I also found this, an image editor based approach if you just want to do a few images manually:

https://janithl.github.io/2021/12/remove-shadows-and-uneven-...

Thanks! These pointers will be very helpful.
If your content is black and white, just use a local contrast filter and then threshold it. It's easy to do, but it does result in monochrome so you lose antialiasing. If you're at 300+ dpi though that doesn't usually matter. This is commonly done with PDF scans where monochrome output is desired for high compression. Easy to do with ImageMagick.
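
One hedged reading of "local contrast filter, then threshold", sketched in plain Python: take the brightest value in each pixel's neighbourhood as the local paper level, divide by it, and apply a single cutoff to the ratio. The function names and the `radius` and `cutoff` defaults are assumptions of mine, and this is not how ImageMagick implements it.

```python
def local_max(img, radius):
    """Brightest value in each pixel's square neighbourhood (a crude paper estimate)."""
    h, w = len(img), len(img[0])
    return [[max(img[yy][xx]
                 for yy in range(max(0, y - radius), min(h, y + radius + 1))
                 for xx in range(max(0, x - radius), min(w, x + radius + 1)))
             for x in range(w)]
            for y in range(h)]


def normalize_and_threshold(img, radius=2, cutoff=0.7):
    """Return a monochrome image: 255 for paper, 0 for ink."""
    paper = local_max(img, radius)
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Ratio to the local paper level; max(..., 1) avoids dividing by zero.
            ratio = img[y][x] / max(paper[y][x], 1)
            if ratio < cutoff:
                out[y][x] = 0
    return out
```

Because the cutoff applies to a ratio rather than to raw gray levels, paper at #ccc and paper at #888 both end up white, as long as each window contains at least some paper.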

If you want to preserve antialiasing and also color generally, I'm sadly not aware of any open source solution for that. Various scanner apps seem to do it with varying degrees of success; I'd be curious if there's a standard algorithm for it. It feels related to the de-curving algorithms that take a book page and make it flat. So you'd be modeling both the page curvature and black/white values simultaneously. Seems possible for general lighting/shadow, but wouldn't work for reflectivity from camera flash.

Just so you know, there are many scanner apps which solve this problem already, not sure how many of them are open source though.
A common alternative solution in this case would be to use an adaptive thresholding technique such as Otsu’s method.
Otsu's method finds a single threshold value in an adaptive way. This can't solve a scanned document thresholding problem.
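
To make that concrete, here is a minimal plain-Python sketch of Otsu's method (my own illustration, with a hypothetical function name). It builds one histogram for the whole image and returns the one threshold that maximizes the between-class variance.

```python
def otsu_threshold(pixels):
    """Return a single threshold t; values <= t form the dark class."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * hist[i] for i in range(256))

    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]          # pixel count in the dark class
        sum0 += t * hist[t]    # intensity sum of the dark class
        if w0 == 0:
            continue
        w1 = total - w0        # pixel count in the bright class
        if w1 == 0:
            break
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t
```

Take a synthetic page whose paper is 204 in the bright half and 136 in the dim half, with a little ink at 60 and 40: `otsu_threshold([204] * 100 + [136] * 100 + [60] * 10 + [40] * 10)`. The one threshold it returns falls at or above 136, so the dim paper gets lumped in with the ink.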

btw: The iOS Notes app has quite a capable document scanning tool. It's cleverly hidden though.

I'm pretty happy with the scanning feature of Evernote, and I think there are some other nice apps in the Android app store, but my goal is to have a solution that is not captive (either to a specific vendor or to a SaaS solution).
Thanks, I was not aware of Otsu's method.

From the Wikipedia article, "Otsu's method performs badly in case of heavy noise, small objects size, inhomogeneous lighting and larger intra-class than inter-class variance." (Emphasis mine.)

Right now my solution is at the stage of local thresholds with a configurable block size.

Thanks to your pointer, I know now that my next steps will be to review the Niblack or the Bernsen algorithms. (Or just integrate ImageJ.)
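
For reference, a small plain-Python sketch of a Bernsen-style local threshold as I understand it: each pixel is compared against the midpoint of the minimum and maximum in its window, and windows with too little contrast are treated as one class. Treating low-contrast windows as paper is my assumption (it suits dark ink on light paper), and the function name and the `radius` and `min_contrast` defaults are hypothetical. Niblack has the same window structure but uses the local mean plus k times the local standard deviation as the threshold.

```python
def bernsen(img, radius=2, min_contrast=30):
    """Return a monochrome image: 255 for paper, 0 for ink."""
    h, w = len(img), len(img[0])
    out = [[255] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[yy][xx]
                      for yy in range(max(0, y - radius), min(h, y + radius + 1))
                      for xx in range(max(0, x - radius), min(w, x + radius + 1))]
            lo, hi = min(window), max(window)
            if hi - lo < min_contrast:
                continue  # low local contrast: assume the window is all paper
            if img[y][x] < (lo + hi) / 2:
                out[y][x] = 0  # darker than the local midpoint: ink
    return out
```

Here `radius` plays the role of the configurable block size, and `min_contrast` needs to sit above the lighting drift across one window but below the ink-to-paper difference.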