by djhworld 992 days ago
My solution was pretty much the same as what this guy did. He has a slightly different scanner model from mine, but the setup is very similar:

https://chrisschuld.com/2020/01/network-scanner-with-scansna...

1 comment

I started with a hand-crafted script similar to the one you're using, with my ScanSnap S1500. Mine runs the PDF conversion in the background, so I can scan the next document immediately instead of waiting; this is easy to do now with scanpdf. I've been doing this for about 12 years: originally I sorted documents into directories by hand and used "pdfgrep" to find things, but more recently I've put everything into a paperless-ngx instance and am gradually tagging all the old documents.
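For anyone curious what "run the conversion in the background" can look like, here is a minimal Python sketch. It is not my actual script: `start_background` and `wait_all` are made-up helper names, and the command in the demo is only a placeholder standing in for whatever conversion step you really run.

```python
import subprocess
import sys
from typing import List, Sequence


def start_background(cmd: Sequence[str]) -> subprocess.Popen:
    """Launch cmd without waiting for it, so the next scan can start right away."""
    return subprocess.Popen(
        list(cmd),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )


def wait_all(jobs: List[subprocess.Popen]) -> List[int]:
    """Block until every background job has finished; return exit codes in order."""
    return [job.wait() for job in jobs]


if __name__ == "__main__":
    # Placeholder command (an assumption for illustration only): it just sleeps
    # briefly, standing in for a real PDF conversion step.
    placeholder = [sys.executable, "-c", "import time; time.sleep(0.1)"]

    # Kick off one "conversion" per scanned document without blocking.
    jobs = [start_background(placeholder) for _ in range(3)]

    # Only wait at the very end, e.g. before shutting down.
    print(wait_all(jobs))
```

The point is only that the scan loop never blocks on conversion; checking the exit codes at the end tells you whether any document needs redoing.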

I recently switched from my hand-crafted scripts to scanpdf[1], which seems to give better results (once I tweaked it to be a little less eager to downconvert to black and white). I also experimented with using OpenCV for cropping and straightening (based on examples in a Stack Overflow thread at [2]), but so far the results have been worse than scanpdf's.

1. http://badge.fury.io/py/scanpdf
2. https://stackoverflow.com/questions/28935983/preprocessing-i...