Hacker News new | ask | show | jobs
by joelhaasnoot 3104 days ago
It's a little clunky but here's the one I found best that just worked on Ubuntu: http://gscan2pdf.sourceforge.net/ . It can combine some of the best tools for OCR/cleanup/etc.

My main gripe is that I have a document feeder and manually selecting pages with shift to combine in to a single document and clicking "Save as" is far too much of a hassle. There needs to be a better flow for that.

1 comments

I wrote a collection of bash scripts for that. https://github.com/coaxial/insaned-config

It was initially to use with insaned, but I later came up with a script to tie it all together (scan.sh) because it's faster than jamming the scan button waiting for insaned to register. And with the script, I can queue commands provided I'm fast enough to swap the physical pages in the flatbed scanner.

It also uses the excellent textcleaner imagemagick script to clean up the scans and make them more ocr friendly.

The readme isn't totally up to date, parallel isn't required anymore, and there is no mention of the scan.sh script. But when you run it, it prompts for commands. You might need to edit the scripts to set your own output directories and textcleaner location.