by planfaster 3903 days ago
I hope this is not too off-topic but since we are talking about book-scanning, how would I go about scanning my own books for private use?

For instance, if I have a bunch of books about drawing, I'd like to scan them all so that I can later group all of the figure drawing pages in one folder, all of the gesture drawing pages in another, etc., so they are easier to use (and more useful) as reference.

Does anyone here recommend a way to scan books at home? I'm not against buying a contraption.
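For the grouping step described above (once the pages exist as image files), a minimal sketch might look like the following. It assumes you label each scan by hand in a dictionary; the function name `group_pages`, the file names, and the folder layout are all hypothetical, not part of any scanning tool.

```python
import shutil
from pathlib import Path


def group_pages(labels: dict[str, str], scans_dir: Path, out_dir: Path) -> dict[str, list[str]]:
    """Copy each labelled scan into out_dir/<category>/ and return the grouping.

    `labels` maps a scan file name to a category name; both are
    hand-written assumptions for this sketch.
    """
    grouped: dict[str, list[str]] = {}
    for filename, category in sorted(labels.items()):
        source = scans_dir / filename
        if not source.is_file():
            continue  # skip labels that point at a missing scan
        target_dir = out_dir / category
        target_dir.mkdir(parents=True, exist_ok=True)
        # copy (not move) so the original scans stay untouched
        shutil.copy2(source, target_dir / filename)
        grouped.setdefault(category, []).append(filename)
    return grouped
```

Calling `group_pages({"book1_p012.png": "figure-drawing", "book1_p040.png": "gesture-drawing"}, Path("scans"), Path("reference"))` would leave one folder per category under `reference/`. Copying rather than moving keeps the full book intact in case a page belongs in more than one group later.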

4 comments

There's an entire community dedicated to making inexpensive devices to scan books.[0] You will probably also have access to a commercial-grade book scanner at any large library.

[0] http://www.diybookscanner.org/

I have done one or two books using the glass plate + camera method: camera mounted on a tripod, book opened up halfway, and a glass plate to hold the pages flat while taking the pictures. I think my workflow was 5ish seconds per page.
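To put the roughly 5 seconds per page in perspective, here is a back-of-the-envelope sketch of capture time for a whole book. The 300-page book is a made-up example, and the estimate covers only taking the pictures, not any processing afterwards.

```python
def capture_minutes(pages: int, seconds_per_page: float = 5.0) -> float:
    """Estimate capture time in minutes at a fixed rate per page."""
    return pages * seconds_per_page / 60


# Hypothetical 300-page book at about 5 seconds per page
print(capture_minutes(300))  # 25.0
```

So a book of that size is under half an hour of photographing at this rate, assuming the pace holds for the whole session.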

If I were to do more and had the space, I would have gone the diybookscanner.org route to improve the quality and processing rate. At one point I belonged to a hackerspace in Oakland that had one available.

The post-processing workflow was much easier, and involved using scantailor (awesome free software to batch align, crop, white balance, etc. the pages) and then Acrobat for OCR.

I just got into book scanning ~6 weeks ago. I was partly inspired by the August HN discussion of Jason Scott's rescue mission of 25k manuals [0], and intrigued by Jason's kind warning to "the next person to mention the Linear Book Scanner (a prototype that destroys books)".

Emeritus community hero Daniel Reetz spent 6 years creating the "Archivist" scanner [1]. He and his collaborators have done a phenomenal job, and created some of the best documentation I've seen for any project (open-source or otherwise). The "Lessons Learned" front matter alone is inspiring [2].

So far I've found that book scanning is an ideal "DIY" project: enough hardware & software quirks that are gratifying to puzzle through, but nothing super difficult. In fact, it is exactly like building and calibrating a simple scientific instrument and learning to collect and process image data. To @planfaster or anyone who is considering book scanning for private use, definitely do it!

I highly recommend buying the "Archivist" scanner kit + electronics pack available at http://tenrec.builders/. There is ample hard-earned wisdom in the forums and tenrec supplemental docs about dozens of minor process details where you think "Why don't people just do X?" and it turns out X isn't ideal, and neither is Y, but Z works fine.

The main thing that I didn't consider before starting was that the scanner hardware only facilitates one very specific part of the workflow: taking pictures of flattened pages with (nearly) identical resolution and positioning. It's an important step, but reducing it to 5 seconds per page doesn't magically eliminate tedious downstream processing with other tools [3]. All that said, it's very rewarding, and really fun to start thinking about what you can do with scans, e.g. turn entire books into posters [4].

[0] https://news.ycombinator.com/item?id=10070529

[1] http://www.wired.com/2009/12/diy-book-scanner/

[2] http://www.diybookscanner.org/archivist/?page_id=25

[3] http://scantailor.org/

[4] https://twitter.com/smd4/status/655092522071420929