Hacker News
by wxm 4209 days ago
Some companies buy up the rights to old or out-of-print books that were the original source of this material, then cut them up and digitise them automatically. There was a post on HN quite some time back in which someone described his process for doing exactly this with a repair-your-car book, and how he used it to make money from ads on SEO traffic.

Do you remember about when this was? I've been searching for the post on HN and can't find anything.
I'm afraid not. I've been searching as well and couldn't find it. As I recall, his process was:

1. secure the rights to the book (he knew the author personally/through family)

2. cut the book open and run it through a high resolution scanner

3. use ImageMagick to preprocess the images

4. run OCR on the pages and convert them to Markdown

5. have a compiler convert his Markdown and images to HTML
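
For anyone curious, steps 3 to 5 could look roughly like the sketch below. This is not his actual tooling, just my guess at the shape of it: Tesseract as the OCR engine, the particular ImageMagick flags, and every function name here are my own assumptions, and the Markdown-to-HTML step is a toy stand-in (headings and paragraphs only) for whatever compiler he really used. On ImageMagick 7 the `convert` binary is usually invoked as `magick` instead.

```python
import html
import subprocess
from pathlib import Path


def preprocess_cmd(src: Path, dst: Path) -> list[str]:
    """Step 3 (assumed flags): grayscale, straighten, stretch contrast."""
    return ["convert", str(src), "-colorspace", "Gray",
            "-deskew", "40%", "-normalize", str(dst)]


def ocr_cmd(image: Path) -> list[str]:
    """Step 4 (assumed engine): 'stdout' as the output base prints the text."""
    return ["tesseract", str(image), "stdout"]


def page_to_markdown(title: str, ocr_text: str) -> str:
    """Turn raw OCR text into Markdown: one heading plus paragraphs."""
    paragraphs = []
    for block in ocr_text.split("\n\n"):
        # Rejoin lines that were broken at the edge of the printed column.
        joined = " ".join(ln.strip() for ln in block.splitlines() if ln.strip())
        if joined:
            paragraphs.append(joined)
    return "\n\n".join([f"# {title}"] + paragraphs) + "\n"


def markdown_to_html(md: str) -> str:
    """Step 5, deliberately tiny: only '# ' headings and plain paragraphs."""
    out = []
    for block in md.strip().split("\n\n"):
        if block.startswith("# "):
            out.append(f"<h1>{html.escape(block[2:])}</h1>")
        else:
            out.append(f"<p>{html.escape(block)}</p>")
    return "\n".join(out)


def process_page(scan: Path, workdir: Path, title: str) -> str:
    """Run steps 3-5 for one scan; needs convert and tesseract on PATH."""
    cleaned = workdir / (scan.stem + "-clean.png")
    subprocess.run(preprocess_cmd(scan, cleaned), check=True)
    text = subprocess.run(ocr_cmd(cleaned), check=True,
                          capture_output=True, text=True).stdout
    return markdown_to_html(page_to_markdown(title, text))
```

Calling `process_page(Path("page-001.png"), Path("out"), "Brakes")` would return the HTML for one page, where the file names and title are made up for illustration. Real OCR output would need a lot more cleanup (hyphenation, page headers, figure captions) than this does.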

EDIT: Found the link: https://news.ycombinator.com/item?id=4974055

Thanks for tracking that down! Very interesting.