by sciencemadness 1435 days ago
The OCR text beneath the page image is there to make the document easier to search. I used ABBYY FineReader for the OCR process. I didn't manually review or correct the automatically generated OCR text. I then ran additional scripted tools of my own on the PDF that I generated from FineReader to reduce its file size.