by filoeleven 442 days ago
I doubt they're using OCR. More likely they're using one of the many text extractors available for PDFs.

https://stackoverflow.com/questions/3650957/how-to-extract-t...
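
To illustrate why OCR usually isn't needed: in a PDF that was generated from text (not scanned), the page content stream holds the strings themselves as operands to text-showing operators such as `Tj`. Here's a rough toy sketch, assuming an uncompressed content stream with simple literal strings. The sample stream and the `extract_text` helper are made up for illustration; real extractors also have to deal with compressed streams, font encodings, `TJ` arrays, and text positioning, so treat this as a demonstration of the idea and not a usable extractor.

```python
import re

# Hypothetical, hand-written content stream, as it might look uncompressed
# inside a PDF page. Real PDFs usually compress this data.
SAMPLE_STREAM = rb"""
BT
/F1 12 Tf
72 720 Td
(Hello, world) Tj
0 -14 Td
(Text \(not pixels\)) Tj
ET
"""

# A literal string operand "(...)" followed by the Tj (show text) operator.
# Inside the string, a backslash escapes the next character.
TJ_PATTERN = re.compile(rb"\(((?:\\.|[^\\()])*)\)\s*Tj")


def extract_text(stream: bytes) -> list[str]:
    """Return the literal strings shown with Tj, in stream order."""
    lines = []
    for match in TJ_PATTERN.finditer(stream):
        raw = match.group(1)
        # Undo the backslash escapes for parentheses and the backslash itself.
        unescaped = re.sub(rb"\\([()\\])", rb"\1", raw)
        # Assumption: single-byte encoding; real fonts may map bytes differently.
        lines.append(unescaped.decode("latin-1"))
    return lines


if __name__ == "__main__":
    for line in extract_text(SAMPLE_STREAM):
        print(line)
```

The text comes straight out of the file's bytes with no image recognition involved, which is roughly what the dedicated extraction libraries do at a much more robust level.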