Hacker News
by jimrandomh 3334 days ago
When software OCRs a PDF, it adds an invisible text layer aligned with the original text, while leaving the original text visible. This makes the PDF searchable without changing the font, exposing OCR errors where people can see them, or disturbing the background. What we see here is unambiguously the result of a PDF-editing program, not scan+OCR.
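To make the invisible-layer point concrete, here is a rough sketch of how one might check for it. In the PDF format, text drawn with rendering mode 3 (the `Tr` operator) is neither filled nor stroked, which is how an OCR layer typically stays hidden. The function names are my own, and the sketch assumes you already have a page's decoded (uncompressed) content stream as bytes. It is a heuristic, not a real parser: a literal string that happens to contain `3 Tr` would fool it.

```python
import re

# Text rendering mode 3 means "neither fill nor stroke", i.e. invisible text.
INVISIBLE_MODE = 3

# Matches the Tr operator together with its integer operand, e.g. "3 Tr".
# Heuristic only: it does not skip over string literals or inline images.
TR_PATTERN = re.compile(rb"(?<![\w.])(\d+)\s+Tr\b")


def text_render_modes(content_stream: bytes) -> list[int]:
    """Return every text rendering mode set in a decoded content stream."""
    return [int(m.group(1)) for m in TR_PATTERN.finditer(content_stream)]


def looks_like_ocr_layer(content_stream: bytes) -> bool:
    """True if the stream sets a render mode and every one is invisible.

    A stream with no Tr operator uses the default mode 0 (visible fill),
    so it is treated as ordinary visible text.
    """
    modes = text_render_modes(content_stream)
    return bool(modes) and all(mode == INVISIBLE_MODE for mode in modes)


# Hypothetical content streams, written by hand for illustration.
scanned_page = b"q 612 0 0 792 0 0 cm /Im0 Do Q BT 3 Tr /F1 10 Tf 72 700 Td (Hello) Tj ET"
edited_page = b"BT /F1 10 Tf 72 700 Td (Hello) Tj ET"

print(looks_like_ocr_layer(scanned_page))
print(looks_like_ocr_layer(edited_page))
```

A page that draws a full-page image and then only mode-3 text is consistent with scan+OCR. Visible text in a replaced font, as in the document under discussion, would not show up this way.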