by tehologist 79 days ago
Use Python pdftools to convert the PDF pages to images, then Tesseract to OCR them into text files. It's fast, free, and runs on a CPU.
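A rough sketch of that pipeline, in case it helps. The comment doesn't say which PDF tool is meant, so this assumes Poppler's `pdftoppm` and the `tesseract` CLI are installed and on PATH, and just shells out to both from Python. The function names, the `page` filename prefix, and the 300 DPI default are my own choices for illustration.

```python
import subprocess
from pathlib import Path


def pdftoppm_cmd(pdf, prefix, dpi=300):
    """Build the pdftoppm command that renders each PDF page to a PNG.

    Assumption: Poppler's pdftoppm is the PDF-to-image tool in use.
    Output files are named <prefix>-<page number>.png.
    """
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf), str(prefix)]


def tesseract_cmd(image, lang="eng"):
    """Build the tesseract command that OCRs one image.

    Tesseract appends .txt to the output base itself, so the base is
    the image path with its extension removed.
    """
    image = Path(image)
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def ocr_pdf(pdf, out_dir, dpi=300, lang="eng"):
    """Render every page of `pdf` into `out_dir`, then OCR each page image.

    Returns the sorted list of .txt files that were produced.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: PDF -> one PNG per page (page-1.png, page-2.png, ...).
    subprocess.run(pdftoppm_cmd(pdf, out_dir / "page", dpi), check=True)

    # Step 2: each PNG -> a text file with the same base name.
    for image in sorted(out_dir.glob("page-*.png")):
        subprocess.run(tesseract_cmd(image, lang), check=True)

    return sorted(out_dir.glob("page-*.txt"))
```

Calling `ocr_pdf("scan.pdf", "out")` should leave one `.txt` per page in `out/`. Everything runs on the CPU. Lowering the DPI speeds things up but tends to hurt accuracy on small print.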