by 110 1055 days ago
Yeah, being able to search and chat with PDF files is quite useful.

Khoj can index a directory of PDFs for search and chat. But it does not currently work with scanned PDF files (i.e., ones without selectable text).

Being able to work with those would be awesome. We just need to get to it. Hopefully soon.

1 comment

Check out pdftotext; it's a CLI tool (maybe a library too) that makes PDF text selectable. Oh sorry, I meant to say ocrmypdf (pdftotext only extracts text that is already selectable). But hey, maybe it's worth checking both.
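
For what it's worth, here is a rough sketch of how the two tools could be chained: ocrmypdf adds a text layer to the scanned PDF, then pdftotext dumps that text to a file an indexer could read. It assumes both CLIs are installed and on PATH. The function names, the output file naming, and the choice of the --skip-text and -layout flags are my own picks for illustration, not anything Khoj does today.

```python
import shutil
import subprocess
import sys
from pathlib import Path


def ocr_command(scanned_pdf, searchable_pdf):
    # --skip-text leaves pages that already have a text layer untouched.
    return ["ocrmypdf", "--skip-text", str(scanned_pdf), str(searchable_pdf)]


def text_command(pdf, txt):
    # -layout asks pdftotext to keep the physical page layout where it can.
    return ["pdftotext", "-layout", str(pdf), str(txt)]


def pdf_to_text(scanned_pdf, workdir):
    """OCR a PDF, then dump its text; returns the path of the .txt file.

    Hypothetical helper: the ".ocr.pdf" / ".txt" naming is an assumption.
    """
    for tool in ("ocrmypdf", "pdftotext"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} not found on PATH")
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    stem = Path(scanned_pdf).stem
    searchable = workdir / f"{stem}.ocr.pdf"
    txt = workdir / f"{stem}.txt"
    # check=True raises CalledProcessError if either tool fails.
    subprocess.run(ocr_command(scanned_pdf, searchable), check=True)
    subprocess.run(text_command(searchable, txt), check=True)
    return txt


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python ocr_then_text.py scanned.pdf out_dir
    print(pdf_to_text(sys.argv[1], sys.argv[2]))
```

Either output could then go into the directory being indexed: the .ocr.pdf if the indexer reads PDFs, or the .txt if plain text is easier. I haven't checked how well OCR'd text holds up for search on poor scans, so treat this as a starting point.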