by nickpsecurity 825 days ago
Machines and humans can both easily use HTML/XML. Extracting information from PDFs is so much harder that there are deep learning products dedicated to doing it. They still make mistakes, too.

I’d much rather have something akin to CHM files, where everything I need is in one file that is easy to analyze and has good readers.