by munk-a 2098 days ago
PDFs can do all sorts of voodoo (much as you can with HTML if you hate the user's browser) to render content that is legible to people but nearly illegible to machines. Most documents, though, are produced by tools with fairly sane output that can be reverse-parsed into a pretty nice HTML blob.
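
To make the "reverse parsed" idea concrete, here is a rough sketch in Python of what that step might look like for a well-behaved document. Everything in it is an assumption for illustration: the `Run` record, the `runs_to_html` helper, and the `body_size` and `line_tol` thresholds are hypothetical and not taken from any real PDF library. It presumes some extraction tool has already produced positioned text runs with font sizes.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Run:
    """Hypothetical text run, as an extraction tool might report it."""
    text: str
    x: float     # left edge, in points
    y: float     # baseline, in points (PDF origin is bottom-left)
    size: float  # font size, in points


def runs_to_html(runs, body_size=11.0, line_tol=2.0):
    """Group runs into lines by baseline and emit simple HTML.

    Assumption: lines noticeably larger than body_size are headings.
    """
    # Read top-to-bottom (descending y), then left-to-right.
    ordered = sorted(runs, key=lambda r: (-r.y, r.x))

    lines = []  # each entry: [baseline_y, [runs on that line]]
    for run in ordered:
        if lines and abs(lines[-1][0] - run.y) <= line_tol:
            lines[-1][1].append(run)
        else:
            lines.append([run.y, [run]])

    out = []
    for _, line_runs in lines:
        line_runs.sort(key=lambda r: r.x)
        text = escape(" ".join(r.text for r in line_runs))
        is_heading = max(r.size for r in line_runs) > body_size * 1.3
        tag = "h2" if is_heading else "p"
        out.append(f"<{tag}>{text}</{tag}>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [
        Run("world", 120, 700, 11),
        Run("Hello", 72, 700.5, 11),
        Run("Title", 72, 740, 18),
    ]
    print(runs_to_html(sample))
```

This only works when the producing tool emits runs in a sane layout. The "voodoo" cases (glyphs placed one at a time, custom font encodings, text drawn as vector paths) would defeat this kind of grouping, and a real converter would also need to merge lines into paragraphs rather than emit one `<p>` per line.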