Hacker News
by Maakuth 4314 days ago
This converter is reasonable: http://www.willus.com/k2pdfopt/ I wonder if someone has done a Calibre plugin for it.
4 comments

> "K2pdfopt works by converting each page of the PDF/DJVU file to a bitmap and then scanning the bitmap for viewable areas (rectangular regions) and cutting and cropping these regions and assembling them into multiple smaller pages without excess margins so that the viewing region is maximized. Making use of this method, k2pdfopt can re-flow text lines, even on scanned documents"
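
To make the "scan the bitmap, then crop" step in that description concrete, here is a rough sketch of margin cropping on a grayscale bitmap. It is not k2pdfopt's actual code: the function names, the list-of-rows bitmap format and the `threshold` value are all assumptions made for illustration, and it finds only one bounding box rather than multiple regions.

```python
def content_bbox(bitmap, threshold=250):
    """Return (top, left, bottom, right) around all pixels darker than
    `threshold`, or None for a blank page. `bitmap` is a list of rows of
    0-255 grayscale values (hypothetical format, chosen for simplicity)."""
    # Rows that contain at least one "ink" pixel.
    rows = [y for y, row in enumerate(bitmap) if any(p < threshold for p in row)]
    if not rows:
        return None
    # Columns that contain at least one "ink" pixel.
    width = len(bitmap[0])
    cols = [x for x in range(width) if any(row[x] < threshold for row in bitmap)]
    # bottom and right are exclusive, so they can be used directly as slice ends.
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(bitmap, bbox):
    """Cut the bounding box out of the bitmap, dropping the margins."""
    top, left, bottom, right = bbox
    return [row[left:right] for row in bitmap[top:bottom]]
```

A real tool would go further and split the page into several regions (columns, text lines) before reassembling them, which is presumably where the re-flow comes from.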

Looks promising. Hopefully this would also strip JavaScript and executable code from the source PDF, although any exploit could still run within the context of the converter. To be safe, the conversion could be run from a live CD.

More information on analysis of PDF malware: http://blog.didierstevens.com/programs/pdf-tools/

PDF malware can be used for economic espionage targeting commercial research. What would help is a single open registry that holds, for each paper, bibliographic metadata plus a hash of a known-good PDF.
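
A lookup against such a registry could be as simple as the sketch below. No registry like this exists as far as I know, so the dict layout, the identifier key and the choice of SHA-256 are all assumptions for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the raw PDF bytes."""
    return hashlib.sha256(data).hexdigest()


def is_known_good(pdf_bytes: bytes, identifier: str, registry: dict) -> bool:
    """Check a PDF against a hypothetical registry that maps a paper
    identifier (e.g. a DOI) to its metadata and a known-good hash."""
    entry = registry.get(identifier)
    if entry is None:
        # Unknown paper: nothing to compare against, so treat as unverified.
        return False
    return sha256_hex(pdf_bytes) == entry["sha256"]
```

One caveat with the idea: the same paper is often distributed as several byte-different PDFs (publisher copy, preprint, watermarked download), so a registry would probably need more than one accepted hash per entry.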

Hey, that's pretty neat. I was just thinking it shouldn't be that hard to do something like that. I would love to be able to read academic papers on my Kindle Paperwhite, and this might help with that. Reading on a regular tablet is a bit annoying at times.

I've used k2pdfopt for reading two-column academic papers on a Kindle Paperwhite, and it works great.

If you use Mendeley for organising your papers, check out KinSync.com. It's pretty good for this.

Thanks for the information. This looks pretty good, so I'll give it a whirl. Glad to find it already in the Arch Linux AUR.

Edit: I gave it a test run, and found it does the job very well. Thank you again!

Thanks for sharing, I didn't know that one. Until now I've been using briss:

http://sourceforge.net/projects/briss/