Hacker News
by anotheraccount9 869 days ago
For a moment I wasn't sure if I wanted to click on the link.
2 comments

On an only slightly related note: is there any good way to check PDFs for malware/executables?

If I'm stuck with attempting it, the best I can think of is opening it in a fresh QEMU VM or Docker container with no Internet access, but that's 1) a fair bit of work to check something, and 2) probably not even that secure. Using some CLI tool, like xxx, bat, or ranger, that does some processing to extract the text and looking at just that feels more secure, but I know it really isn't.

What is a simple tool to "clean" PDFs? An ML tool that does QEMU/docker/no-net to extract the content, turns that into game, and saves a typst/latex template with it would probably be the best possible outcome - but that's a decent (yet potentially very lucrative) task.

For analysis, I’ve used Didier’s tools. If you just want a safe way to open it, upload it to a cloud storage provider that destructively renders the PDF. Box or Google Drive should work.

https://blog.didierstevens.com/programs/pdf-tools/
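To give a rough idea of what keyword-based triage looks like, here is a minimal Python sketch that counts PDF name tokens often associated with active content. It is not Didier's code and not a replacement for his tools: the token list and the function names (`count_suspicious_names`, `decode_name_escapes`) are my own assumptions for illustration, it only looks at the raw bytes, and it will miss anything hidden inside compressed object streams. A zero count does not mean the file is safe.

```python
import re
from collections import Counter

# Name tokens commonly associated with active content in PDFs.
# This list is an illustrative assumption, not an exhaustive one.
SUSPICIOUS_NAMES = [
    b"/JavaScript",
    b"/JS",
    b"/OpenAction",
    b"/AA",
    b"/Launch",
    b"/EmbeddedFile",
]


def decode_name_escapes(data: bytes) -> bytes:
    """Expand #xx hex escapes so '/J#61vaScript' is seen as '/JavaScript'."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def count_suspicious_names(data: bytes) -> Counter:
    """Count occurrences of each suspicious name token in raw PDF bytes."""
    data = decode_name_escapes(data)
    counts = Counter()
    for name in SUSPICIOUS_NAMES:
        # Reject matches followed by a letter or digit, so /JS does not
        # match a longer name such as /JSFoo.
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, data))
    return counts
```

To try it on a file, read the bytes with `open(path, "rb").read()` and pass them to `count_suspicious_names`. Non-zero counts are only a hint that the file deserves a closer look with a real analysis tool.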

What do you mean by "PDFs with malware/executables"?

If you're talking about embedded active content within them, then a reader application can just ignore/not run it.

If you're talking about a crafted PDF that exploits, let's say, font rendering bugs inside the reader, then it's near impossible. Keep your applications updated.

There is a Chrome add-on called SquareX https://chromewebstore.google.com/detail/kapjaoifikajdcdehfd... The founder is pretty well regarded in the cybersecurity field.
There are some pdf readers that protect you against those things.

On Android, for example, there is the GrapheneOS PDF Viewer [1]. Its README has a pretty good explanation of how it works.

1: https://github.com/GrapheneOS/PdfViewer

It also screams buffer overflow.
PDF readers are probably mostly pretty hardened against "naive" non-conforming content.
> probably mostly pretty hardened

Quite possibly perhaps that might be true-ish to some extent, I think, but take that with a grain of salt, I'm not an expert, that's just my wild guess :-p

It's pretty ridiculous to peel that off the following qualifier.

Readers have been aggressively attacked for a long time. It's certainly not impossible that some basic demonstration PDF will cause an issue, but it's probably not reasonable to expect it.