Hacker News
by neilv 1426 days ago
Years back, I raised how much Ghostscript had evolved over a very long time, together with the huge complexity of the PDF specs, as a potential source of vulnerabilities.

(But maybe it wasn't as much on people's radars, with all the lower-hanging fruit of other technology choices and practices going on outside of PDF.)

New code for a large spec is also interesting for potential vulns, but maybe easier to get confidence about.

One neat direction they could go is to be considered more trustworthy than the Adobe products. For example, if one is thinking of a PDF engine as (among other purposes) supporting the use case of a PDF viewer that's an agent of the interests of that individual human user, then I suspect you're going to end up with different attention and decisions affecting security (compared to implementations from businesses focused on other goals).

(I say agent of the individual user, but that can also be aligned with enterprise security, as an alternative to risk management approaches that, e.g., ultimately will decide they're relying on gorillas not to make it through the winter.)

3 comments

Is there any work in this space on some oddball "contamination protocol" type of security? Like you would assume everything is contaminated and you do things that eliminate the potential for cross contamination entirely, like they do in lab settings with aseptic technique. In this case, it could mean printing out the contaminated PDF on a system you don't care about being contaminated, then scanning it with an airgapped scanner to recover a 'sterile' PDF. It seems convoluted, but I'm sure for some applications that could be a good solution that requires no improvement to the PDF format itself.
I've heard of measures like that, including for the other direction (i.e., redacting documents without leaking information in the effectively opaque PDF format).

IMHO, having well-engineered tools handle data, and being conservative about the trust/privileges given externally-sourced data is at least complementary to the current "zero trust" thinking among networks and nodes.

(Example: Does your spreadsheet really need arbitrary code execution, in an imperfect sandbox, for all your nontechnical users? Should what people might think is a self-contained standalone text document file really phone home, to disclose your activity and location, or have the potential to be remotely memory-holed/disabled, along with attendant added security risks from that added complexity and the additional requirements it puts on host systems/tools to try to enforce that questionable design?)

There are two relevant computer security ideas here -- "sandboxing" is used to place risky work (such as Chrome decoding some media) into an isolated process which lacks privileges to e.g. abuse access to files or networking, and "taint tracking" is used to reason about what attacker-supplied input can influence.
DARPA is funding fundamental research in this space, specifically through programs like SafeDocs[1].

[1]: https://www.darpa.mil/program/safe-documents
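
To make the "taint tracking" idea above a bit more concrete, here is a toy sketch in Python. Everything in it (Tainted, is_tainted, sanitize, open_by_name) is a made-up name for illustration, not from any real library, and real taint tracking is usually done in the language runtime, compiler, or an analysis tool rather than with a string subclass. The point is only to show the shape: attacker-supplied input carries a mark, the mark propagates through operations, and sensitive sinks refuse marked data.

```python
class Tainted(str):
    """A string marked as attacker-influenced (hypothetical helper)."""

    def __add__(self, other):
        # tainted + anything stays tainted
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # anything + tainted stays tainted
        return Tainted(str.__add__(other, self))


def is_tainted(value):
    """Report whether a value still carries the taint mark."""
    return isinstance(value, Tainted)


def sanitize(value):
    """Hypothetical sanitizer: keep only alphanumerics, return a plain str."""
    return "".join(ch for ch in value if ch.isalnum())


def open_by_name(name):
    """Hypothetical sensitive sink that refuses tainted input."""
    if is_tainted(name):
        raise ValueError("refusing tainted input at a sensitive sink")
    return "opened " + name


# Assume this string came out of an untrusted document.
user_supplied = Tainted("../etc/passwd")
path = "/tmp/" + user_supplied  # taint propagates through concatenation

try:
    open_by_name(path)
except ValueError as err:
    print("blocked:", err)

print(open_by_name(sanitize(user_supplied)))
```

This toy only propagates the mark through `+`. Other string operations (formatting, slicing, `join`) return plain strings and silently drop it, which is one reason real systems track taint at a lower level.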

Qubes OS can do that. It basically starts a disposable vm just for printing the PDF.
But why is the doc running as our user anyways? I didn't create the document, so it doesn't make sense that it runs with the rights of my user. It can certainly ask for certain permissions.

Zero days will always exist, it seems. Even Chrome has these, with hundreds of security researchers' eyes on it.

More trustworthy than Adobe...

Not hard