Hacker News new | ask | show | jobs
by greesil 2256 days ago
Does finding weird and unexpected ways to do computation always imply a security risk?
4 comments

I do research in this space professionally.

The answer is not always, but sometimes: discovering unintended states or transitions in the execution contract of a program is a common building block for exploits. However, proving that the execution contract of a program can be coerced into representing computations in a TC language doesn't necessarily prove that you can do anything interesting.

Complex formats like PDF are a good example of this: you can probably contrive a PDF input such that the parser's state represents a minimal (and TC) language while interpreting it (e.g. a language with a few mathematical operators and a "store" primitive), but that doesn't magically get you network access or arbitrary memory read/writes. You need to show that said language, when programmed in, can affect the overall execution contract in a way that violates high-level assumptions.

Some resources (FD: my company's) on the subject:

* https://blog.trailofbits.com/2019/11/01/two-new-tools-that-t...

* https://blog.trailofbits.com/2018/10/26/the-good-the-bad-and...

Just a heads up: I think the second post has a typo; the code has named "new_item" but is referred to as "item" throughout. (I'm also not sure I understand the safety added by dynamic_cast.)
Thanks for the heads up! I'll fix it.
I’m not sure why that’s explicitly mentioned, I believe that for analyzing security, TC is a bit of an academic red herring. (Warning: not an expert.) There are two kinds of security implication to consider here I think.

One, can certain input data cause a large usage of computing resources (processing time or memory)? Non-TC mechanisms have repeatedly failed this test: ZIP bombs and XML entity bombs are notorious examples. On the other hand, you can easily put a resource cap on an interpreter of a theoretically TC language and be safe.

Two, can the untrusted code access resources that it shouldn’t (memory, files, sockets)? That’s mostly a quality of implementation issue, not one of Turing completeness. JavaScript interpreters have certainly been vulnerable to various exploits, but so have JPEG decoders. I don’t think TC is the issue here. (However, this is complicated a bit by side-channel attacks à la Spectre. I’m not sure how TC factors into those.)

In summary, I’m not convinced that Turing completeness is all that relevant for security. Am I wrong here?

Both can have exploits as you've shown, but you haven't addressed the difficulty of securing one or the other. Turing complete systems have a dramatically larger state space than non turing complete implementations and so at a fundamental level are more difficult to secure.
I simply don't think Turing completeness is the deciding factor in determining that difficulty. Hopefully someone can define this more clearly, but in my opinion, the issue is not abstract "state space", but implementation complexity and attack surface.

For example, it's a lot easier to write a secure Brainfuck interpreter (a very simple Turing complete language that can only access STDIN and STDOUT) than a program that securely extracts a ZIP file to disk.

At least theoretically, it means certain inputs can cause infinite loops, which, thanks to Turing completeness and the undecidability of the halting problem, means there’s no way to detect in advance in all cases.
It means you can't easily (or statically) reason about the output given arbitrary (user) input. So yes, it makes it much more likely that security bugs are introduced.
Also it at least implies a guaranteed resource leak / DOS capability, which is a problem though it may or may not be considered a security issue.
Well, that depends; you can slap a rlimit on a process and it will no longer DOS you, even if it's executing Turing complete code. It does mean that inside the Turing machine you can't really apply protections, but from the outside a Turing-complete language cannot magically escape its sandbox unless you give (or accidentally include) it a tool to do so.
Unless I'm completely off my mark, any resource limited system is not true universal Turing machine in the theoretical sense. So depending on your level of pedantry, TC can mean guaranteed DoS.
For the purposes of this conversation, we generally speak of the TC-ness of a system if we could somehow expand it indefinitely.

I believe there was someone who set up a TC-complete roller coaster network in some Rollercoaster Tycoon sort of game. It had something like 25-slots of "RAM". Not very TC-ish. But if you could hypethetically expand out, hypothetically it would work.

So it's a common convention.

But...that's your whole computer too then.
Indeed.