by eugeneionesco 3130 days ago
>The webassembly exploit part of the chain bums me out (I was always afraid of stuff like this when I was working on the design for it) but it's pretty uninteresting, really. The simple sort of bug you get when you insist on writing stuff in C++.

I really hope people don't think webassembly is at fault for this. This vulnerability is no different from any other memory corruption vulnerability you would find in the js interpreter or the css parser or whatever.

Well, WebAssembly's primary near-term contribution will be introducing the world of C++ exploits to web apps, which are already groaning under the load of XSS, XSRF, path traversal, SSRF etc. attacks. Adding double frees, use-after-frees and buffer overflows on top doesn't seem ideal.

As for the rest, well, it'd be nice if there was any sort of plan to make Blink safer. I know about oilpan but what Mozilla is doing with Rust is impressive. The JVM guys are working on rewriting the JVM in Java. What's Blink's plan to make its own code safer? Sandboxing alone?

Microsoft is also taking steps to incrementally rewrite the .NET runtime in C#.

https://www.infoq.com/articles/virtual-panel-dotnet-future

And the D guys have been rewriting dmd, the reference compiler, into D.

What exploits are you specifically worried about with wasm?

The ones you call out in this post don’t have the same impact as native, even when it’s C or C++ compiled to wasm.

http://foo.bar.com/url?q=<base64 encoded stuff>

wasm program parses q

stack smash occurs

ROP chain is used to gain code execution

user cookie is stolen

attacker now controls your account

I don't know enough about wasm to know if it has some special mitigations for this but when I looked at it, wasm seemed closer to a CPU emulation than a high level language VM. Flat memory space, no GC, no pointer maps.

WASM memory is a region of memory specific to a module (and they only allow one memory instance right now). It can be imported/exported to other modules, but there is no sandbox escape (in theory). For the web backend, it's just backed by an ArrayBuffer IIRC. It's all userland. If anything escapes the WASM interpreter/compiler, it is the fault of the interpreter/compiler (as is the case here) and not the fault of the WASM bytecode itself, which has no escape mechanism. Think of a WASM VM just like a JS VM. Even though it may appear low level just because it can JIT better/cleaner, it operates in the same arena as JIT'd JS (at least for the web target).

You don't need to escape a sandbox when the application has access to all the user's data.

The attack surface of a gmail implemented in C++-compiled-to-wasm is almost certainly going to be larger than a gmail implemented in JS, because the runtime environment is vulnerable to double frees and heap corruption and other attacks, even if it won't escape the browser sandbox. My gmail tab basically has access to my entire life.
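To make the two positions above concrete, here is a minimal sketch of a toy linear memory, not a real wasm engine. The class, the buffer layout and the `is_admin` flag are all hypothetical, and a `DataView` `RangeError` stands in for a wasm trap. It shows that a write past the end of the memory is stopped, while an overflow that stays inside the memory silently corrupts the neighbouring application data. Note that classic ROP does not carry over directly, since wasm keeps return addresses outside linear memory; data corruption of this kind is the part that does.

```typescript
// Toy model of a wasm-style linear memory: one flat, fixed-size byte array.
// All names here (LinearMemory, NAME_BUF, IS_ADMIN, ...) are hypothetical.
class LinearMemory {
  private readonly view: DataView;

  constructor(sizeInBytes: number) {
    this.view = new DataView(new ArrayBuffer(sizeInBytes));
  }

  // DataView throws a RangeError for out-of-range offsets, which stands in
  // for the trap a wasm engine raises on an out-of-bounds access.
  store8(addr: number, value: number): void {
    this.view.setUint8(addr, value);
  }

  load8(addr: number): number {
    return this.view.getUint8(addr);
  }
}

// Hypothetical layout chosen by the compiled program.
const NAME_BUF = 0; // char name[8]
const NAME_LEN = 8;
const IS_ADMIN = 8; // bool is_admin, placed right after the buffer
const MEMORY_SIZE = 16;

// Models an unchecked strcpy-style copy: it never looks at the buffer size.
function uncheckedCopy(mem: LinearMemory, dst: number, input: number[]): void {
  input.forEach((byte, i) => mem.store8(dst + i, byte));
}

// The same copy with the length check the unchecked version is missing.
function checkedCopy(mem: LinearMemory, dst: number, capacity: number, input: number[]): void {
  if (input.length > capacity) {
    throw new RangeError("input longer than destination buffer");
  }
  uncheckedCopy(mem, dst, input);
}

function throwsRangeError(action: () => void): boolean {
  try {
    action();
    return false;
  } catch (e) {
    return e instanceof RangeError;
  }
}

function runDemo(): { adminAfterOverflow: number; trappedAtBoundary: boolean; checkedCopyRejected: boolean } {
  const attackerInput = [65, 65, 65, 65, 65, 65, 65, 65, 1]; // 9 bytes for an 8-byte buffer

  // In-bounds overflow: the ninth byte lands on is_admin and nothing traps.
  const mem = new LinearMemory(MEMORY_SIZE);
  uncheckedCopy(mem, NAME_BUF, attackerInput);
  const adminAfterOverflow = mem.load8(IS_ADMIN);

  // A write past the end of linear memory is stopped by the runtime.
  const trappedAtBoundary = throwsRangeError(() => mem.store8(MEMORY_SIZE, 0xff));

  // With an explicit length check the same input is rejected up front.
  const fresh = new LinearMemory(MEMORY_SIZE);
  const checkedCopyRejected = throwsRangeError(() => checkedCopy(fresh, NAME_BUF, NAME_LEN, attackerInput));

  return { adminAfterOverflow, trappedAtBoundary, checkedCopyRejected };
}
```

Calling `runDemo()` reports the flag as set after the overflow even though nothing left the memory, which is the "you don't need to escape the sandbox" point in miniature.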

I don't understand. In the gmail example, the attack surface to who, a malicious email sender? As in something being handled by wasm in the browser has a better chance at XSS than if it was handled with JS? Why would untrusted content like that be handled by a client-side language anyways? Whether it is wasm, JS, wasm-interpreted-by-a-JS-interpreter, JS-interpreted-by-a-c++-interpreter, wasm-interpreted-by-a-c++-interpreter or whatever, the risks are similar. If you are talking about untrusted wasm or JS scripts accessing things inside the same sandbox, that's a different vector and it's less about the size of surface area and more about the introduction of the vector in the first place.

WASM resulted in adding a lot of new APIs to JS, like thread-shared buffers and the coming atomics. This requires quite a few new lines of native code in the implementation, significantly increasing attack surface. Another thing is that WASM makes code faster, so exploiting timing bugs or cache leaks gets easier.

My sibling comment is correct; the only way this can happen is an interpreter bug. Bugs happen, but they can happen in JS too. I think you’re assuming things the spec doesn’t allow.

"Controls your account" is possible without ever exploiting an interpreter bug or escaping the sandbox. Your account credentials are usually available inside the current tab.

I’m not sure what you mean, specifically, here. Or at least, how wasm is somehow worse than JavaScript in this regard, which is the baseline here.

In fact, it should be better, given the static declaration of external calls that can be inspected.
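As a rough illustration of what "static declaration of external calls" means in practice, the sketch below compiles a tiny hand-assembled module and lists its imports without instantiating or running it. `WebAssembly.Module.imports` is part of the standard JavaScript API; the module bytes, the `env.log` import and the `listImports` helper are made up for this example. The API is reached through `globalThis` only so the sketch type-checks whether or not DOM or Node typings are loaded.

```typescript
// Hand-assembled module that declares a single import: function env.log.
// It has no code of its own, so nothing can run.
const moduleBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, // magic "\0asm"
  0x01, 0x00, 0x00, 0x00, // binary format version 1
  0x01, 0x04, 0x01, 0x60, 0x00, 0x00, // type section: one type, () -> ()
  0x02, 0x0b, 0x01, // import section: 11 bytes, one entry
  0x03, 0x65, 0x6e, 0x76, // module name "env"
  0x03, 0x6c, 0x6f, 0x67, // field name "log"
  0x00, 0x00, // kind: function, type index 0
]);

interface ImportEntry {
  module: string;
  name: string;
  kind: string;
}

// Compiles the bytes and returns the declared imports. Nothing is
// instantiated, so none of the module's code (if it had any) is executed.
function listImports(bytes: Uint8Array): ImportEntry[] {
  // Assumption: the host provides the standard WebAssembly global.
  const wasm = (globalThis as any).WebAssembly;
  const compiled = new wasm.Module(bytes);
  return wasm.Module.imports(compiled) as ImportEntry[];
}
```

A host could compare the result of `listImports(moduleBytes)` against an allow-list before instantiating anything, which is harder to do for arbitrary JS.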

I recently read a paper about security exploits in WebGL, thanks to bugs on shader compilers and drivers.

"Automated Testing of Graphics Shader Compilers"

http://multicore.doc.ic.ac.uk/publications/oopsla-17.html

> I really hope people don't think webassembly is the fault for this

Nah, I think it's pretty clear GP meant "when you insist on writing interpreters/compilers in C++" not that C++ was compiled into wasm.

Yeah, sorry for being unclear - that is what I meant. I don't see wasm as at fault here, it's just a bummer that this new attack surface was introduced by writing the wasm implementation in C++ instead of memory-safe languages. It's not something so complex that it really needs to be C++.

Most (all?) browser wasm backends function by just generating the internal IR used by the existing JS runtime, so it's not especially necessary to write the loader/generator in C++. The generated native module(s) are often cached, also, which diminishes the importance of making the generator fast at the cost of safety.

I wrote all the original encoder and decoder prototypes in JS for this reason - you can make it fast enough, and the browser already has a high-performance environment in which you can run that decoder. When the result is already being cached I think writing this in C++ is a dangerous premature optimization.

Similarly it's common to write decoders as a bunch of switch statements and bitpacking, which creates a lot of duplication and space for bugs to hide. You can build these things generally out of a smaller set of robust primitives to limit attack surface, but that wasn't done here either, despite my best efforts.
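A minimal sketch of the "small set of robust primitives" idea, not the decoder discussed above: every read goes through one bounds-checked `readByte`, and the LEB128 and name readers are built only on top of it. `ByteReader` and `decodeNameList` are hypothetical names, and the name reader handles ASCII only (real wasm names are UTF-8).

```typescript
class ByteReader {
  private readonly view: DataView;
  private pos = 0;

  constructor(bytes: Uint8Array) {
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }

  // The only place that touches the input, and the only bounds check.
  readByte(): number {
    if (this.pos >= this.view.byteLength) {
      throw new RangeError("unexpected end of input");
    }
    const value = this.view.getUint8(this.pos);
    this.pos += 1;
    return value;
  }

  // Unsigned LEB128 limited to 32 bits (at most 5 bytes).
  // Multiplication is used instead of << to avoid signed 32-bit overflow.
  readU32(): number {
    let result = 0;
    for (let shift = 0; shift < 35; shift += 7) {
      const byte = this.readByte();
      result += (byte & 0x7f) * 2 ** shift;
      if ((byte & 0x80) === 0) {
        if (result > 0xffffffff) {
          throw new RangeError("LEB128 value exceeds 32 bits");
        }
        return result;
      }
    }
    throw new RangeError("LEB128 value too long");
  }

  // Length-prefixed name. A hostile length cannot read past the input,
  // because each byte still goes through readByte.
  readName(): string {
    const length = this.readU32();
    let name = "";
    for (let i = 0; i < length; i++) {
      name += String.fromCharCode(this.readByte());
    }
    return name;
  }
}

// Decodes a vector of names: a count followed by that many names.
function decodeNameList(bytes: Uint8Array): string[] {
  const reader = new ByteReader(bytes);
  const count = reader.readU32();
  const names: string[] = [];
  for (let i = 0; i < count; i++) {
    names.push(reader.readName());
  }
  return names;
}
```

With this shape, a decoder for a new section only composes `readU32` and `readName`, so there is no per-section index arithmetic where an off-by-one can hide. Truncated or oversized input fails with a `RangeError` instead of reading out of bounds.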