by kllrnohj 3022 days ago
> Compilers could stop worrying about obscure/old architectures.

No, they wouldn't. They still need to turn WebAsm/IR into assembly, which is the thing they already do today anyway. Nothing changes for compilers, other than the potential for optimizations gets much, much worse as the IR is crippled and restricted compared to the IR they already have.

> Deploying an application onto multiple platforms is no longer a problem.

This has never been the result of CPU instructions. That's a library problem, not an IR problem. WebAsm does nothing to help with this, particularly as it intentionally has no real standard library to speak of.

Or put another way every compiled program is already on a perfectly portable IR called x86_64. Runs on just about every desktop, laptop, and nearly every server in the world. Yet good luck writing a portable "hello world" in it.

It marginally reduces your release artifacts, as you only produce a single webasm set instead of x86, x86_64, armv7, and armv8, but using webasm comes at non-trivial costs, too. Instead of compiling once on a known toolchain you're now compiling millions of times on uncontrolled, unknown toolchains. That's not a great trade-off in many, if not most, circumstances.

> Sandboxing is much simpler.

Sandboxing is already a solved problem using process isolation, which has the nice property of not caring how your process runs at all. What benefit does WebAsm add to this?

> Formal verification becomes possible (the WebAssembly spec actually reads like a spec, unlike the C standard which reads more like a religious text).

WebAsm is an intermediate, not a source. Formally verifying it is about as useful as formally verifying assembly. Which is to say, not useful at all. That doesn't help you verify anything about your code, which was a compiler, optimizer, and god knows what else away from the webasm that was generated.

It's good that the spec is actually a spec, but this isn't a unique trait to webasm and it won't help your code any since your code isn't in webasm. It's still in C/C++, Rust, or whatever else and they all remain just as verifiable (or not) as they always were.


> Or put another way every compiled program is already on a perfectly portable IR called x86_64. Runs on just about every desktop, laptop, and nearly every server in the world.

But not mobile.

(Besides, I'm also somewhat uneasy with accepting that the computing lingua franca will forever be proprietary to Intel, covered by innumerable patents, and backwards compatible to 1978.)

> Sandboxing is already a solved problem using process isolation

If only it were. Process isolation does not protect against kernel attacks or attacks against whatever IPC mechanism you use to call out to the privileged broker process.

> Formally verifying it is about as useful as formally verifying assembly. Which is to say, not useful at all.

You can verify memory safety of the compiled code, which is a useful and important property.

(I think nobody seriously doubts that Web Assembly is memory safe assuming a bounds checked heap, though, so it's not that practically interesting of a result. Maybe it'll be more so when GC support lands.)

> (Besides, I'm also somewhat uneasy with accepting that the computing lingua franca will forever be proprietary to Intel, covered by innumerable patents, and backwards compatible to 1978.)

Well, x86_64 is actually AMD's creation, not Intel's. But replace x86_64 with ARM on mobile and it's the same thing: a portable IR/instruction set does not result in a portable application.

> Process isolation does not protect against kernel attacks or attacks against whatever IPC mechanism you use to call out to the privileged broker process.

Neither does webasm, other than by just not having any features currently. But that's obviously not a viable long-term strategy, certainly not for anything standalone-ish.

> You can verify memory safety of the compiled code, which is a useful and important property.

My native code is perfectly memory safe as well, enforced in hardware even. Has been that way for decades.

Of course that's not what anyone actually means by a "memory safe" language, but as soon as you go plop malloc/free on top of a single webasm allocation you're back to all the same memory-unsafeness of C despite the "memory safe" claims of webasm anyway. A memory safe IR is meaningless outside of the context of being embedded in another process. Aka, when used in a web browser. Or I guess as a massively overcomplicated replacement for Lua.

But the point is verifying the resulting webasm doesn't mean your code was correct, it means your code plus current toolchain selection happened to result in verified code. Whether or not that was the result of a fluke or well defined behavior is not something webasm has any impact on.
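To make the malloc/free point concrete, here is a toy Python model of a bounds-checked linear memory with a bump allocator on top. It is only a sketch: `LinearMemory` and `BumpAllocator` are hypothetical names invented for illustration, not any real WebAssembly API, and the layout (a flag stored right after a buffer) is an assumption chosen to make the overflow visible.

```python
class LinearMemory:
    """Toy model of a bounds-checked linear memory (hypothetical, not a real Wasm API)."""

    def __init__(self, size):
        self.data = bytearray(size)

    def _check(self, addr, length):
        # The sandbox only verifies that the access stays inside the memory.
        if addr < 0 or addr + length > len(self.data):
            raise MemoryError("trap: out-of-bounds memory access")

    def store(self, addr, payload):
        self._check(addr, len(payload))
        self.data[addr:addr + len(payload)] = payload

    def load(self, addr, length):
        self._check(addr, length)
        return bytes(self.data[addr:addr + length])


class BumpAllocator:
    """Minimal malloc stand-in: hands out adjacent chunks and tracks no sizes."""

    def __init__(self, memory):
        self.memory = memory
        self.next_free = 0

    def malloc(self, size):
        addr = self.next_free
        if addr + size > len(self.memory.data):
            raise MemoryError("out of memory")
        self.next_free += size
        return addr


mem = LinearMemory(64)
heap = BumpAllocator(mem)

name = heap.malloc(8)       # 8-byte buffer
is_admin = heap.malloc(1)   # flag that happens to sit right after it
mem.store(is_admin, b"\x00")

# Classic overflow: 9 bytes written into an 8-byte buffer.
# The access is inside the linear memory, so nothing traps.
mem.store(name, b"AAAAAAAA\x01")
overflowed_flag = mem.load(is_admin, 1)

# Leaving the linear memory entirely is what the bounds check does stop.
try:
    mem.store(60, b"too long")
    escaped = True
except MemoryError:
    escaped = False
```

In this model the overflow silently flips the neighbouring flag, while the write past the end of the memory traps. That is the distinction being argued over: the host is protected, the program's own data is not.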

> Of course that's not what anyone actually means by a "memory safe" language, but as soon as you go plop malloc/free on top of a single webasm allocation you're back to all the same memory-unsafeness of C despite the "memory safe" claims of webasm anyway.

That's not true. Web Assembly semantics don't allow interpreting data as code or explicitly messing with the machine stack. This guarantees control flow integrity, preventing problems like ROP or traditional buffer overflows.
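A rough Python model of the indirect-call check may help here. It is a simplification under stated assumptions: `TABLE`, `call_indirect`, and `Trap` are illustrative names, signatures are plain strings, and the table is assumed to live outside the linear memory so the program's own stores cannot rewrite it. The only thing an attacker is assumed to control is the integer index.

```python
class Trap(Exception):
    """Stand-in for a WebAssembly trap."""


def add(a, b):
    return a + b


def mul(a, b):
    return a * b


def negate(a):
    return -a


BINOP = "(i32, i32) -> i32"
UNOP = "(i32) -> i32"

# Function table: (signature, function) pairs, held outside linear memory.
TABLE = [
    (BINOP, add),
    (BINOP, mul),
    (UNOP, negate),
]


def call_indirect(index, expected_sig, *args):
    """Indirect call: bounds-check the index, then check the signature."""
    if not 0 <= index < len(TABLE):
        raise Trap("undefined table element")
    sig, func = TABLE[index]
    if sig != expected_sig:
        raise Trap("indirect call type mismatch")
    return func(*args)


def attempt(index):
    """Call through a (possibly attacker-controlled) index at a BINOP call site."""
    try:
        return call_indirect(index, BINOP, 2, 3)
    except Trap as trap:
        return str(trap)


intended = attempt(0)        # the call the program meant to make
out_of_range = attempt(7)    # forged index outside the table
wrong_type = attempt(2)      # forged index, different signature
same_type = attempt(1)       # forged index, same signature
```

In this sketch the forged index can never reach arbitrary code: it either traps or lands on a declared function of the expected type. It can still be redirected to a different function with the same signature (`mul` instead of `add`), which is a coarser guarantee than "the intended target only".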

While webasm isn't vulnerable to traditional ROP attacks, it also incurs at least as much of a performance penalty as IR + g-free/indirect return.

Memory protection in terms of r/w/x has been solved for roughly three decades now. It would be absolutely trivial to enforce via binary/IR distribution.

At the end of the day webasm is just a shitty IR that only exists because it's the path of least resistance on the web. There's really no point in going from lang -> llvm -> webasm -> llvm -> exec when you can just go from lang -> llvm -> asm.

RWX doesn't prevent ROP!

Sure, but I brought up rwx for their data execution point and g-free/indirect return for ROP mitigation.

However, it only partially addresses JOP (jump-oriented programming, i.e. hijacking calls to function pointers and virtual methods). And there are CFI designs that provide equivalent or better guarantees for native code, such as Clang CFI + SafeStack. In fact, I expect Clang CFI to be more widely adopted in the future… However, the main obstacle to increased adoption has always been overhead, yet WebAssembly has significantly higher overhead.

edit: not to mention that WebAssembly is currently missing security mitigations that long have been standard in native code, such as ASLR, and… maybe stack overflow protection? (It looks like emscripten handles the latter manually by checking STACKTOP against STACK_MAX, but I'm not sure LLVM's native WebAssembly target does.) Maybe these will be addressed in the future, but for now there are some interesting exploitation opportunities.
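For readers unfamiliar with the kind of manual check mentioned above, here is a minimal Python sketch of comparing a stack pointer against a limit on every frame allocation. It is not emscripten's actual code: the `ShadowStack` class, the upward-growing stack, the 64-byte frames, and the 256-byte limit are all assumptions made for illustration. Only the names `STACKTOP` and `STACK_MAX` come from the comment.

```python
class StackOverflow(Exception):
    """Raised instead of letting the stack run into neighbouring data."""


class ShadowStack:
    """Toy stack kept in linear memory, assumed to grow upward from `base`."""

    def __init__(self, base, stack_max):
        self.STACKTOP = base
        self.STACK_MAX = stack_max

    def alloc(self, size):
        """Reserve `size` bytes for a frame, checking the limit first."""
        addr = self.STACKTOP
        new_top = addr + size
        if new_top > self.STACK_MAX:
            # Without this check the frame would silently overlap whatever
            # is placed after the stack region (heap, globals, ...).
            raise StackOverflow("stack exhausted")
        self.STACKTOP = new_top
        return addr

    def restore(self, saved_top):
        """Pop frames by resetting the stack pointer on function exit."""
        self.STACKTOP = saved_top


stack = ShadowStack(base=0, stack_max=256)

frames = 0
try:
    while True:              # simulate unbounded recursion
        stack.alloc(64)      # hypothetical 64-byte frame
        frames += 1
except StackOverflow:
    overflow_detected = True
else:
    overflow_detected = False
```

The point of the sketch is that the check is ordinary generated code, so whether it exists depends on the toolchain that produced the module rather than on anything the format guarantees.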

> However, the main obstacle to increased adoption has always been overhead, yet WebAssembly has significantly higher overhead.

But you get a lot more than security for your trouble using Web Assembly. So the performance-vs.-security tradeoff isn't the only part of the calculus here.

> And there are CFI designs that provide equivalent or better guarantees for native code, such as Clang CFI + SafeStack.

Clang CFI only protects indirect calls. And SafeStack looks like it has issues, according to the Chromium bug: https://bugs.chromium.org/p/chromium/issues/detail?id=505015

"No. We are currently looking at other alternatives (all look grim, though). Before trying to proceed with SafeStack please get the agreement from security folks, since SafeStack doesn't actually sounds too secure any more :("

Indirect calls as opposed to what? C++ virtual method calls are supported by Clang CFI. Direct calls are always safe because the destination address is fixed. (That is, unless you mess with the PLT, but that's what RELRO is for.)

Not sure what's up with SafeStack - though I bet it has to do with more hardware timing attacks, in this case to leak the address. The whole design is a bit of a hack since the only thing preventing the attacker from accessing the safe stack is their (theoretical) inability to guess the address. If only x86-64 hadn't gotten rid of segmentation, so normal memory accesses and stack accesses could actually use entirely separate memory regions… On the other hand, Intel CET should allow for some subset of that functionality on future hardware.

But again, to be fair, one should note that "grim" has a different meaning when the budget for acceptable performance loss is perhaps 1-5%, not 30-50% :P

> No, they wouldn't. They still need to turn WebAsm/IR into assembly, which is the thing they already do today anyway. Nothing changes for compilers, other than the potential for optimizations gets much, much worse as the IR is crippled and restricted compared to the IR they already have.

Most compilers today have separate assembly generation for MIPS, ARM, x86_64. They could turn source into WebAssembly and no more (the job of WebAssembly -> native is left to some other architecture specific compiler).

> This has never been the result of CPU instructions. That's a library problem, not an IR problem. WebAsm does nothing to help with this, particularly as it intentionally has no real standard library to speak of.

If any one language targets WebAssembly, as long as you resolve your libraries within that language, you'll be able to deploy to any target that supports WebAssembly. This is pretty much the de facto solution to the library problem in a variety of ecosystems: in Java you'll make a fat jar and in C/C++/Rust you'll make a statically linked binary.

> WebAsm is an intermediate, not a source. Formally verifying it is about as useful as formally verifying assembly. Which is to say, not useful at all. That doesn't help you verify anything about your code, which was a compiler, optimizer, and god knows what else away from the webasm that was generated.

Are you familiar with binary analysis?

But why use webasm for that when llvm already does it better?

I'm not sure I understand: WebAssembly is the output of a (hopefully) optimizing compiler. LLVM is such a compiler backend. If you use WebAssembly today, you are probably going through LLVM.

Perhaps you meant: why not use LLVM IR instead of WebAssembly? If so, allow me to refer you to this comment[1] (from a bit further down in this thread).

[1]: https://news.ycombinator.com/item?id=16586239

> Nothing changes for compilers, other than the potential for optimizations gets much, much worse as the IR is crippled and restricted compared to the IR they already have.

The performance of an application that doesn't work is worse than that of a crippled application.