Hacker News
by xuejie 2377 days ago
While I do agree WASM makes a lot of sense for the Web, I personally am having doubts regarding treating WASM as a general abstraction for native code as used in the post. For this case, it might suit the job better to have a bytecode that resembles more of underlying machine architecture, rather than a still highly abstracted model like WASM. Please don't get me wrong, I do agree WASM is already one step ahead of, say, JavaScript, but we can do better than that.

The problem with WASM here, is that it really is a bloated model like JVM in its early days: a huge amount of work is needed to bring it close to native speed, which contradicts the original slogan. What's more, people are still planning to add tons of new features to it: https://webassembly.org/docs/future-features/. Before you tell me those are opt-in features, the question I want to raise is this: for an abstraction of a general platform, you would definitely want a widely accepted standard so people know which features to expect. One example is that people know SSE will be available for 64-bit x86 code.
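To make the SSE point concrete, here is a minimal sketch in C of what a guaranteed baseline looks like from the programmer's side. It assumes a GCC-, Clang-, or MSVC-style compiler with default flags; the function names are made up for illustration. On 64-bit x86 the SSE2 check needs no runtime probing, because the baseline is fixed by the platform rather than being opt-in.

```c
#include <assert.h>
#include <stdbool.h>

/* True when this translation unit is compiled for 64-bit x86. */
bool target_is_x86_64(void) {
#if defined(__x86_64__) || defined(_M_X64)
    return true;
#else
    return false;
#endif
}

/* True when the compiler is allowed to emit SSE2 instructions.
   MSVC does not define __SSE2__, so _M_X64 stands in for it there. */
bool sse2_assumed(void) {
#if defined(__SSE2__) || defined(_M_X64)
    return true;
#else
    return false;
#endif
}

/* Assumption: default compiler flags. Passing something like -mno-sse2
   deliberately breaks the baseline and would trip this assertion. */
void check_baseline(void) {
    assert(!target_is_x86_64() || sse2_assumed());
}
```

Calling `check_baseline()` passes on any target: on x86-64 because SSE2 is part of the baseline, and elsewhere because the condition does not apply. An opt-in feature set would need a runtime or load-time probe per feature instead.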

With all those opt-in features, I doubt we can have a proper layer that adapts well to different implementations with different supported features. We might end up with a situation like Rust's, where you can claim a secondary compiler could exist, but in practice people are all using the same compiler/implementation.


And yet, it will probably be used for that and become popular.

Tech doesn't need to be perfect, the best, or even very good to win. They need to have a killer feature and a low cost of adoption.

Wasm seems to be on the right track for that.

> need to have a killer feature and a low cost of adoption

This is a powerful statement actually. Could be applied to any tech startup product.

I totally agree that the best tech doesn't always win; that's why I'm pointing this out. I really wish that 5 years from now we can rely on something that makes sense to be there, not just something that has a low cost of adoption.
It sounds like you're bringing up a register-based vs stack-based VM argument, and claiming that register-based VMs have better performance because their model is closer to the hardware.

My understanding is that this intuition is usually untrue, because a JIT benefits from the stack-based code preserving code flow and thus allowing more efficient code generation.
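For readers unfamiliar with the distinction, here is a small sketch in C that evaluates (2 + 3) * 4 in both styles. The opcodes and struct layouts are invented for illustration and are not real WASM or any real VM's encoding. It only shows how the two encodings differ (implicit operands on a stack versus explicit register operands) and does not measure or settle the performance question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stack-machine opcodes: operands are implicit (top of stack). */
typedef enum { OP_PUSH, OP_ADD, OP_MUL } StackOp;
typedef struct { StackOp op; int32_t imm; /* used by OP_PUSH only */ } StackInsn;

int32_t run_stack(const StackInsn *code, size_t n) {
    int32_t stack[16];
    size_t sp = 0;
    for (size_t i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            assert(sp < 16);
            stack[sp++] = code[i].imm;
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            sp--;
            break;
        }
    }
    assert(sp == 1);  /* a well-formed program leaves exactly one value */
    return stack[0];
}

/* Hypothetical three-address register opcodes: operands are named explicitly. */
typedef enum { R_LOADI, R_ADD, R_MUL } RegOp;
typedef struct { RegOp op; uint8_t dst, a, b; int32_t imm; } RegInsn;

int32_t run_reg(const RegInsn *code, size_t n, uint8_t result_reg) {
    int32_t regs[8] = {0};
    assert(result_reg < 8);
    for (size_t i = 0; i < n; i++) {
        const RegInsn *in = &code[i];
        assert(in->dst < 8 && in->a < 8 && in->b < 8);
        switch (in->op) {
        case R_LOADI: regs[in->dst] = in->imm; break;
        case R_ADD:   regs[in->dst] = regs[in->a] + regs[in->b]; break;
        case R_MUL:   regs[in->dst] = regs[in->a] * regs[in->b]; break;
        }
    }
    return regs[result_reg];
}

/* (2 + 3) * 4 in each encoding. */
static const StackInsn STACK_PROG[] = {
    {OP_PUSH, 2}, {OP_PUSH, 3}, {OP_ADD, 0}, {OP_PUSH, 4}, {OP_MUL, 0},
};
static const RegInsn REG_PROG[] = {
    {R_LOADI, 0, 0, 0, 2},  /* r0 = 2       */
    {R_LOADI, 1, 0, 0, 3},  /* r1 = 3       */
    {R_ADD,   0, 0, 1, 0},  /* r0 = r0 + r1 */
    {R_LOADI, 1, 0, 0, 4},  /* r1 = 4       */
    {R_MUL,   0, 0, 1, 0},  /* r0 = r0 * r1 */
};
```

Both `run_stack(STACK_PROG, 5)` and `run_reg(REG_PROG, 5, 0)` return 20. The stack form carries no register names, so the code generator is free to assign machine registers itself; the register form bakes in an allocation that may or may not fit the target.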

No, I'm not talking about register-based vs stack-based VMs; that's a totally different topic. I'm just saying WASM is still quite distant from real hardware, making it a non-trivial task to run the code performantly. In fact, if you look at asm.js, which is the original inspiration for WASM, it is a much closer model to real hardware.

And of course a JIT can make WASM fast, but if you look around, building a performant WASM JIT still remains terribly hard; some implementations even need LLVM to perform optimizations. I'd say if this is the case, we must've chosen the wrong model.

> some implementations even need LLVM to perform optimizations. I'd say if this is the case, we must've chosen the wrong model.

Why? Optimizing machine-independent code for a particular machine is part of the "core business" of LLVM, up to the point where a sufficiently capable bytecode/optimizer becomes comparable to LLVM.

OTOH, if the main argument here is the size/speed/other weight of LLVM, then of course the host machine that wants to run WASM only needs a tiny subset of LLVM: no frontend, a single backend, only a subset of optimization passes... There is also a tradeoff to make in leaving out optimization passes that would improve the code a bit but are too heavy for the host machine.

There's nothing wrong with LLVM itself; my point is we could've picked a lower-level model which doesn't need a complicated setup like LLVM. Or one where you can directly ship the optimized compiled result of LLVM; that would be a much better world.
Doesn't ARM code emulated on x86 (and vice versa) perform even worse than WebAssembly? Isn't that essentially what you would get with a lower-level "optimized compiled result of LLVM"?
Does the fact that ARM is a bad choice disprove all choices other than WASM? I'm not sure this is a good argument here. There are many more lower-level bytecodes than just ARM.
I believe that you are looking at wasm with different priorities than intended. The two fundamental properties are that it must be fully portable and fully secure by default (as in, any insecurity needs to be explicitly and statically declared in the bytecode).

Performance comes only after those two. LLVM, as far as I know, has a completely different order of priorities.

Personally I don't see why we cannot get all three.
Why is it hard? Isn't wasm designed so you can statically and quickly compile pieces of it or the whole thing to native code, rather than needing to do all the tricks dynamic language runtimes do?
That is their very nice slogan, while in reality WASM still has quite a way to go to compete with native code.

Something I see these days is that when code speed is measured, people compare against JS rather than native code, but when portability is discussed, the comparison is made against native code, not JS.

What major implementations are using LLVM? Firefox is using Cranelift, Chrome is using V8, both of these shouldn't be using LLVM, AFAIK, or am I wrong?
wasmer [1] has an LLVM backend, and WAVM [2] uses LLVM as its backend. I could be wrong, but last time I checked, Cranelift is only meant to be the next-generation WASM engine used in Firefox; it is not yet in production.

And actually the argument is: all of V8, Firefox/Cranelift and the LLVM used in wasmer require non-trivial work to make WASM fast, which shouldn't be needed given a different model.

[1] https://github.com/wasmerio/wasmer/tree/master/lib/llvm-back... [2] https://github.com/WAVM/WAVM

I highly doubt that there is another model that would not require non-trivial work to be fast, while also being reasonably portable to different architectures.

Sure, we could be faster by just sending x86 machine code, but that isn't really the point.

LLVM is more usable for cloud vendors and such who use wasm outside of the browser. So we do have 3 quality implementations already.
> The problem with WASM here, is that it really is a bloated model like JVM in its early days,

Isn't WASM (as of its MVP) quite a simple VM model compared to other VMs?

I agree with your concern about the new features though. It might introduce another fragmentation hell.

> bytecode that resembles more of underlying machine architecture

In what way? This discussion lacks specifics. Doesn't this also risk tying you to a specific machine, which is the opposite of the intent?

That is not an issue regarding bytecode formats in general, given that on some platforms only the kernel does the final compilation to machine code, and they have been a common executable format since the early '60s.

However, I do agree with the complaint about the WASM-everywhere fashion.

Besides, doesn't LLVM already have an IR which serves as one such higher-level abstraction?
LLVM IR is unstable, way too big in scope, and not as machine-independent as people think. There have been several efforts to use it as a target-independent high-level bytecode, and they either are very platform-specific single-vendor affairs (Apple) or were retired in favour of a simpler new language that doesn't have its problems (SPIR became SPIR-V, PNaCl became WebAssembly).
No, LLVM IR is machine specific. Any "native" language is, since the ABI of a struct will depend, e.g., on the size of pointers for that platform.

E.g., consider in C: int foo[sizeof(void*)];
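A slightly expanded sketch of that declaration, for illustration only: the helper functions and the `struct node` type are hypothetical additions, not something from the comment above. It shows that both the array length and a struct's layout are decided by the C front end from the target's pointer size, before any IR is emitted, so the resulting IR already has the target baked in.

```c
#include <assert.h>
#include <stddef.h>

/* The array from the comment: its length is fixed at compile time
   by the target's pointer size. */
int foo[sizeof(void *)];

/* Element count of foo: 4 on typical 32-bit targets, 8 on typical 64-bit ones. */
size_t foo_len(void) {
    return sizeof(foo) / sizeof(foo[0]);
}

/* Hypothetical struct: its total size and the offset of `next` depend on
   the pointer size and alignment rules of the target ABI. */
struct node {
    char tag;
    void *next;
};

size_t node_next_offset(void) {
    return offsetof(struct node, next);
}
```

Compiling this for a 32-bit and a 64-bit target and comparing `foo_len()`, `sizeof(struct node)` and `node_next_offset()` gives different constants on each, which is the sense in which the compiled output is not portable across those targets.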

Open source LLVM IR is machine specific, but it doesn't have to be, as proven by watchOS bitcode, or PNaCl.
Both of the examples you gave were very carefully architected to make sure that was the case.
Which doesn't prevent someone from contributing such variants back.
> it doesn't have to be, as proven by watchOS bitcode, or PNaCL.

Both of those have fixed 32-bit pointer sizes and are little-endian. When you compile for watchOS bitcode or PNaCl you just target a single virtual machine & "system" ABI. LLVM IR or any related techniques won't ever allow you to produce a 32 / 64 bit or ARM / x86 app that is able to leverage the whole feature set of the platform from the same bitcode.

Yet watchOS migrated from 32 bit to 64 bit.

It is possible; the LLVM project just needs to actually want to support such use cases in a portable way.

> Yet watchOS migrated from 32 bit to 64 bit.

That is what they said in the press release, but in practice they migrated from "classical" 32-bit ARM to ILP32 (akin to the x32 ABI on Linux), so the size of pointers etc. does not change from 32 bits. You get more registers and similar niceties, but that is not moving to 64 bit, just having nicer 32-bit execution on 64-bit CPUs. If you want proper aarch64 support on watchOS you have to recompile.

Google created pretty much this, called it PNaCl and shipped it in Chrome. It now has been retired in favor of WASM.
Because Mozilla went political and came up with asm.js as counter technology.
Or, as some others (e.g., Google itself) would say:

> Because Mozilla went political and came up with asm.js as better technology.

I am pretty sure that asm.js would not have happened if Chrome already had the market share it enjoys nowadays.

WASM is still catching up to PNaCl in performance, hardly better.

I'm pretty sure it is not stable and not designed for such a use case.
Apple kind of disagrees with watchOS bitcode.
"not designed for that use case" != "isn't used for that use case in practice"

Apple has tight control over the bitcode version and the target platforms supported by Xcode. That's not the same thing as accepting arbitrary LLVM bitcode files.

Nothing prevents the LLVM project from adopting such a variant, other than unwillingness to do so.