by basique 1032 days ago
The JVM can't really be properly sandboxed, though. Even the JDK developers have stopped trying and deprecated SecurityManager. On the other hand, WASM is specifically designed so that a module can't do anything outside itself unless the host gives it functions that actually do something externally. Besides, how would you even properly run C code on the JVM?

> The JVM can't really be properly sandboxed

What happened is that people realized that blacklisting does not work. Whitelisting is the correct approach. There is absolutely zero reason why WASM would be better for that than the JVM: the JVM spec in itself has no visible side effects, not even printing, so it can’t do anything nefarious (besides CPU vulnerabilities, but those also apply to WASM).

And you would run C code in a completely trivial way: you have a huge array which is your memory, and you read/write bytes to it.
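A minimal sketch of the "huge array" idea above, in its most naive form. The class and method names are hypothetical, and little-endian layout is an assumption; this version assembles every int from four separate bytes and relies on Java's array bounds checks to stop the guest from reading outside its memory.

```java
// Hypothetical sketch: C-style memory modelled as one flat byte array.
final class LinearMemory {
    private final byte[] bytes;

    LinearMemory(int size) {
        bytes = new byte[size];
    }

    // Single-byte access, returned as an unsigned value in 0..255.
    int loadByte(int addr) {
        return bytes[addr] & 0xFF;
    }

    void storeByte(int addr, int value) {
        bytes[addr] = (byte) value;
    }

    // 32-bit store, written one byte at a time (little-endian is an assumption).
    void storeInt(int addr, int value) {
        bytes[addr]     = (byte) value;
        bytes[addr + 1] = (byte) (value >>> 8);
        bytes[addr + 2] = (byte) (value >>> 16);
        bytes[addr + 3] = (byte) (value >>> 24);
    }

    // 32-bit load, rebuilt from four bytes.
    int loadInt(int addr) {
        return (bytes[addr] & 0xFF)
             | (bytes[addr + 1] & 0xFF) << 8
             | (bytes[addr + 2] & 0xFF) << 16
             | (bytes[addr + 3] & 0xFF) << 24;
    }
}
```

A C pointer becomes a plain int index into this array, and an out-of-range index throws ArrayIndexOutOfBoundsException instead of touching host memory.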

> Whitelisting is the correct approach.

Isn't most of the point of using the JVM thrown out if you have a limited standard library?

> And you would run C code in a completely trivial way: you have a huge array which is your memory, and you read/write bytes to it.

That sounds like it would have terrible performance. Would every read of an int have to manually build it from the 4 bytes it's made of? This just seems like something the JVM won't handle that well.

Besides, this would mean that if you want to run a language like C#, which has references and value types (and you can make a reference to a value type from a pointer into a buffer), you would have to emulate them, which will hamper performance, or just use the same strategy as you used for C.

> That sounds like it would have terrible performance. Would every read of an int have to manually build it from the 4 bytes it's made of?

No, a method can have a native implementation, or be a compiler intrinsic. Java has ByteBuffers (https://download.java.net/java/early_access/panama/docs/api/... ), and they have put/get{Long,Short,etc}, which will map to either a native single pointer read of the given size, or even be optimized into a more efficient read in the context of a bigger method, say by vectorizing them inside a loop (as the compiler knows about this method and can handle it specially).
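For illustration, here is roughly what that looks like with the standard `java.nio.ByteBuffer` absolute accessors. The wrapper class and its method names are hypothetical, and the little-endian order is an assumption. Whether a given JVM compiles these calls down to single loads is up to its JIT, so treat this as a sketch and not a performance claim.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: guest memory backed by a direct ByteBuffer.
final class BufferMemory {
    private final ByteBuffer buf;

    BufferMemory(int size) {
        // Off-heap buffer; the byte order choice is an assumption.
        buf = ByteBuffer.allocateDirect(size).order(ByteOrder.LITTLE_ENDIAN);
    }

    // Each access is one library call instead of four byte reads.
    int loadInt(int addr) {
        return buf.getInt(addr);
    }

    void storeInt(int addr, int value) {
        buf.putInt(addr, value);
    }

    long loadLong(int addr) {
        return buf.getLong(addr);
    }

    void storeLong(int addr, long value) {
        buf.putLong(addr, value);
    }

    // Unsigned single-byte read, useful for checking the layout.
    int loadByte(int addr) {
        return buf.get(addr) & 0xFF;
    }
}
```

Out-of-range addresses still fail safely: the absolute get/put methods throw IndexOutOfBoundsException.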

> Besides, this would mean that if you want to run a language like C#, which has references and value types (and you can make a reference to a value type from a pointer into a buffer), you would have to emulate them, which will hamper performance, or just use the same strategy as you used for C

So we are back at where WASM is? Though I think that there is nothing inherently impossible about mapping C# to more efficient Java primitives, especially since value types have the more constrained semantics: treating each instance as a distinct object is semantically the same, just less optimal. A pointer to a region of the aforementioned ByteBuffer with a special memoryToCSharpObject() that has some compiler intrinsics wouldn't perform too badly, I believe.
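A rough sketch of what such a "pointer into the buffer" could look like, without any intrinsics. Everything here is hypothetical: the `PointView` name, the two-int struct layout, and the field offsets are made up for illustration and do not reflect how any real C#-on-JVM compiler works.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: a struct { int x; int y; } viewed in place inside a buffer.
final class PointView {
    static final int SIZE = 8; // two 4-byte fields (assumed layout)

    private final ByteBuffer mem;
    private final int base;    // the "pointer": byte offset of the struct

    PointView(ByteBuffer mem, int base) {
        this.mem = mem;
        this.base = base;
    }

    int x() {
        return mem.getInt(base);
    }

    int y() {
        return mem.getInt(base + 4);
    }

    void setX(int value) {
        mem.putInt(base, value);
    }

    void setY(int value) {
        mem.putInt(base + 4, value);
    }
}
```

Two views created with the same base alias the same bytes, which is the reference-to-a-value-type behaviour being discussed; the cost is one small wrapper object per reference unless the JIT eliminates it.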

> So we are back at where WASM is?

Exactly. By the time you've done all that, what's the point? You'll have reimplemented wasm on top of the jvm in a way that:

- Can't use any of the java standard library (which all existing java code depends on)

- So you also can't use any existing java code

- You need a compilation step for any existing code in other languages to convert it into your special limited java syntax

- And the result will run slower than wasm because of the extra layer of abstraction going on

The one advantage is that you can output everything as standard .class files, so it'll be easier to pull this code into an existing java application.

And yes, you could absolutely do that if you want and you see value in it. Heck - maybe it'd be nice to have a wasm-to-java-class converter to let you reuse all the wasm infrastructure.

But this all sounds like the java equivalent to asmjs, which we already tried. Asmjs was the precursor to wasm. It was built on top of javascript primitives in much the same way as the proposal here builds on top of java primitives. As I understand it, the reason we ended up with wasm instead of asmjs was that:

- Wasm was easier to optimize, had a smaller file size and was faster to load at runtime (since you don't need a javascript compiler)

- It's much easier to make VMs for wasm in lots of languages and environments. Wasm modules can be loaded into programs written in python, rust, C, java, go, javascript, etc without some weird unnecessary dependency on javascript.

> Can't use any of the java standard library (which all existing java code depends on)

One can surely cherry-pick quite a lot out of it that is safe to use, e.g. anything not using a `native` implementation can, as per the earlier point about the JVM spec, also end up being safe. So a lot of existing code would run without any change.
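As a sketch of the whitelisting idea, a class loader can refuse to resolve any class name that is not on an explicit allowlist. The class name and the `canLoad` helper are hypothetical, and this alone is not a sandbox: a real one would also have to deal with reflection, resource limits, and which classes are actually safe to expose.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch: only allowlisted class names can be resolved.
final class AllowlistClassLoader extends ClassLoader {
    private final Set<String> allowed;

    AllowlistClassLoader(Set<String> allowed, ClassLoader parent) {
        super(parent);
        this.allowed = new HashSet<>(allowed);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        // Deny by default: anything not explicitly listed is treated as missing.
        if (!allowed.contains(name)) {
            throw new ClassNotFoundException("not on allowlist: " + name);
        }
        return super.loadClass(name, resolve);
    }

    // Convenience helper for demonstrations and tests.
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Code defined by such a loader has its class references resolved through it, so a class that mentions `java.io.File` would fail to link unless that name were added to the list.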

> Re: compilation step

Why? It has nothing to do with Java, the language, only the JVM class file format. Scala/Kotlin produce bytecode directly, not Java code.

> Run slower

Why would it? Wasm can run fast having grown out from asm.js, yet a JITted runtime that runs half of the internet can't?

> It's much easier to make VMs for wasm in lots of languages and environments

There are plenty of toy JVMs out there as well, the core really is not difficult.

And the point of all this would be that large chunks of the JVM could have been reused, which already runs on every platform with top-notch performance, without all the growing pains that WASM will experience. Also, the JVM's type safe cross-language capabilities are already here, while they are very experimental and rudimentary in the case of WASM. I can just call a Kotlin class from Clojure, or Jython or whatever, and vice versa.

> One can surely cherry-pick quite a lot out of it that is safe to use, e.g. anything not using a `native` implementation can, as per the earlier point about the JVM spec, also end up being safe. So a lot of existing code would run without any change.

Java already tried - many times - to have a "safe subset" of the language. As I understand it, the effort was finally abandoned with the deprecation of SecurityManager. But even if you made a safe subset of java, you've only gone from 1 working browser language (javascript) to 2 (javascript and java). Well, and the compile-to-JS & compile-to-JVM languages.

> Also, the JVM's type safe cross-language capabilities are already here

But that leaves out a lot of languages that people care about. Important languages, like C, C++, Rust, C#, Go and Python. These languages don't compile to the JVM because of limitations in the java bytecode format and type system.

I mean, how would you make C or Go run on the JVM?

Perhaps java's type system could be extended to support these languages too. But it would probably take 20 years of committee meetings to do it. And the time to start that work was 20 years ago. At this point it's much easier to start fresh with something new like wasm.

[Edit: In another thread you mentioned graalvm has been made to support any LLVM code. That's great, and perhaps addresses this point entirely. I don't know enough about it to know.]

> Why would it [run slower]? Wasm can run fast having grown out from asm.js, yet a JITted runtime that runs half of the internet can't?

The challenge is making existing C code run at near-native speeds. How would you do that with your proposal? You'd need to first compile your C to java classes and then run the code in the JVM. But java has no native pointers or direct access to an allocator. Java's GC alone would incur a massive performance penalty to C code.

Asmjs was made fast by making javascript engines contain a second, separate compiler for the asmjs dialect. The compiler detected asmjs explicitly and essentially used asmjs as a weird, expensive syntax for a different language. I guess java could do that too. We could have a special .class file syntax subset and add special compiler logic to the jvm to optimize that particular dialect. But if you're going to go down that road, why bother with the JVM? Javascript is already in the browser. For that matter, why not just compile your java classes to javascript?

> There are plenty of toy JVMs out there as well, the core really is not difficult.

I'm confused - are you proposing just porting across the core of java, or java with a lot of its standard library? Given the size of the JRE, those are two very different propositions! The former is possible but pretty useless, and the latter is a massive job that the java community has already tried and as I understand it, eventually given up on.

> Also, the JVM's type safe cross-language capabilities are already here,

Oh no, this keeps getting worse. You want to put java's FFI system in browsers too?

The problem with Java's various FFIs is that (AFAIK) they're not designed to be a security boundary. We'd need to redesign them all. The surface area is massive and given java's historically abysmal security record, I don't trust the java engineers to make any of this stuff safe or secure. And neither should you.

No, it sounds like you want to bring most of the JVM into the browser to solve a problem I don't have (run existing java code in the browser). And this still doesn't address the problem that wasm solves (let us run any language in the browser, safely and without a loss in performance). And I can't see any straightforward way to address the security problems or the performance problems that this would entail.

In comparison, I'm much happier with the design of wasm. By making a small, simple new bytecode format which explicitly supports C and C-like languages, all wasm languages run fast by default. Wasm is dead easy to sandbox and easy to audit. And this gives us the best of both worlds, since we should be able to compile Java to wasm like any other language. Better yet - this way you can bring in as much of the JVM as you want into your projects, and there's no need to solve the impossible problem of making the JVM secure.