by BrendanEich 4533 days ago
Java has a bytecode for its client embedding; so does Flash ActionScript. This led to trouble. From http://brendaneich.github.io/Strange-Loop-2012/#/27, some pros for JS and cons for bytecode:

* Dynamic typing ⇒ no verification

* Type inference ⇒ delayed optimization

* Would bytecode compress as well?

* Bytecode standardization would suck

* Bytecode versioning would suck more

* Low-level bytecode is future-hostile

Remember Java bytecode backward compatibility hampering language evolution, in the generics (erasure) debate and result. Then they broke bytecode compat anyway.

Flash has two language implementations in it, one for AS2 and the other (Tamarin) for AS3. Only way to be sure about AS2 compat!

In many ways, with JS you have one problem; add bytecode and now you have two.


But asm.js is a bytecode, isn't it? Just with a clever-but-weird encoding that allows backward compatibility.

In the same manner, one can deliver, for example, x86 machine code in JS-encoded form. Just encode opcodes as, say, "eax = 1" instead of "\xB8\x01\0\0\0".
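To illustrate the encoding idea, here is a toy sketch that rewrites that one instruction as JS text. The function name movEaxToJs is made up for illustration, and it assumes the only supported instruction is `mov eax, imm32` (opcode 0xB8 followed by a little-endian 32-bit immediate). It is nowhere near a real x86 translator.

```javascript
// Hypothetical toy translator: handles only `mov eax, imm32` (opcode 0xB8).
function movEaxToJs(bytes) {
  if (bytes.length !== 5 || bytes[0] !== 0xB8) {
    throw new Error("unsupported instruction");
  }
  // Copy the bytes into a fresh buffer so the immediate can be read as an int32.
  const view = new DataView(Uint8Array.from(bytes).buffer);
  const imm = view.getInt32(1, true); // x86 immediates are little-endian
  return "eax = " + imm + ";";
}

// The byte string from the comment above, "\xB8\x01\0\0\0".
const jsText = movEaxToJs([0xB8, 0x01, 0x00, 0x00, 0x00]);

// The JS-encoded form is ordinary source text that any engine can run.
const eaxAfter = new Function("var eax = 0; " + jsText + " return eax;")();
```

Here jsText is the string "eax = 1;" and eaxAfter is 1. The information content is the same in both forms; only the encoding differs.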

No, asm.js is a JS subset. "bytecode" as boosted here would be a non-subset, like JVML to Java source.

Sure, bits is bits. Doesn't matter if you're after gzipped good results. But bytecode hopes spring eternal and the hopers do not want gzipped, minified, Emscripten-produced asm.js. They want a different syntax.

In all the ways that matter, asm.js is a bytecode with a funny encoding and peculiar semantics related to that. Denying it doesn't really help.

We're all wishing for a sane bytecode for a change. It's not just syntax.

You didn't respond to the "now you have two problems" point.

Keeping asm.js a subset of JS avoids all the back-compat-locking/future-hostile-lowering problems. And engines have only one parser to make super-fast. (Already there.)

This is a significant win. What your "sane" means is mostly aesthetics. asm.js already has int32 and uint32 conversions and casts. There is no big semantic gap for vanilla C/C++ source to JS. Typed array views help a lot here; JS's built-in operators and a few helpers in ES6 (Math.imul, polyfillable) do the rest.
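A rough sketch of the two helpers mentioned here, Math.imul and typed array views, in plain JS. The names mulInt32, HEAP32 and HEAPU8 are illustrative assumptions (HEAP32/HEAPU8 merely resemble what Emscripten-style output uses); only Math.imul, ArrayBuffer and the typed array constructors are standard.

```javascript
// Sketch only: mulInt32, HEAP32 and HEAPU8 are illustrative names.

// C: int32_t mul(int32_t a, int32_t b) { return a * b; }
function mulInt32(a, b) {
  // Math.imul multiplies its arguments as 32-bit integers and returns an int32.
  return Math.imul(a | 0, b | 0);
}

const big = 0x7fffffff;
// Exact low 32 bits of the product, as C would compute on wrapping int32.
const viaImul = mulInt32(big, big);
// A plain double multiply exceeds 2^53 here, so low bits are rounded away
// before |0 truncates.
const viaDouble = (big * big) | 0;

// C-style memory: one ArrayBuffer, several typed views over the same bytes.
const buffer = new ArrayBuffer(16);
const HEAP32 = new Int32Array(buffer);
const HEAPU8 = new Uint8Array(buffer);

HEAP32[1] = -1;            // like *(int32_t *)(mem + 4) = -1;
const byteAt4 = HEAPU8[4]; // like *(uint8_t *)(mem + 4); all bits set, so 255
```

viaImul is 1 while viaDouble is not, which is why the multiply needs a helper while add and subtract get by with the built-in operators plus a coercion.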

The non-vanilla gaps of note are mostly gaps in JS (e.g., int64, uint64, SIMD), which we're filling in ES7 for many reasons.

Shared memory threads are indeed a gap to confine to a checked subset, not push into JS along with data races (VM-level and usercode-level -- this is fatal). We're working on that too, but it's not a "bytecode" issue _per se_.

If you continue to believe that "it's not just syntax", and you have something in the way of semantics other than the above in mind, please state it explicitly.

I was thinking of int types, SIMD, shared memory with an explicit model, and arbitrary jumps. It's nice that some of those are getting fixed.

I would also like to see a bytecode with structures (and alignment control), nice code generation and execution and some form of virtual memory for resources.

What I don't understand is why Mozilla didn't define a bytecode and a to-JS compiler for it. Browsers without support would have been just as slow, but there would have been much more room for evolution.

I'm almost expecting Mozilla to pull a Trojan any day now: define a bytecode that compiles to asm.js and declare asm.js deprecated.

> What I don't understand is why Mozilla didn't define a bytecode and a to-JS compiler for it

If nothing else, Mozilla does not have nearly as much money as Google, and really cannot afford a "tear down everything and rebuild it" approach like PNaCl. An advantage of OdinMonkey is that it is able to reuse a huge chunk of the SpiderMonkey infrastructure. If you look at the diagram in the article with the red arrow, the "Ion-compile" step is the same size as the others, but in reality it is a huge number of lines of code, representing multiple engineer-years of ongoing work.

Much of the work the article describes, such as parallel Ion compilation, was carried out mostly for the benefit of non-asm.js compilation, but because OdinMonkey shares the infrastructure, it was able to benefit from it without having to create a new implementation from scratch.

Beyond the engineering of the implementation itself, asm.js itself is a natural evolution of Emscripten-style JS, which had already successfully demonstrated that it can be used for large applications, and that other non-Mozilla browsers are interested in it enough to have done optimization work for it. This reduces the risk that the design itself is technically broken in some way that wouldn't be apparent until people try to run large production-ready applications, as well as the risk that only Firefox will ever be able to run asm.js code at a decent speed.

int types are in JS and have been since 1995, due to the bitwise logical and shift ops.
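A few lines showing what that means in practice. This is just standard operator behavior; the variable names are only for illustration.

```javascript
// The bitwise and shift operators coerce their operands to 32-bit integers,
// so they double as int32/uint32 casts.
const INT32_MAX = 0x7fffffff;

const wrapped = (INT32_MAX + 1) | 0; // signed wraparound: -2147483648
const asUnsigned = -1 >>> 0;         // reinterpret as uint32: 4294967295
const truncated = 3.7 | 0;           // truncates toward zero: 3
const negTruncated = -3.7 | 0;       // truncates toward zero: -3
const shifted = 1 << 31;             // result is a signed int32: -2147483648
```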

Arbitrary jumps are a deoptimizer and verifier hazard (Java's verifier had an O(n^4) complexity DoS attack demo'd by Michael Franz and his group at UCI). Do not want.

SIMD is coming and wanted in hand-coded JS too, not a bytecode issue per se.

> What I don't understand is why Mozilla didn't define a bytecode and a to-JS compiler for it. Browsers without support would have been just as slow, but there would have been much more room for evolution.

You mix speed ("just as slow" -- wait, we're fast at asm.js input, faster on startup than PNaCl -- did you read the post?) with "room for evolution". I just argued above that having two syntaxes hurts evolution. Please separate speed from evolution and address my argument.

Mozilla is in no position, philosophically or market-share-wise, to "pull a Trojan". Also, my argument stands no matter who is making it. No ad hominem fallacies, please!

JVML is not a bytecode. The bytecode syntax is just a disguise. JVML is a high-level language that prescribes a certain object / method / inheritance model. Methods are associated with objects according to specific vtable / vinterface rules.

OTOH, asm.js is defined in terms of value types + function pointers. Just call the function pointer with the right arguments. Bring whichever objects / closures you like.

PS. Gripe of the day: 64bit computing is here (even ARM supports it) and asm.js doesn't seem to be prepared.

Lack of 64-bit ints is a JS problem, asm.js gets them via ES7. See

https://bugzilla.mozilla.org/show_bug.cgi?id=749786

and the value objects strawman under construction for ES7.

> No, asm.js is a JS subset. "bytecode" as boosted here would be a non-subset, like JVML to Java source.

Sorry for quoting Wikipedia, but bytecode is just a form of instruction set designed for efficient execution by a software interpreter.

Maybe I'm mistaken on this, but from reading about asm.js I got the impression that asm.js-aware browsers take a different approach to asm.js code and treat it more like a weirdly-encoded bytecode, not as ordinary JS source. Or am I misunderstanding things?

If so, asm.js is a bytecode. Whether there's a correspondence between it and other languages doesn't matter for determining if it's bytecode or not; that's another (useful, but not related to being bytecode) property.

> They want a different syntax.

I don't think syntax matters that much, it's mostly semantics. Probably.

Actually, asm.js is just JavaScript. Basically, Mozilla looked at what sort of JavaScript code the different JITs already handle really well, and made a specification out of it. So even in Chrome, asm.js will run very efficiently. Mozilla figured out a way to write JavaScript code that makes type information easy to extract, which in turn makes it easy to AOT-compile. For instance, the following code:

    function asmjs(i) {
        i = i|0;
        return (i + 1)|0;
    }

is valid JavaScript, and you can easily write this in your own programs. The "|0" means that the value will be converted to a 32-bit integer, because that is how the JavaScript standard specifies bitwise OR. As an optimization, you can use this as a type annotation, kind of like writing "int i = 0;". This is what asm.js is in a nutshell, and why it's so easy to implement a special compiler for it.

You are replacing the "bytecode" objection, which is about syntax, with your own non-objection equating asm.js with a bytecode like JVML. I'm happy you're ok with asm.js, but those who are not, and who demand "bytecode", do care about syntax first.
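For anyone who wants to try the `i|0` example from this thread, here is a minimal sketch of it wrapped in an asm.js-style module. The module name IncModule and the 64 KiB heap size are assumptions for illustration; the "use asm" directive and the (stdlib, foreign, heap) parameter shape follow the asm.js convention. An engine without asm.js support simply runs it as ordinary JS.

```javascript
// Sketch of an asm.js-style module; IncModule is a made-up name.
function IncModule(stdlib, foreign, heap) {
  "use asm"; // hint to asm.js-aware engines; a no-op string elsewhere

  function inc(i) {
    i = i | 0;          // parameter annotation: i is an int
    return (i + 1) | 0; // return annotation: result is a signed int
  }

  return { inc: inc };
}

// Runs in any JS engine, whether or not it special-cases asm.js.
const mod = IncModule(globalThis, {}, new ArrayBuffer(0x10000));
const fortyTwo = mod.inc(41);          // 42
const wrapsAround = mod.inc(0x7fffffff); // int32 wraparound: -2147483648
```

The same source text serves as both the ordinary-JS fallback and the input to an ahead-of-time compiler, which is the subset argument made earlier in the thread.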