Hacker News new | ask | show | jobs
by sklogic 3782 days ago
They're solving a non-issue. The rest of the world is perfectly fine with the statically typed, well designed languages that are easy to compile. And only the web world is so obssessed with smart compilers compensating (impressively, but still far from being sufficient) for multiple deficiencies in the language design.
2 comments

> The rest of the world is perfectly fine with the statically typed, well designed languages that are easy to compile.

Python, Ruby, PHP, and Perl aren't "the rest of the world"? As far as compilers are concerned, all of those languages have more troublesome semantics than JavaScript does.

> And only the web world is so obssessed with smart compilers compensating (impressively, but still far from being sufficient) for multiple deficiencies in the language design.

You have no idea how much compilers have to compensate for the deficiencies in C and C++'s design.

Nobody really cares about their performance. They're just fine with their simple interpreters. Web is different, there is no choice, no fallback to C.

And no, thank you kind sir, but I've got a very good idea of what compilers are doing wrt. C deficiencies, I was writing OpenCL compilers for 6 years at least. Besides aliasing stupidity and byte-addressing there is nothing really bad to compensate for.

> Web is different, there is no choice, no fallback to C.

asm.js and Web Assembly.

> Besides aliasing stupidity and byte-addressing there is nothing really bad to compensate for.

Aliasing issues, C++ heavy reliance on virtual methods, too many levels of indirection in the STL, overuse of signed integers due to "int" being easier to type interfering with loop analysis, const being useless for optimization, slow parsing of header files...

All of this!

For example in JS, once you prove that "o.f = v" is not going to hit a setter (or you put a speculative check to that effect), then you know that this effect does not interfere with "v = o.g" (provided you check that it's not a getter). JS VMs are really good at speculating about getters and setters. The result is that alias analysis, and its clients like common subexpression elimination, are super effective. It's only a matter of time before we're creating function summaries that list all of the abstract heaps that a procedure can clobber so even a function call has bounded interference.

This is totally different from C. There, making sure that accesses to o->f and o->g don't interfere is like pulling teeth. And the user can force you to assume that they always interfere by using -fno-strict-aliasing, which is hugely popular.

The fallback-to-C paths on the web aren't that great, though. At least not yet. Once you fall back to C, you're sandboxed into a linear heap with limited access to the DOM and JS heap. But we'll fix that eventually. :-)

Web Assembly is not even there yet, asm.js up until recent was more a toy and a standalone runtime, I have not seen it being routinely used for a fallback, nothing like, say, python with C modules.

As for virtual methods, it is a problem of a bad C++, devirtualisation may help, but nobody cares in general. We have a cool curiously recurring template pattern instead.

But for the integers you're right. And signedness is still the most annoying source of bugs in the infamous LLVM instcombine.

Headers are a frontend issue, nothing to do with the optimisations.

> Web Assembly is not even there yet, asm.js up until recent was more a toy and a standalone runtime, I have not seen it being routinely used for a fallback, nothing like, say, python with C modules.

That's because (a) page performance is frequently gated on things other than JavaScript, so people don't go through a lot of trouble to write C++; (b) many modules that would be written in C in Python are provided by the browser itself; (c) JS itself is usually fast enough, since the gap between JS and C++ is much less than the gap between Python and C++.

> As for virtual methods, it is a problem of a bad C++, devirtualisation may help, but nobody cares in general.

Huh? Tons of people care about devirtualization! Much of the reason Firefox builds go to the trouble of PGO (and it is a huge amount of trouble) is to get devirtualization.

> As for virtual methods, it is a problem of a bad C++

So I could say the same thing about JavaScript: if it's slow, you're writing "bad JS". But you would rightly reject that as invalid: if the code people write in practice is slow, then the problem is with the language encouraging people to write slow code. The point is that the same thing applies to C++.

See? The evolution of the language and its ecosystem was driven by the peculiar web needs. No surprise the language became what it is now. No surprise it sucks shit every time it spills out of its native domain.

As for C++ - there is a choice. With JS the choice is much more limited. You either rely on non-standard asm.js behaviour, lose DOM access with web assembly, or tolerate the stupidity and limitations of the language that far overgrown its tiny niche.

Btw., I am yet to hear how a language with a fixed syntax and no macros can be perceived as "highly expressive".

Re: int, isn't it the reverse? I.e. don't compilers find loops using ints easier to optimise because they can assume the counter does not wrap?

Also, what do you mean by level of indirection in the STL? Pointer chasing or, I suspect, layers of layers of tiny functions that need to be inlined for reasonable performance?

> statically typed, well designed languages that are easy to compile

Which ones are these? All of the statically-typed and well-designed languages I can think of are, at the least, hard to compile well, if not hard to compile in the first place. (Haskell and Rust both come to mind; there are few Haskell compilers other than GHC, and no Rust compilers other than rustc.) The languages that are easier to compile are either not statically-typed in a useful way, or not well-designed, or both.

SML? Pascal?

Edit: I was really referring to language families, so including things like OCaml, Modula, etc.

Rust is actually quite easy to compile. Ada is easy to compile. ML-like languages are easy to compile. Oberon is trivial to compile.
I've never heard anyone say Ada was easy to compile. Also, I heard its uptake was slowed by how hard it was to get the early compilers working. So, where did you see an easy one? Seriously, because Ada still needs a CompCert (or at least FLINT) style certified compiler. An easy Ada compiler would be a nice start on that.
I always thought that Ada was easy to compile in the backend (i.e. the moral equivalent of what B3 does) but challenging on the frontend because of how large the language is. Could be wrong though.

The frontend is usually the hard part IMO. In WebKit, we have 100,000 lines of code for our "frontend" (i.e. the DFG compiler) and 50,000 lines of code for our "backend" (i.e. B3). That split is really interesting.

It's what I thought, too. The split may come from the fact that the front-end is more complex, richer in information, harder to analyze, and harder to transform. The more you can do with it the more code it takes to represent. That's my hypothesis.

Far as Ada, I figured it would be difficult due to the large number of language features and complexities of analyzing a program for safety w/ all its rules. And any interactions that might emerge in that between features and rules. Could be easier than I think but my default belief is that it's not easy.

I am talking about the backend optimisations, not the frontend parts. Ada frontend got tons of sweet static information for the backend to consume.
That makes more sense. Did you ever work on any Ada compiler I might have heard of?
Nothing fancy, I was simply implementing a subset of Ada for fun. I find it is the quickest way to learn or explore a language - write a compiler for a subset of it.

Frontend was awful and I never finished it, but the LLVM-style backend wad very easy to build.

You are probably right, assuming input programs are correct. Rust (and Haskell) is not easy to _type-check_ though.
You can't compile Rust without types and inside functions, most types come from inference and coercions, so "type checking" is really "type resolution".
> Rust (and Haskell) is not easy to _type-check_ though.

Is that really the case? It seems to me that a Prolog-like resolution system, coupled with a constraint solver, would get most of the job done with little effort.

There are certainly many rules to keep track of, but in some cases the newer rules are strict generalisations of the older rules. For example, Haskell's original type classes are just a special-case of modern multi-parameter, flexible instances/contexts, etc. type classes. Likewise, we can implement constraint kinds, type class instance resolution and type equalities using one constraint solver.

Hindley-Milner is already quite Prolog-like. But you're right, CLP(fd) rocks for typing.

My usual approach to implementing a type system is to derive a flat list of Prolog equations out of the AST and leave it to Prolog for solving. If you ask what to do with error messages, I've got a comprehensive answer, but it is not for a mobile phone typing.

I'd really appreciate further comments on what to do with type errors for prolog/minikanren-like tools as typecheckers
I don't understand this perspective. Is the point that JavaScript just works such that you don't have to worry about it compiling at runtime?

To me this is the difference between knowing that your program will probably work after it compiles, vs not knowing until after its deployed.

What I think people really enjoy in dynamic languages like JavaScript is the instant feedback, just reloading the browser. This can be accomplished with decent IDEs for most statically typed languages.

I don't believe anyone's arguing that the overhead of static checking is not worthwhile, merely that it involves serious work on the part of the compiler developers that should not be dismissed.