by kazinator 4269 days ago
> if the program containing (symbol-name x) and (cos x) works in an interpreter but not in the compiler then the compiler is broken.

The compiler isn't broken.

The interpreter throws an exception, which meets your own definition of "works".

The compiler gives an opinion "x has an inconsistent type". In this manner, the code also "works" in the compiler.

You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.
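
To make that concrete, here is a rough analogue in Python rather than Lisp. It is only a sketch: `describe` and its `use_as_number` flag are made-up names for illustration, with `x.upper()` standing in for `(symbol-name x)` and `math.cos(x)` for `(cos x)`.

```python
import math

def describe(x, use_as_number):
    # x is used as a string here ...
    label = x.upper()
    if use_as_number:
        # ... and as a number here; no single value of x satisfies both uses.
        return label, math.cos(x)
    return label, None

# The program runs fine as long as no input reaches the conflicting use.
print(describe("abc", False))  # ('ABC', None)

# Once an input exercises it, the static opinion shows up as a run-time exception.
try:
    describe("abc", True)
except TypeError as exc:
    print("TypeError:", exc)
```

A checker that looked at `describe` statically could warn about `x` up front; the warning does not stop the first call from working.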

Compilers for dynamic languages do not preserve all interpreted behaviors. This is usually explicitly rejected as a goal: users must accept certain conventions if they want code to behave the same in all situations.

For instance, some Common Lisp implementations re-expand the same macros during evaluation, which is friendly toward developers when they are modifying macros. But in compilation, macros are expanded once.

Usually these compilation-interpretation discrepancies are obscure, but we are not talking about an obscure feature here. It's a basic tenet of type checking in a compiler that it will flag things that will throw type-related exceptions at run time. You cannot say that type-checking compilers for dynamic languages are simply not allowed because type mismatches are well-defined behavior; that's simply outlandish.

1 comments

> The compiler gives an opinion "x has an inconsistent type".

> You can still run the program if you like; then that opinion becomes an exception if an input case exercises the code, proving the compiler's opinion right.

In this example, the compiler cannot give "x" the static type of "string" and also allow the program to handle errors when "x" is used in places where strings aren't allowed. If the static types don't match, the result is undefined; that's what static typing means.

If the compiler causes an exception to be thrown, then either:

- That's just an implementation-specific quirk, which just so happens to be the way this compiler behaves when it hits this undefined case; it's not part of any spec/documentation and is useful only to hackers trying to exploit the system.

- We accept that the compiler didn't assign "x" the static type of "string" after all; it actually assigned it "string + exception + ..." (which may be the "any" type of the dynamic language, or some subset of it).

The situation isn't one of static typing, but of static type checking. Check (parent (parent (parent yourpost))) where it says

> What about compilers for dynamic languages ...

A compiler for a dynamic language that checks types does not bring about static typing. The language in fact remains dynamic.

What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.

The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.
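
A toy version of that kind of opinion, sketched in Python. Everything here is hypothetical: the `ACCEPTS` table is a made-up and deliberately simplified stand-in for what a real compiler knows about each function's argument types, and `check` is just an illustrative name.

```python
# Hypothetical, simplified table: which argument types each operation accepts.
ACCEPTS = {
    "symbol-name": {"symbol"},
    "cos": {"integer", "float"},
    "sqrt": {"integer", "float"},
}

def check(uses):
    """Give an opinion about one variable from the operations applied to it."""
    candidates = None
    for op in uses:
        if op not in ACCEPTS:
            # Nothing is known about this operation: no type can be assigned.
            return "unknown", set()
        accepted = ACCEPTS[op]
        # Each use narrows the set of types the variable may hold.
        candidates = set(accepted) if candidates is None else candidates & accepted
    if candidates is None:
        # The variable is never used, so there is nothing to infer from.
        return "unknown", set()
    if not candidates:
        # No type satisfies every use: the "conflicting type" verdict.
        return "conflict", set()
    return "ok", candidates

print(check(["symbol-name", "cos"])[0])  # conflict
print(check(["cos", "sqrt"])[0])         # ok
```

The verdict is only advice. Nothing in this sketch prevents the program from being run afterwards with its usual run-time checks in place.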

Static typing indeed means that we use the result of the static analysis to remove all traces of type from the program, and only run it when its type information is complete and free of conflicts.

The dynamic language optimizer can in fact take advantage of its findings to eliminate run-time type checks where it is safe to do so.

> What the checking means is that the compiler can give an opinion based on a static view of the program, and we can run that program regardless.

> The compiler can say: yes, all expressions in the program can be assigned a type; no, some expressions have a conflicting type, or couldn't be assigned a type.

What is a "conflicting type"? In the dynamic language, there is only one static type (any = int + string + float + (any -> any) + array(any) + ...), so there can't be any conflicts.

It's fine to have a compiler try to specialise the types of variables beyond that, e.g. narrowing down the type of "x" in the example to "string + float", based on how it's used (or just leave it as "any").

A "conflict" would imply that a variable is used in ways that don't match the inferred type; but the inference is based on how it's used.