by ralusek 2540 days ago
I think most people consider Java to be an interpreted language: it's interpreted by the JVM. It's obviously somewhere in between, in the same way that JIT languages are, but it's still not native.

Not sure about the BEAM, but Java isn't considered an interpreted language at all.

First, the step that compiles source to an intermediate representation (IR) is relevant, which is why, for example, Python can be described as either compiled or interpreted. That distinction matters. Even though Java is compiled to an intermediate representation, it is still compiled.

Now, with regard to the intermediate representation, Java Byte Code is also not interpreted. Java Byte Code is compiled to native machine code either Just in Time by the JVM, or Ahead of Time by the SubstrateVM. Both of these make it a compiled language.

An interpreted language never gets translated into native machine code; it gets executed by a native interpreter, and that's very different. It is more akin to a program that reads and parses text and, based on that text, executes one or more actions. Now if that parse-and-dispatch branching is powerful enough to allow Turing-complete behavior, you have yourself a full-blown interpreted computer language. This is not what happens in the JVM. The JVM translates Java Byte Code to native machine code Just in Time, and then the native code is run.
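To make the "read and parse text, then execute based on it" idea concrete, here is a minimal sketch of a pure interpreter. The three-instruction stack language (`push`, `add`, `mul`) and the `interpret` function are made up for illustration and have nothing to do with the JVM or any real instruction set. Nothing is ever translated to machine code: the host program just branches on each line of text.

```python
def interpret(program: str) -> list:
    """Execute a tiny made-up stack language directly from its source text."""
    stack = []
    for line in program.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        op = parts[0]
        # Dispatch on the parsed text; each branch performs the action itself.
        if op == "push":
            stack.append(int(parts[1]))
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack


# (2 + 3) * 4
result = interpret("push 2\npush 3\nadd\npush 4\nmul")
print(result)  # [20]
```

Add loops and conditionals to that dispatch and you get the Turing-complete case described above.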

We should be more precise: the JVM may or may not have an interpreter. This is implementation dependent. There are JVMs that are interpreted fully and AOT compilers as well.

HotSpot includes an interpreter -- your code may be interpreted until it gets hot and gets compiled.

Still, I don't think it's fair to call Java an interpreted language, since the parts that are interpreted either run only during warm-up or are cold enough not to matter.

Yes, you are right. Someone could argue Java Byte Code is interpreted, but I think that, colloquially, this leads people to misunderstand the nature of modern JVMs, which, as you say, will choose either to run the code in an interpreted manner or to compile it first (possibly with optimizations) dynamically at runtime. Those choices are made based on what will result in the best performance and safety.

Java, on the other hand, is 100% compiled (to bytecode). And byte code is machine code for a nonexistent machine. Actually, I think Sun had built machines that could natively execute Java Byte Code. Someone could build an interpreter for it, but I'm not aware of one.

Java is an interpreted language because you always need a JVM to run it. Even if you distribute the runtime with your code as one package, that’s just an artificial distinction.

Regardless, interpreted does not mean bad, or poor performance for your task.

It’s a technical notion.

Most people would say Python is interpreted, even if you are only running compiled Python bytecode on a JIT like PyPy. That’s exactly the same situation as Java.

Java is compiled to Java byte code. It cannot be interpreted in its source form. There's nothing preventing it from being interpreted, but as far as I know, there are no interpreters for it. Python, on the other hand, can be interpreted from source, and that's the default behavior, making Python an interpreted language.

This makes Java a compiled language, even though it is compiled to machine code for a machine that doesn't exist. Java Byte Code is an assembly language, and in fact, Sun had at one point machines that were natively using Java Byte Code instruction sets.

Now Java Byte Code is trickier. You could consider it to be interpreted or compiled or both. It is fair to say that in general it is interpreted, but it is just as fair to say it is in general compiled.

What matters most though is to understand that modern JVMs make use of a JIT compiler and optional AOT compilers.

In the latter mode, you can compile Java Byte Code to native machine executables ahead of time (pre-runtime), and no JVM is required beyond that point. So you can distribute the app as a self-contained executable. In most cases, though, this gives lower average and peak performance, but it improves the worst case, such as startup time.

In the former mode, the JVM will analyze runtime behavior and, based on how frequently various code paths are used, it will either compile the byte code to native machine code, cache it, and run the freshly compiled block, or it will interpret the byte code directly. This allows aggressive optimization using information that is only available, or only easily available, at runtime.
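As a rough illustration of that "interpret until hot, then compile and cache" decision, here is a toy sketch. Everything in it is hypothetical: `HOT_THRESHOLD`, `TieredFunction`, and the two-op instruction list are invented for the example, and "compiling" here only means building a Python lambda once, not emitting native code the way a real JIT does. It only shows the shape of the control flow.

```python
HOT_THRESHOLD = 3  # hypothetical value; real JVM thresholds differ and are tunable


def interpret(ops, x):
    """Slow path: walk the instruction list on every call."""
    for op, arg in ops:
        if op == "add":
            x += arg
        elif op == "mul":
            x *= arg
        else:
            raise ValueError(f"unknown op: {op}")
    return x


def compile_ops(ops):
    """Stand-in for a JIT: translate the ops once into a single Python function."""
    expr = "x"
    for op, arg in ops:
        symbol = {"add": "+", "mul": "*"}[op]
        expr = f"({expr} {symbol} {arg!r})"
    return eval(f"lambda x: {expr}")


class TieredFunction:
    def __init__(self, ops):
        self.ops = ops
        self.calls = 0        # invocation counter used to detect hot code
        self.compiled = None  # cache for the compiled version

    def __call__(self, x):
        if self.compiled is not None:
            return self.compiled(x)  # fast path: reuse the cached code
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.compiled = compile_ops(self.ops)
            return self.compiled(x)
        return interpret(self.ops, x)


f = TieredFunction([("add", 1), ("mul", 2)])
results = [f(n) for n in range(5)]
print(results)  # [2, 4, 6, 8, 10]
```

The first two calls are interpreted; the third crosses the threshold, triggers the one-time translation, and every call after that uses the cached version. The caller can't tell the difference apart from speed.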

That is incorrect. I don't know of any mainstream Python implementation that has ever interpreted the code directly, except perhaps in a context like eval(). CPython has always compiled to byte-code first.
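You can see the compile-to-bytecode step from inside Python itself. A small sketch using the built-in `compile()` and the standard `dis` module; the source string and the `<example>` filename are just placeholders, and the exact opcodes you get vary between Python versions, so no specific listing is shown:

```python
import dis

source = "x = 1 + 2"

# compile() turns source text into a code object that holds bytecode.
code = compile(source, "<example>", "exec")

# The raw bytecode is a bytes object attached to the code object.
bytecode = code.co_code

# exec() on a code object runs that bytecode; no re-parsing of the text happens.
namespace = {}
exec(code, namespace)
result = namespace["x"]

# Collect the opcode names for inspection (version dependent).
opnames = [ins.opname for ins in dis.get_instructions(code)]

print(type(bytecode).__name__, result)
dis.dis(code)  # human-readable disassembly
```

Running a `.py` file goes through the same compile step implicitly, which is also where the cached `.pyc` files come from.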

Actual source-based interpretation is _very_ slow, and pretty rare in anything that sees meaningful real-world use. Ruby was, back when it was an order of magnitude slower than Python, but no other example jumps readily to mind, and Ruby went bytecode with the release of 1.9 in 2007.

Ah, you are right, I didn't know that about Python. Since it can compile to IR on the fly, though, I think it gets trickier as well. It's probably fair to consider it either interpreted or compiled. Like they say in their docs:

> Python is an interpreted language, as opposed to a compiled one, though the distinction can be blurry because of the presence of the bytecode compiler. This means that source files can be run directly without explicitly creating an executable which is then run.

OpenJDK can't do that. You need to pre-compile the source to byte code first.

There is nothing about Java that requires a VM. That is just how Sun decided to implement it.

GCJ, for instance, allows AOT compilation of a Java program that requires no JVM.