Hacker News new | ask | show | jobs
by drdrey 3389 days ago
Off the top of my head:

* JVM bytecode is stack-based, whereas LLVM uses "infinite registers" in SSA form

* being in SSA form makes it convenient to consume in compiler passes, but comes with quirks that mean you don't really want to write in that style manually: the mind-bending phi instructions, definitions must dominate uses, simple ops like incrementing a variable really means creating a new variable, etc.

* JVM bytecode carries a lot of Java-level information, for instance if you have N classes with M methods each in source, you will typically find N classes with M methods in bytecode too. A lot of keywords in Java have an equivalent in bytecode (e.g. private, protected, public, switch, new...)

* in contrast, LLVM IR feels closer to C (it only knows about globals, arrays, structs and functions). It exposes lower level constructs like vector instructions, intrinsics like memcpy

* JVM bytecode is well specified: anyone armed with the pdf [1] can implement a full JVM. LLVM IR is somewhat loosely defined and evolves based on the needs of the various targets

* JVM bytecode is truly portable, whereas target ABI details leak into LLVM IR. A biggie is 32 bit vs 64 bit LLVM IR.

[1] https://docs.oracle.com/javase/specs/jvms/se8/jvms8.pdf

1 comments

> JVM bytecode is truly portable, whereas target ABI details leak into LLVM IR. A biggie is 32 bit vs 64 bit LLVM IR.

From a few LLVM meetings and Apple's work on bitcode packaging, I think there is some work to make LLVM IR actually architecture independent.

PNaCl is noteworthy here too, insofar as its IR is very much based on LLVM IR (but portable and stable).