by KMag 3554 days ago
This is great news, though it initially only supports Linux x86-64 and is decades late for Java desktop apps (and not having non-blocking I/O until Java 1.4 was shameful for a language explicitly targeted at a pervasively networked ecosystem).

In their "tiered mode", they put sampling instrumentation into the native code, and if they detect a hotspot, regenerate fully instrumented native code from bytecode using the C1 (fast) JIT, which then allows the C2 JIT to do its full optimizations on the code as if AoT were not involved.
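A toy model of those tier transitions, in Python for brevity. This is only a sketch of the idea as described above, not how HotSpot implements it: the class name, the tier labels and every threshold are made up for illustration.

```python
class TieredFunction:
    """Toy model of the tier transitions; all thresholds are made-up values."""

    def __init__(self, impl, sample_every=10, hot_samples=3, profile_calls=20):
        self.impl = impl                  # stands in for the compiled method body
        self.tier = "aot"                 # "aot" -> "c1" -> "c2"
        self.sample_every = sample_every
        self.hot_samples = hot_samples
        self.profile_calls = profile_calls
        self.calls = 0
        self.samples = 0
        self.profile = []                 # argument type names seen in tier "c1"

    def __call__(self, *args):
        self.calls += 1
        if self.tier == "aot":
            # Cheap sampling: only every Nth call touches the counter.
            if self.calls % self.sample_every == 0:
                self.samples += 1
                if self.samples >= self.hot_samples:
                    self.tier = "c1"      # regenerate fully instrumented code
        elif self.tier == "c1":
            # Full instrumentation: record a profile on every call.
            self.profile.append(tuple(type(a).__name__ for a in args))
            if len(self.profile) >= self.profile_calls:
                self.tier = "c2"          # enough profile data to optimize
        return self.impl(*args)
```

With the default values, a function wrapped this way stays in the "aot" tier for 29 calls, moves to "c1" on call 30, and reaches "c2" once 20 further calls have been profiled.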

Since the invention of tracing JITs, I've often wondered why languages don't package together a compact serialized SSA form such as LLVM bitcode or SafeTSA along with storing functions as lists of pointers to space-optimized compilations of extended basic blocks (straight-line code), similar to how some Forth compilers generate threaded code. A threaded code dispatcher over these straight-line segments of native code would have minimal overhead, and when a simple SIGPROF lightweight sampler detected a hotspot, a tracing version of the dispatcher could collect a trace, and then generate native code from the visited traces using the stored SSA for the basic blocks.

In this way, they'd have a light-weight tracing JIT for re-optimizing native code.
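To make the dispatcher idea concrete, here is a minimal Python sketch under several assumptions. Plain Python callables stand in for the straight-line native segments, a per-block visit counter stands in for the SIGPROF sampler, and the trace is only recorded (the step that would regenerate native code from the stored SSA is left out). All names (`make_loop_program`, `run`, `HOT_THRESHOLD`) are hypothetical.

```python
from collections import Counter

HOT_THRESHOLD = 50  # made-up value; a real system would sample instead


def make_loop_program():
    """A function as a list of blocks; each block returns the next block index."""
    def entry(s):
        s["i"] = 0
        s["acc"] = 0
        return 1

    def header(s):
        return 2 if s["i"] < s["n"] else 3

    def body(s):
        s["acc"] += s["i"]
        s["i"] += 1
        return 1

    def exit_(s):
        return None  # None means "return from the function"

    return [entry, header, body, exit_]


def run(blocks, state, hot_threshold=HOT_THRESHOLD):
    """Threaded-code style dispatch loop that switches to tracing on a hot block."""
    counts = Counter()
    trace = None   # block indices being recorded, or None when not tracing
    traces = {}    # anchor block index -> recorded loop trace
    pc = 0
    while pc is not None:
        counts[pc] += 1
        if trace is not None:
            if pc == trace[0]:
                # Back at the anchor: the loop is closed, keep the trace.
                traces[trace[0]] = trace
                trace = None
            else:
                trace.append(pc)
        elif counts[pc] == hot_threshold and pc not in traces:
            trace = [pc]  # hotspot detected: start recording here
        pc = blocks[pc](state)
    return traces
```

Running `run(make_loop_program(), {"n": 100})` records the loop as the trace `[1, 2]` anchored at block 1 (header, then body); that list is what a trace compiler would then stitch together from the per-block SSA.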

4 comments

You might be interested in looking at Semantic Dictionary Encoding [1]. It was Professor Michael Franz's PhD thesis work. Franz was Andreas Gal's advisor on his thesis on trace trees.

SDE didn't propose starting with SSA, but could easily work with an SSA representation. SDE basically functions as a compression mechanism for a semantic IR that builds a dictionary on compression/decompression, reminiscent of LZW. So instead of storing straight bytecode, you store a compact higher-level representation, which could very well be SSA, that is structured for you to generate code while "decompressing" it, and to reuse generated code fragments as "templates" for later fragments.
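A toy illustration of the dictionary idea, in Python. This is not the actual SDE format, just a sketch loosely in its spirit: expression trees are encoded as indices into a table that encoder and decoder grow in lockstep, so a repeated subtree costs a single index the second time. The operator set, leaf names and function names are all made up.

```python
OPS = {"add": 2, "mul": 2, "neg": 1}   # operator name -> arity (hypothetical)
LEAVES = ["x", "y", "1"]               # hypothetical leaf symbols


def initial_dictionary():
    # Both sides must start from the same table: operators first, then leaves.
    return [("op", name) for name in OPS] + [("leaf", name) for name in LEAVES]


def encode(tree):
    """Trees are leaf strings or tuples of (operator, child, child, ...)."""
    index = {entry: i for i, entry in enumerate(initial_dictionary())}
    out = []

    def emit(node):
        if isinstance(node, str):
            out.append(index[("leaf", node)])
            return
        key = ("tree", node)
        if key in index:
            out.append(index[key])     # whole subtree already in the dictionary
            return
        out.append(index[("op", node[0])])
        for child in node[1:]:
            emit(child)
        index[key] = len(index)        # new entry, added after the children

    emit(tree)
    return out


def decode(codes):
    table = initial_dictionary()
    stream = iter(codes)

    def read():
        kind, value = table[next(stream)]
        if kind != "op":
            return value               # a leaf or a previously rebuilt subtree
        children = tuple(read() for _ in range(OPS[value]))
        node = (value,) + children
        table.append(("tree", node))   # same insertion order as the encoder
        return node

    return read()
```

For `("mul", ("add", "x", "1"), ("add", "x", "1"))` the encoder emits `[1, 0, 3, 5, 6]`: the second `add` subtree is the single index 6. A code generator could attach the code it emitted for entry 6 to that table slot and reuse it as a template, which is the part this sketch leaves out.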

An implementation was built in Oberon, using a compact tree representation (you could do a DAG with some adjustments) that mirrors your code generation order, and it was used e.g. to support PPC and M68k from the same "binaries" in MacOberon. The way it was structured makes retaining arbitrary higher-level structure of the programs very straightforward.

I keep wanting to do something with SDE, but life keeps intervening... I see it as a huge shame that more work didn't go into exploring that alternative to straight-up bytecode, but it basically had way too little head start on Java, and I believe Franz moved to Java for his subsequent research on code generation.

[1] https://en.wikipedia.org/wiki/Semantic_dictionary_encoding

Yes, I'm familiar with SDE, but thanks for mentioning it. The SafeTSA I mentioned was one of Michael Franz's later contributions to the field. SafeTSA was an SSA representation capable of expressing all of the security and other semantic constraints of the Java language. Michael Franz's group took the Jikes RVM (then known as Jalapeno) and added a second front-end to the JIT that could read SafeTSA, so they could test performance of programs running Java bytecode and SafeTSA in the same process. Not only did SafeTSA take less time than bytecode to compile to native code, but the resulting native code also ran faster.
Interesting. Do you have a link to the thesis? The link on wikipedia seems broken, and Franz's homepage doesn't seem to contain a link.
The name of the paper is "Code-Generation On-the-Fly: A Key to Portable Software"; just search for it.
LLVM bitcode is still architecture specific - for example, whether the code is 64 bit or 32 bit will result in different bitcode paths.

You may be interested to look further into Eclipse OMR, which is a generic VM used by IBM for many of their runtimes (including J9). The Testarossa JIT support landed last week, and although it doesn't support bitcode form directly, there are optimisations that can be used to separate the static parts of the class from the dynamic parts, to facilitate loading. There is an IL for use by the JIT and the interpreter.

https://developer.ibm.com/open/omr/

Thanks for the pointer!

I (and others) have noted that for more than a decade, it seems that Java would have been better off under IBM than under Sun/Oracle (SWT vs. Swing/AWT, jikes vs. javac, Jalapeno/JikesRVM vs. not much interesting research until Graal, etc.) It's really a shame IBM didn't buy up Sun's Java intellectual property at fire sale prices.

This could be taken even further: if the IR can hold information about effects and purity, etc., you could potentially optimize across libraries and binaries.
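As a rough sketch of what that buys you, assume a library ships a purity table next to its IR (the `PURE` set, the `Call` record and the function names below are all hypothetical). A client-side optimizer can then drop repeated calls to pure library functions without seeing their bodies, while leaving effectful calls alone. The sketch assumes SSA-style input where each `dest` is assigned once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    dest: str      # SSA name receiving the result
    func: str      # callee, possibly in another library
    args: tuple    # SSA names passed as arguments


# Hypothetical effect metadata shipped alongside a library's IR.
PURE = {"math.sqrt", "str.upper"}


def eliminate_redundant_pure_calls(instrs):
    seen = {}    # (func, args) -> dest that already holds the result
    alias = {}   # eliminated dest -> surviving dest
    out = []
    for ins in instrs:
        # Rewrite arguments that referred to an eliminated result.
        args = tuple(alias.get(a, a) for a in ins.args)
        key = (ins.func, args)
        if ins.func in PURE and key in seen:
            alias[ins.dest] = seen[key]   # reuse the earlier result
            continue
        if ins.func in PURE:
            seen[key] = ins.dest
        out.append(Call(ins.dest, ins.func, args))
    return out
```

Given two `math.sqrt(x)` calls with `io.log` calls in between, the second `sqrt` disappears and its users are rewired to the first result, whereas every `io.log` call is kept because nothing declares it pure.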
> decades late for Java desktop apps

Commercial JDKs always offered AOT compilation, the problem is that people nowadays apparently don't buy compilers anymore unless forced to do so (e.g. embedded, consoles...).

Those are priced for people who already made a big investment in writing their application in Java and now realise they need features not present in javac. If you're just starting out, it can very well make more sense to use Microsoft Visual C++, which costs less than a commercial Java compiler and comes with an IDE that's light years ahead of anything available to Java developers.

Desktop Java also had many other problems, which can be summarised as "the JVM is its own OS". You can't write an application in Java that has a native look and feel. Or at least you couldn't for the first several significant years of its life and even now I don't think there's a good story for writing a simple native application. Meanwhile you could grab wxWidgets or Qt (and there goes your budget for a java compiler) and have a native-looking cross-platform application. Which very few did, because back then Mac OSX didn't exist, Apple were on their death bed and "Linux Desktop Environment" was even more of a joke than it is today.

So yeah, it didn't make any bit of sense to develop Java desktop apps given that you already had a large pool of proficient C++ developers, the only platform you cared about was Windows and Java GUI libraries insisted on reinventing their own look and feel. Oh and you could always just buy Delphi if you didn't want to suffer C++ (again, for a fraction of the price of a commercial Java compiler).

Nowadays people wrap a bunch of JavaScript in an Electron instance, but this only happened after the web took off and nobody really looks at native desktop apps much. If this AOT work can give us fully contained native executables that we can distribute without having the user install Java and with significantly better performance than nodejs, maybe Java on the desktop can still happen.

That's not correct. The first UI toolkit Java had was AWT and it mapped through to native widgets. AWT was not very successful because it tried to be cross platform rather than a direct mapping of the Windows UI toolkit, which was significantly more advanced in that era than its competitors MacOS Classic and - most problematically - UNIX workstations, which had truly miserable UI toolkits. So AWT was limited to the lowest common denominator, and trying to abstract UI libraries didn't work very well: the abstraction was leaky.

So for the first few years of Java's existence developers were given native UI, and said no, actually, we don't care if we have a native look and feel or not - for the kinds of line-of-business apps they were writing a powerful and consistent toolkit was more important than one that looked the right shade of grey. Hence, Swing.

Nowadays if you want to write a small, pure native Java app with native widgets you can do it with SWT and Avian. There's an example here:

  https://readytalk.github.io/avian/
It demos all the features available in SWT with a 1 MB download that's fully self-contained. You still have the problem of leaky abstractions and SWT apps don't look entirely normal, as some more complex widgets still need to be custom, but it's another attempt at AWT that works significantly better as MacOS and Linux closed the gap with what Windows could do, so you can have a richer abstraction.
> If you're just starting out, it can very well make more sense to use Microsoft Visual C++, which costs less than a commercial Java compiler and comes with an IDE that's light years ahead of anything available to Java developers.

Sorry but you are way wrong.

I do consulting in Java, .NET and C++ eco-systems, and started using C++ back in 1993, when C++ARM was the only reference for a possible future standard.

The only C++ IDE that for many years could match the Java ones is C++ Builder.

Visual C++ only started to match C++ Builder now with the C++/CX + XAML, for the WinRT applications.

And while Visual C++ debugger and code navigation are quite good, they still don't rival Java IDEs or even their own .NET experience, without installing something like Visual Assist or ReSharper C++.

> You can't write an application in Java that has a native look and feel. Or at least you couldn't for the first several significant years of its life and even now I don't think there's a good story for writing a simple native application.

Sure you can, but developers seem not to like to read books, so they just write crappy Java desktop applications without learning how to use Swing.

https://www.amazon.de/dp/B004Y4UTHM/ref=dp-kindle-redirect?_...

> Meanwhile you could grab wxWidgets or Qt (and there goes your budget for a java compiler) and have a native-looking cross-platform application. Which very few did, because back then Mac OSX didn't exist, Apple were on their death bed and "Linux Desktop Environment" was even more of a joke than it is today.

We were targeting UNIX with Motif++ back in those days.

Regarding Windows, OWL and later VCL were way better than anything that Microsoft produced for C++. Even XAML was initially targeted to .NET.

As for Apple, we were mainly using Metrowerks with PowerPlant.

> If this AOT work can give us fully contained native executables that we can distribute without having the user install Java and with significantly better performance than nodejs, maybe Java on the desktop can still happen.

There are many applications that people aren't aware are actually compiled with Excelsior JET.

As I said, this generation doesn't pay for compilers.

There has also been gcj for a long time, but default toolchains matter a lot.
I seem to recall that GCJ had a lot of limitations back in the day - you couldn't use the same standard library, it didn't support newer language features, etc. Wouldn't surprise me if even now GCJ had poor integrations with IDEs and other essential tools to make Java livable.
GCJ was abandoned in 2009 as the majority of its devs left to work on either Eclipse's compiler or OpenJDK in its early days.

It is still available in gcc because of its unit tests. Some gcc code paths are only used by gcj.