by giltene
3334 days ago
I'm in no way arguing that Intel and others do not also contribute to HotSpot. It's just that MUCH more attention seems to be paid to LLVM, and the amount and impact of contributions there are bigger. As to the AVX2 work contributed: there are all sorts of "easier to integrate" AVX2 uses that have been added to HotSpot over time (especially with intrinsics, which tend to be nice and isolated from the rest of the code gen), but the vectorizer in C2 is not nearly as capable as the one in LLVM, and the results speak for themselves. And I expect to see a similar gap open when AVX512 becomes available on Skylake EPs soon. On the "how many contributions are directly applicable to Java vs. C++" question: we were surprised at how readily those "C++ focused things" apply to idiomatic Java code. The LLVM contributions can probably be better described as "expecting certain patterns and idioms that are common in C/C++". And once you bring the Java code to LLVM IR form and "remove the brakes" as much as possible (e.g. fix how things like safepoints, GC barriers, and deopt points used to defeat most optimizations), the patterns seem to be there in the Java code as well. The vectorization u-bench thing here is certainly a good example of that in play. We didn't massage it to find the vectorization opportunity. I didn't even get specific directions or tips on "write it this way". I just wrote some "I think this loop should be vectorizable with AVX2" pieces of Java code, and the existing vectorizer seems to have just picked them up and had its way with them.
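For readers wondering what "I think this loop should be vectorizable" code might look like, here is a minimal sketch. It is not the benchmark from the blog post; the class and method names are made up for illustration, and whether any given JIT (C2, an LLVM-based one, or Graal) actually emits AVX2 for these loops depends on the compiler version and the hardware.

```java
// Hypothetical examples of plain, idiomatic Java loops that an
// auto-vectorizer could plausibly pick up. Nothing here hints at SIMD:
// no intrinsics, no manual unrolling.
final class VectorizableLoops {
    private VectorizableLoops() {}

    // Element-wise add: each iteration is independent of the others,
    // which is the classic shape a vectorizer looks for.
    static void addArrays(int[] a, int[] b, int[] out) {
        int n = Math.min(out.length, Math.min(a.length, b.length));
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Conditional reduction: the branch inside the loop body is the kind
    // of thing a more capable vectorizer may turn into masked/blended
    // SIMD operations, while a simpler one may leave the loop scalar.
    static int sumIfGreater(int[] values, int threshold) {
        int sum = 0;
        for (int v : values) {
            if (v > threshold) {
                sum += v;
            }
        }
        return sum;
    }
}
```

The point of writing it this plainly is that the source stays ordinary Java; any speedup would come entirely from what the JIT does with the loop.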
|
Yes, I can well believe that LLVM has put a lot more effort into auto-vectorisation than HotSpot has, given the relatively different ways in which Java and C++ tend to be used. And I don't see much effort put into auto-vectorisation in Graal either, from my occasional browsings of the commit logs. The research in HotSpot JIT compilers seems to be focused more on compiling more dynamic languages like JavaScript and Ruby faster, rather than C++ style numeric code: I guess reflecting a focus on business apps that probably don't contain many hot array loops relative to scripting language use.
I look forward to benchmarks. The blog post contains a graph that purports to compare HotSpot vs Zing, but there is no Y axis, so I guess it's meant more as an illustration of the basic idea than as an actual performance comparison.