by anthony88 1116 days ago
Looking at the implementation, it's written in C++. I'd rather have an implementation using the new Vector API (still in incubation). It would be more Java-like and would make a good demo for this new API.
4 comments

It seems like you're saying you'd rather have a slower implementation. A bunch of single instructions useful for this sort of thing aren't available in the Vector API, so they have to be built from sequences of Vector methods that themselves have to be implemented using multiple instructions.
I think he's referring to something similar to what .NET has been doing in the last few versions. They introduced a new Vector API that abstracts platform-specific SIMD instructions. The end result is the same: code using Vector128 is compiled directly to the equivalent AVX opcodes on x86/x64 and NEON on ARM*, as if you had written them by hand, except that now you can add these kinds of optimizations across many architectures with a single codebase.

This post [0] by Stephen Toub goes into GREAT detail on that.

[0]: https://devblogs.microsoft.com/dotnet/performance_improvemen...

*I may get vector length wrong, but you get the idea

You can just look up the IntVector API in the docs and see that there's no method corresponding to VCOMPRESSPS or whatever.
You mean this IntVector [0], which I assume is the experimental Java API anthony88 was referring to, correct? If that missing operation is a blocker, I feel there may be some middle ground other than implementing the whole thing in C++ (like adding it to this API, or fast-tracking work on the API).

[0]: https://docs.oracle.com/en/java/javase/19/docs/api/jdk.incub...

The compress operation being missing is not a blocker, since the compress operation is not missing: https://docs.oracle.com/en/java/javase/19/docs/api/jdk.incub...
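
For anyone who hasn't met the operation being argued about, here is a rough scalar sketch of what a lane-wise compress does, as I understand it: lanes selected by the mask are packed to the front in their original order, and the rest are zeroed. This is plain Java and deliberately doesn't use the incubating Vector API. The class and method names are made up for illustration, and the zero-fill matches my reading of the linked docs rather than anything I've verified against the JDK source.

```java
import java.util.Arrays;

// Hypothetical illustration only: a scalar model of a SIMD "compress".
final class CompressSketch {

    // Packs the lanes whose mask bit is set to the front, preserving order.
    // Remaining lanes are left at zero (assumed semantics, see note above).
    static int[] compress(int[] lanes, boolean[] mask) {
        if (lanes.length != mask.length) {
            throw new IllegalArgumentException("lanes and mask must have the same length");
        }
        int[] out = new int[lanes.length]; // Java zero-initializes int arrays
        int next = 0;
        for (int i = 0; i < lanes.length; i++) {
            if (mask[i]) {
                out[next++] = lanes[i];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        int[] lanes = {5, 1, 9, 3, 7, 2, 8, 4};
        int pivot = 5;

        // Mask of lanes strictly below the pivot, as a partition step might build.
        boolean[] mask = new boolean[lanes.length];
        for (int i = 0; i < lanes.length; i++) {
            mask[i] = lanes[i] < pivot;
        }

        System.out.println(Arrays.toString(compress(lanes, mask)));
        // prints [1, 3, 2, 4, 0, 0, 0, 0]
    }
}
```

The point is that a vectorized partition step wants this as one operation per vector, which is why it matters whether the API exposes it directly or you have to emulate it with a loop like the one above.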
I don't get the other user's point, then.
This is an intrinsic JVM function though, not application code.
With the Vector API coming, I don't see this PR going through. A quick search of the OpenJDK mailing list turned up no discussion.
They just copied a library provided by Intel themselves.

The folks on the JDK side probably didn't even research how to parallelize sort.

The author of the PR works for Intel, though.