by leetNightshade 3885 days ago
You assume that the JVM is slow, yes? That's not always the case. Interestingly, there are cases where JVM applications run just as fast as, if not faster than, native code. This blows my mind, as a C++ programmer myself.

http://codexpi.com/java-vs-cpp-performance-comparison-jit-co...

http://stackoverflow.com/questions/5641356/why-is-it-that-by...

http://beautynbits.blogspot.com/2013/01/performance-java-vs-...


Once compiled to native code, which it will be for big data because the same classes are reused over and over, I would assume it would be in the same ballpark as C/C++ code.
There's still a pretty big speed penalty for Java because the object model encourages a lot of pointer-chasing, which will blow your data locality. In C++, it's common for contained structs to be flat in memory, so accessing a data member in them is just an offset from a base address. In Java, all Object types are really pointers, which you need to dereference to get the contained object. HotSpot can't really optimize this beyond putting really frequently used objects in registers.

A lot of big-data work involves pulling out struct fields from a deeply nested composite record, and then performing some manipulation on them.
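To make the layout point concrete, here is a minimal Java sketch. The class and method names (LayoutSketch, Point, Particle, sumXNested, sumXFlat) are made up for illustration and don't come from any library. In the nested version, each array slot holds a reference to a Particle, which in turn holds a reference to a Point, so reading x takes two dependent loads per element. In the flat version the x values sit next to each other in a primitive array. It only shows the access pattern; it is not a benchmark and says nothing about how big the gap is on a given JVM.

```java
// Illustrative sketch only: all names here are hypothetical.
public final class LayoutSketch {

    // Nested object model: a Particle holds a reference to a separate Point object.
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    static final class Particle {
        final Point position;

        Particle(Point position) {
            this.position = position;
        }
    }

    // Each iteration follows two references: array slot -> Particle -> Point.
    static double sumXNested(Particle[] particles) {
        double sum = 0;
        for (Particle p : particles) {
            sum += p.position.x;
        }
        return sum;
    }

    // Flat layout: the x values are stored contiguously in a primitive array.
    static double sumXFlat(double[] xs) {
        double sum = 0;
        for (double x : xs) {
            sum += x;
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000;
        Particle[] particles = new Particle[n];
        double[] xs = new double[n];
        for (int i = 0; i < n; i++) {
            particles[i] = new Particle(new Point(i, -i));
            xs[i] = i;
        }
        // Both layouts hold the same data, so the sums agree.
        System.out.println(sumXNested(particles) == sumXFlat(xs));
    }
}
```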

Listen to the parent here. I've seen 10x performance gains in production Java code just from using flatbuffers (and paying the marshaling costs from ByteBuffer).

50x is not unreasonable for C/C++ code that was OO and is rewritten to use a data-oriented approach instead.
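For anyone unfamiliar with the ByteBuffer approach mentioned above, here is a rough sketch of reading fixed-size records straight out of a buffer by offset. It uses only plain java.nio.ByteBuffer, not the FlatBuffers library API, and the record layout (a 4-byte int id followed by an 8-byte double value) and all the names are assumptions invented for the example.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only: the record layout and all names are hypothetical.
public final class FlatRecordSketch {

    // Assumed layout of one record: int id (4 bytes), then double value (8 bytes).
    static final int ID_OFFSET = 0;
    static final int VALUE_OFFSET = 4;
    static final int RECORD_SIZE = 12;

    // Writes one record at the given index using absolute (position-independent) puts.
    static void write(ByteBuffer buf, int index, int id, double value) {
        int base = index * RECORD_SIZE;
        buf.putInt(base + ID_OFFSET, id);
        buf.putDouble(base + VALUE_OFFSET, value);
    }

    // Reads the value field of each record by offset; no per-record object is allocated.
    static double sumValues(ByteBuffer buf, int count) {
        double sum = 0;
        for (int i = 0; i < count; i++) {
            sum += buf.getDouble(i * RECORD_SIZE + VALUE_OFFSET);
        }
        return sum;
    }

    public static void main(String[] args) {
        int count = 3;
        ByteBuffer buf = ByteBuffer.allocateDirect(count * RECORD_SIZE);
        for (int i = 0; i < count; i++) {
            write(buf, i, i, i * 0.5);
        }
        System.out.println(sumValues(buf, count));
    }
}
```

The field access here is an offset from a base index, much like the flat C++ struct case described earlier in the thread. The cost is that every read goes through a ByteBuffer getter instead of a plain field.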

Memory indirection is indeed the biggest issue. However, I'd also add that Java, as a language, has a terrible performance model. Unless you stick to primitives only, the abstraction costs start to add up (beyond pointer chasing). It shoves the entire optimization burden onto the JVM, which in some cases has lost a bunch of semantic and type information by the time it runs. There are also codegen deficiencies in the current HotSpot C2 compiler (i.e. the generated code is subpar compared to roughly equivalent gcc output).
> In C++, it's common for contained structs to be flat in memory, so accessing a data member in them is just an offset from a base address

The JVM inlines virtual method calls as one of its optimizations. See: http://www.oracle.com/technetwork/java/whitepaper-135217.htm...

How is that related to the parent's point?