by srean 4576 days ago
This was a good read. Minor nitpick: given the author's stress on not comparing apples to oranges, I think the motivating anecdote on protein folding struck a dissonant chord. A fair comparison would have been against a profile-guided optimization build with comparably aggressive compiler optimization flags.

For these types of problems the profile characteristic is fairly static, so there is hardly ever a need to pay for the warm up time.

Java can indeed be very fast, and in my opinion the JVM is one of the most optimized VMs that we have now. That said, Java code beating C or C++ code on a CPU-bound task raised an eyebrow. In my experience Java usually gets to 80% of the speed of C++ quite easily, but at a cost of 2.5 to 3.5 times the memory. For these types of applications memory tends to be an expensive resource. For our number-crunching server, 80% of its cost is sunk in the RAM.


My experience is similar: Java will get you 80% of super-optimized-compiled C++ speed for the cost of more memory (although you may choose to give Java less memory and pay in speed), and will get much closer to C++ speed with some work (off-heap memory, etc.). However, Java has the upper hand in two scenarios: 1) long-running applications that are developed by a large team – those usually make heavy use of virtual inheritance for the sake of good engineering, and the execution profile is not static, so the program can greatly benefit from JIT optimization; and 2) fairly complex multithreaded code – Java's incredibly useful and well-implemented blocking and lock-free data structures, as well as an incredible work-stealing scheduler (all expertly programmed by Doug Lea) make great use of a general purpose GC, plus Java usually gets them about 5 years before C++ (if C++ gets them at all).
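To make the work-stealing point concrete, here is a minimal sketch using the fork/join framework from `java.util.concurrent`. The class name `ParallelSum` and the `THRESHOLD` cutoff are made up for illustration, and nothing here is tuned; it only shows the shape of the API.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

/** Hypothetical example: sums a long[] by recursively splitting the range. */
class ParallelSum extends RecursiveTask<Long> {
    // Assumed cutoff below which the range is summed sequentially.
    private static final int THRESHOLD = 10_000;

    private final long[] data;
    private final int from; // inclusive
    private final int to;   // exclusive

    ParallelSum(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        ParallelSum left = new ParallelSum(data, from, mid);
        ParallelSum right = new ParallelSum(data, mid, to);
        left.fork();                      // queue the left half; idle workers may steal it
        long rightSum = right.compute();  // work on the right half in this thread
        return left.join() + rightSum;
    }

    /** Runs the task on the JVM-wide common pool. */
    static long sum(long[] data) {
        return ForkJoinPool.commonPool().invoke(new ParallelSum(data, 0, data.length));
    }
}
```

Calling `ParallelSum.sum(array)` splits the range until the pieces fall under the cutoff. The forked halves sit in per-thread deques, and idle worker threads steal from them, which is the scheduling behaviour being praised above.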
Don't forget the tools and infrastructure that support good engineering (dependency management, build, javadoc, findbugs, checkstyle, PMD, cobertura for code-coverage, better unit-testing frameworks, the list goes on).

I'm sure the C++ ecosystem has some of those, but whether people are using them or not seems quite obvious: how many C++ projects do that out there? Not as many as the Java counterpart.

> how many C++ projects do that out there? Not as many as the Java counterpart.

I'm not sure this is a good argument.

Projects where performance is critical are coded in C/C++, not in Java (browsers, AAA games, databases, servers, micro-controllers, OSes...). C/C++ programming is not about writing 'elegant' code with FactoryFactories, but about performance, even if that means using very basic data structures instead of classes, or inlining functions everywhere.

> Projects where performance is critical are coded in C/C++, not in Java (browsers, AAA games, databases, servers, micro-controllers, OSes...).

This is not true. C++ beats Java performance in constrained environments. On servers the situation is not so clear-cut. You see some really fast Java databases.

And BTW, Java is better at inlining functions than C++, but C++ handily beats Java when it comes to controlling memory layout (though that, too, is changing).

> This is not true. C++ beats Java performance in constrained environments

Every environment is constrained. Devs just got lazy and scaling now means buying more machines instead of performance optimisation.

> You see some really fast Java database

That eat up way too much memory for little gain.

When you see a popular Java-based OS, tell me.

I'm referring to the tools that ensure the code quality, not the performance critical aspect of it.

Besides, none of the projects I've worked on had FactoryFactories, so let's cut that song right here, right now.

> Projects where performance is critical are coded in C/C++, not in Java (databases)

HBase, Cassandra, VoltDB

I believe the core of VoltDB (the in memory storage engine) is written in C++. See here: https://github.com/VoltDB/voltdb/blob/master/src/frontend/or...

"VoltDB mainly consists of Java modules for easier development. However, the core EE and its underlying on-memory storage system is implemented in C++. Nowadays Java and C++ have almost same performance for many cases, but C++ still outperforms Java on low-level memory accesses. This is why we have EE and its storage system written in C++."

Actually, my favorite bits are runtime linking, JMX, VisualVM and, most recently, FlightRecorder + Java Mission Control, which has to be one of the coolest profilers I've seen. But I was just talking about performance.
Don't forget the javaagent and the ability to instrument the JVM. Even .NET/CLR doesn't have that (they have a low-level COM/Profiler API, but boy, that takes a lot of effort to work with).
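For anyone who hasn't used it, a rough sketch of what a javaagent looks like with the standard `java.lang.instrument` API. The names `CountingAgent`, `CountingTransformer` and `CLASSES_SEEN` are invented for this example. It only counts the classes offered to the transformer and rewrites nothing.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical agent: counts classes passed to the transformer, changes nothing. */
class CountingAgent {
    static final AtomicLong CLASSES_SEEN = new AtomicLong();

    static final class CountingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader,
                                String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            CLASSES_SEEN.incrementAndGet();
            // Returning null tells the JVM to keep the original bytecode.
            return null;
        }
    }

    /** Entry point the JVM calls before main() when started with -javaagent. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new CountingTransformer());
    }
}
```

To actually run it, the class goes into a jar whose manifest has a `Premain-Class: CountingAgent` entry, and the target JVM is started with `-javaagent:` followed by the path to that jar. In a real agent you would normally declare the class `public` in its own file. A transformer that wants to rewrite bytecode returns a new `byte[]` instead of `null`, typically produced with a bytecode library.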
How could I forget? I personally use runtime instrumentation all the time: https://github.com/puniverse/quasar
Sorry, I don't mean to hijack the thread or anything like that, but I work for AppNeta and we're in the APM field, instrumenting various platforms. If you ever need instrumentation tools for your full-stack infrastructure, check us out :).
Almost agree, but minor nitpick: the tools and infrastructure (at least most of them) exist on other platforms in the same or better quality as well. Especially dependency management is not necessarily a strong side of Java (one global namespace for types etc.).
> one global namespace for types

I am not entirely sure what you mean by that, but in Java, a runtime type is determined as a combination of its name and its class-loader. You can (and do) have two versions of the same class running alongside one another in different modules of your code.

Granted, supporting this stuff is not easy, but hopefully the long-awaited module system planned (currently...) for Java 9 will make this both powerful and easy.
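A small sketch of the "name plus class-loader" point, in case it helps. `Payload` and `IsolatingLoader` are hypothetical names, and the code assumes the compiled `Payload.class` is reachable as a resource on the classpath (the usual case when compiling with `javac`).

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical class that gets defined by several loaders below. */
class Payload {}

/** Hypothetical loader that defines Payload itself instead of delegating to its parent. */
class IsolatingLoader extends ClassLoader {
    private static final String TARGET = Payload.class.getName();

    IsolatingLoader(ClassLoader parent) {
        super(parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        if (!name.equals(TARGET)) {
            return super.loadClass(name, resolve); // everything else: normal parent delegation
        }
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null) {
                byte[] bytes = readClassBytes(name);
                c = defineClass(name, bytes, 0, bytes.length);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    private byte[] readClassBytes(String name) throws ClassNotFoundException {
        String resource = name.replace('.', '/') + ".class";
        try (InputStream in = getParent().getResourceAsStream(resource)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            return in.readAllBytes(); // Java 9+
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}

class ClassIdentityDemo {
    /** True if two loaders yield same-named but distinct runtime types. */
    static boolean sameNameDifferentTypes() {
        ClassLoader app = ClassIdentityDemo.class.getClassLoader();
        try {
            Class<?> a = new IsolatingLoader(app).loadClass(Payload.class.getName());
            Class<?> b = new IsolatingLoader(app).loadClass(Payload.class.getName());
            return a.getName().equals(b.getName()) && a != b && a != Payload.class;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException("Payload.class not found as a resource", e);
        }
    }
}
```

Each `IsolatingLoader` instance defines its own copy of `Payload`, so the JVM ends up with three distinct `Class` objects that share one name. That is the mechanism that lets two versions of the same class coexist, and also why casting between the copies fails with a `ClassCastException`.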

Well, the OSGi module system supports this now and is quite powerful. Oh wait, you also said easy.
I debated including that example, but in the end the point I was making is that the JVM can be faster than native code in some cases. This case had static allocation of data, and the calculations could benefit from MMX instructions when available, plus there was a logical unrolling that I think it was able to infer from code path analysis.
And superword optimization, when possible...
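For reference, the kind of loop that superword (auto-vectorization) in HotSpot's C2 compiler is aimed at looks roughly like the sketch below. `SaxpyLoop` is a made-up name, and whether the JIT actually emits SIMD instructions for it depends on the JVM version, the CPU and the flags in use, so treat this as an illustration of the loop shape, not a guarantee.

```java
/** Hypothetical example of a simple counted loop over primitive arrays. */
class SaxpyLoop {
    /** Computes y[i] = a * x[i] + y[i] for every index both arrays share. */
    static void saxpy(float a, float[] x, float[] y) {
        int n = Math.min(x.length, y.length);
        // Straight-line body, unit stride, no calls: a shape a vectorizer can handle.
        for (int i = 0; i < n; i++) {
            y[i] = a * x[i] + y[i];
        }
    }
}
```

The loop only becomes a candidate once the method is hot enough to be compiled by C2, which ties back to the warm-up cost mentioned at the top of the thread.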