That's the short answer, but to elaborate, it probably has more to do with the culture surrounding the language than the overhead of the JVM, although that also plays a role. What I mean by "culture" is the widespread notion (cargo-cult?) among Java programmers that adding more classes and abstraction is always better, almost like a "best practice", leading to "enterprise" monstrosities with deep inheritance hierarchies and ridiculous amounts of indirection to accomplish the simplest of tasks. The low consideration given to memory management in general (there's a GC, but that doesn't mean you should abandon all thought about memory allocation --- an analogy I like to use is how it's possible to get as good fuel economy with an automatic transmission as a manual, with the right technique) also contributes to the bloat.
The JVM itself has a certain amount of unavoidable overhead, but even if it were e.g. 10x slower than native code at best, I don't think that's the main problem. I've used systems that were more than 10x slower in benchmark terms and had less than 1/10th the memory, yet felt much more responsive and performant. The problem is the culture that encourages this massive resource waste and selfish conservation of developers' time --- at the expense of everyone else.
>The JVM itself has a certain amount of unavoidable overhead
That is not a "certain amount of overhead" but an inherent incompatibility with modern hardware. With Java, writing cache-friendly code is extremely difficult: boxing and indirection are encouraged, while primitive types are cumbersome and value types are possible only through direct byte manipulation. Memory overhead is enormous. A simple collection like a hashmap of short strings can have up to 75% overhead.
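To make the boxing point concrete, here is a minimal sketch (the class and method names are made up for illustration). It computes the same sum twice: once over a List<Integer>, where each element is a reference to an Integer object on the heap, and once over an int[], where the values sit contiguously in memory. It only shows the two layouts side by side; it measures nothing, and the actual overhead depends on the JVM.

```java
import java.util.ArrayList;
import java.util.List;

final class BoxingDemo {
    // Builds a boxed list 0..n-1; each element is a reference to an Integer.
    static List<Integer> boxedRange(int n) {
        List<Integer> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(i); // autoboxing: int -> Integer
        }
        return values;
    }

    // Builds a primitive array 0..n-1; the ints are stored inline in the array.
    static int[] primitiveRange(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        return values;
    }

    // Every iteration follows a reference and unboxes the value.
    static long sumBoxed(List<Integer> values) {
        long total = 0;
        for (Integer v : values) {
            total += v;
        }
        return total;
    }

    // Every iteration reads the next int directly from the array.
    static long sumPrimitive(int[] values) {
        long total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(boxedRange(1000)));
        System.out.println(sumPrimitive(primitiveRange(1000)));
    }
}
```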
Didn't want to say it outright, but that's my theory too. Having a VM manage everything in a resource/power constrained environment is a crazy idea in the first place. Oracle's JVM is competitive because of heroic engineering, in spite of Java's design -- not because of it. And it still has tradeoffs, like insane memory usage. ART/Dalvik are operating under different constraints, which probably contributes significantly to Android's handicap.
That can't be the whole story though, because C# and VB.NET both (seem to) perform decently under a managed runtime on Windows Phone. Wonder how big of a role the CLR has in typical WP apps and the WP core, as opposed to unmanaged C/C++ code.
The Go authors had a pretty good article on what's wrong with Java's performance: pointers everywhere. Every last little thing that isn't a primitive type is a pointer. Everywhere, in every bit of code.
That means a "new Object()" takes up 16 bytes (8 bytes for the object, 8 for the pointer to it). That means you fill a cache line by allocating 4 objects, or 2 objects containing a single reference, or ...
So in Java you should never program a line drawing loop by using 2 vectors, because 2 vectors, each with 2 32-bit ints, take up 8*2 (2 pointers to the objects you're using) + 8*2 (overhead for the objects) + 4*2 (the actual data) = 40 bytes of data. No way you can fit that in registers and still use registers to actually calculate things. So instead you should use 4 ints and just forget about the objects, and even that will only work if you never call any functions.
Same loop in C/C++/Pascal/Go/... using structs takes 8 bytes (they don't keep structs on the heap), which, if necessary, fits in 1 register (granted, in practice we're talking 2 registers, but still).
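A rough sketch of the two styles being contrasted, assuming a hypothetical Vec2 class and a Bresenham-style stepping loop (my choice for illustration, not something from the comment above). Both methods walk the same pixels; the first keeps its loop state in heap objects, the second in four plain ints. Whether a given JIT turns the first into something like the second is exactly what is being argued about here, so treat this as an illustration of shape, not a benchmark.

```java
final class LineDemo {
    // Hypothetical 2D vector; every instance is a heap object behind a reference.
    static final class Vec2 {
        int x, y;
        Vec2(int x, int y) { this.x = x; this.y = y; }
    }

    // Object style: the current position lives in a Vec2 on the heap.
    static int countPixelsWithObjects(Vec2 from, Vec2 to) {
        Vec2 cur = new Vec2(from.x, from.y);
        int dx = Math.abs(to.x - from.x), dy = Math.abs(to.y - from.y);
        int sx = from.x < to.x ? 1 : -1, sy = from.y < to.y ? 1 : -1;
        int err = dx - dy;
        int count = 0;
        while (true) {
            count++; // a real loop would plot (cur.x, cur.y) here
            if (cur.x == to.x && cur.y == to.y) break;
            int e2 = 2 * err;
            if (e2 > -dy) { err -= dy; cur.x += sx; }
            if (e2 < dx)  { err += dx; cur.y += sy; }
        }
        return count;
    }

    // Primitive style: the same loop using four ints and no objects.
    static int countPixelsWithInts(int x0, int y0, int x1, int y1) {
        int dx = Math.abs(x1 - x0), dy = Math.abs(y1 - y0);
        int sx = x0 < x1 ? 1 : -1, sy = y0 < y1 ? 1 : -1;
        int err = dx - dy;
        int count = 0;
        while (true) {
            count++; // a real loop would plot (x0, y0) here
            if (x0 == x1 && y0 == y1) break;
            int e2 = 2 * err;
            if (e2 > -dy) { err -= dy; x0 += sx; }
            if (e2 < dx)  { err += dx; y0 += sy; }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countPixelsWithObjects(new Vec2(0, 0), new Vec2(5, 3)));
        System.out.println(countPixelsWithInts(0, 0, 5, 3));
    }
}
```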
People might reply to this with benchmarks, but if you actually analyse the java code where java beats or is comparable with C/C++ you're going to see zero object allocations. You're not even going to see them using bool in the extreme cases, rather they'll bitshift into ints to effectively generate packed bools (certainly in SAT benchmarks). This is not realistic java code, which would have been way slower.
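For anyone who hasn't seen the packed-bool trick, a minimal sketch follows. PackedBools is a made-up name, and I'm using long words rather than ints, but the idea is the same: 64 flags per word instead of one boolean per array slot. The standard library's java.util.BitSet does much the same thing.

```java
final class PackedBools {
    private final long[] words; // 64 flags packed into each long

    PackedBools(int size) {
        words = new long[(size + 63) >>> 6]; // round up to whole words
    }

    void set(int index, boolean value) {
        // For long shifts Java uses only the low 6 bits of the shift count,
        // so 1L << index selects bit (index % 64) within the word.
        long mask = 1L << index;
        if (value) {
            words[index >>> 6] |= mask;
        } else {
            words[index >>> 6] &= ~mask;
        }
    }

    boolean get(int index) {
        return (words[index >>> 6] & (1L << index)) != 0;
    }

    // Number of flags currently set to true.
    int count() {
        int n = 0;
        for (long w : words) {
            n += Long.bitCount(w);
        }
        return n;
    }

    public static void main(String[] args) {
        PackedBools flags = new PackedBools(130);
        flags.set(0, true);
        flags.set(64, true);
        flags.set(129, true);
        System.out.println(flags.get(64) + " " + flags.count());
    }
}
```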
Java's memory model is the main culprit at this point in time. Java can do incredible tricks with programs, and actually exposes them, enabling lots of language creativity on the JVM. But there's a pretty sizeable cost in speed and memory usage.
>People might reply to this with benchmarks, but if you actually analyse the java code where java beats or is comparable with C/C++ you're going to see zero object allocations.
I've noticed that tends to be true in general for benchmarks of high-level languages which show them performing as well as or sometimes even better than C/C++ --- the code performs so well because it's essentially using none of the other language features that most code in the language does. I touch upon this in my other comment here about culture: the language theoretically allows you to write quite efficient code, but it doesn't look "idiomatic" or perhaps isn't a "best practice", so it's discouraged and isn't done. The entire dogma of avoiding any optimisation compounds this problem even more, since once programmers finally realise they have performance issues, they've already created such complex and inefficient code that it's even harder to do any optimisation on.
On the other hand, idiomatic C tends to be written in a simple and straightforward style that is naturally quite efficient already. C++ is similar, although templates, OOP, and all the other new features can lead to inefficient code if not used in moderation.
I suppose the ultimate example of what could be called "intrinsically efficient" is assembly language. With Asm, every instruction, every byte you can save from typing is one the machine also doesn't have to execute, so you're basically forced to optimise as you write. There's certainly no desire to overengineer things, simply because of the extreme tedium and futility of doing so. With no IDE to help you generate classes and autocomplete indirections, it really changes your perspective of what constitutes efficient code.
Only in JVMs and AOT compilers that don't do escape analysis.
Also don't forget that Smalltalk, which likewise only does references, ran on the Alto, Dolphin, and Dorado workstations.
For example the Dorado was:
- 128-512 kB
- 606x808 pixels
- 4 74181 CPUs
So how does that compare to a beefy Android device?
Also J2ME and Embedded Java are running quite well on many embedded platforms, in a few hundred KB, steering soft real-time systems like robots and missile radar controls.
So yes, Java might not offer all the memory control features that other GC enabled languages do, going back to Algol 68, Mesa/Cedar, Eiffel, Modula-3, ....
But given the performance of commercial JVM vendors, I would say Google has a lot of blame as well.
EDIT: Forgot to add that when Java 10 comes out with value types and reified generics (according to the roadmap) this will become a moot point, except of course for Android Java given Google's unwillingness to provide support for the real thing.
"Whenever a new class instance is created, memory space is allocated for it with room for all the instance variables declared in the class type and all the instance variables declared in each superclass of the class type, including all the instance variables that may be hidden (§8.3)."
Doesn't seem to allow for escape analysis eliminating the object. Plus escape analysis wouldn't really save you. These are class instances, you pretty much have to declare them before the scope you use them in, if you're using them in the condition of a while loop (which would be the way to use them).
I seem to have this experience in practice. If you have a value type and loop over it, creating a "dummy" instance of it outside of the loop and then erasing and resetting its inner state on every loop iteration is far faster than creating an instance inside the loop. So I don't think escape analysis optimizes this case.
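In code, the two patterns look roughly like this (Vec2 and the method names are hypothetical, just for illustration). Both return the same result; the only difference is whether an instance is allocated on every iteration or created once and reset. I'm not claiming any particular timing here, since whether the allocation in the first version gets optimized away depends on the JVM and its escape analysis.

```java
final class ReuseDemo {
    // Hypothetical mutable "value-like" class.
    static final class Vec2 {
        int x, y;
        void set(int x, int y) { this.x = x; this.y = y; }
    }

    // Allocates a fresh instance on every iteration.
    static long sumAllocating(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            Vec2 v = new Vec2();
            v.set(i, 2 * i);
            total += v.x + v.y;
        }
        return total;
    }

    // Creates one "dummy" instance up front and resets its state each iteration.
    static long sumReusing(int n) {
        long total = 0;
        Vec2 v = new Vec2();
        for (int i = 0; i < n; i++) {
            v.set(i, 2 * i);
            total += v.x + v.y;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumAllocating(10));
        System.out.println(sumReusing(10));
    }
}
```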
From their Google I/O presentations and their attitude towards NDK users, compared with how other mobile teams deal with their devs, I would say Java runs strong among those who call the shots on the Android team.
Even if some of the code looks like it was written by devs recovering from years of exposure to Hungarian notation.
The NDK is indeed not one of their priorities (although to be fair it seems that things are slowly getting better, with a team dedicated to integrating CLion).
They are smart engineers though, and I have no doubt that if C++ had been the best choice for the platform, we would not be writing apps in Java ...
tbh, I am really tired of the simplistic 'because java' argument with nothing to back it up ...
I have no love for the language (although I think it gets more flak than it deserves), but I have spent a lot of time working on the performance of Android apps and none of the issues I have fixed would have been any different in another language.
I would also have used Java if Oracle hadn't dropped the ball on mobile support, as if they couldn't provide JIT and AOT compilers.
So given that I enjoy C++ when coding on my own, that is what I end up using for hobby coding across mobile platforms. But the NDK and JNI wrapping take the fun out of it.
I am curious: what kind of mobile apps are you working on in your free time?
By design, the NDK can only access a very small part of the platform APIs.
It is not an issue if you are making something where you are supposed to use the NDK (like a drawing app or a game), but for a 'traditional' app, that's another matter.