by dave809 4530 days ago
An interesting post on C++-like memory management in Java:

http://vanillajava.blogspot.ca/2014/01/sunmiscunsafe-and-off...

This is also a good post, along the same lines: http://mechanical-sympathy.blogspot.com/2012/10/compact-off-...
The most interesting quote:

"Lately, I was comparing standard data structures between Java and .Net. In some cases I observed a 6-10X performance advantage to .Net for things like maps and dictionaries when .Net used native structure support."

.Net can be much faster than Java because Java has to put things on the heap that could be on the stack.

> .Net can be much faster than Java because Java has to put things on the heap that could be on the stack.

This is a famous fallacy (except in some very specific circumstances). The reason for the performance difference is probably due to packed structures and better CPU cache behavior.

Stack/heap makes very little difference in most circumstances for two reasons: first, Java heap allocation is as fast as stack allocation – it is a thread-local pointer bump. Deallocation for short-lived objects is free. You do incur an indirect cost of triggering a young-generation GC which causes a pause linearly related to the size of new-but-not-so-young objects. These pauses add up to very little (under 1% of total time).

The second reason is that stack memory is usually a very small percentage of the total memory for interesting programs. Programs manipulate data. Nontrivial programs usually manipulate lots of it, otherwise we wouldn't need so many gigabytes of RAM. A thread's stack is rarely over a few megabytes in size, so to have a significant portion of your data on the stack you would need tens of thousands of threads, which the OS can't handle anyway.
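For anyone unsure what "pointer bump" means here, a toy sketch follows. It is not HotSpot code, and the `BumpArena` class and its methods are made up for illustration. The only point is that allocation amounts to a bounds check plus an addition, and that discarding everything at once costs the same no matter how many objects were allocated.

```java
// Toy model of bump-pointer allocation in a thread-local buffer.
// Hypothetical names; this is an illustration, not actual JVM code.
final class BumpArena {
    private final int capacity; // total bytes in the buffer
    private int top = 0;        // offset of the next free byte

    BumpArena(int capacity) {
        this.capacity = capacity;
    }

    /** Returns the offset of the new block, or -1 if the buffer is full. */
    int allocate(int size) {
        if (size < 0 || size > capacity - top) {
            return -1; // a real VM would take a slow path here (new buffer or GC)
        }
        int start = top;
        top += size; // the "bump"
        return start;
    }

    /** Discards every allocation at once by resetting the offset. */
    void reset() {
        top = 0;
    }

    int used() {
        return top;
    }
}
```

In this model `reset()` plays the role of reusing the young generation after a collection: dead objects are never visited individually.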

To sum up: 1) Java's handling of short-lived memory is extremely fast; 2) all interesting data lives on the heap anyway.

Stack allocation can make a difference in reducing the number of young gen GCs, but it's not a huge difference in most circumstances. Packing your heap data structures better is more important.
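A rough sketch of what "packing" can look like in plain Java, assuming a simple 2D point type (the `PackedLayoutSketch` class and its method names are hypothetical). Both methods compute the same sum; the difference is that the first chases one reference per point, while the second walks a single contiguous `double[]`. No timings are claimed here, since those depend on the JVM, the heap layout, and the data size.

```java
// Two layouts for the same data. Hypothetical example, not a benchmark.
final class PackedLayoutSketch {
    // One heap object per point: the array holds references,
    // and the fields live wherever the allocator placed each object.
    static final class Point {
        final double x, y;
        Point(double x, double y) { this.x = x; this.y = y; }
    }

    static double sumObjects(Point[] pts) {
        double s = 0;
        for (Point p : pts) {
            s += p.x + p.y; // each iteration follows a reference
        }
        return s;
    }

    // Packed: x and y of point i sit at indices 2*i and 2*i+1
    // of one contiguous array.
    static double sumPacked(double[] xy) {
        double s = 0;
        for (int i = 0; i < xy.length; i += 2) {
            s += xy[i] + xy[i + 1];
        }
        return s;
    }

    public static void main(String[] args) {
        int n = 1_000;
        Point[] pts = new Point[n];
        double[] xy = new double[2 * n];
        for (int i = 0; i < n; i++) {
            pts[i] = new Point(i, 2 * i);
            xy[2 * i] = i;
            xy[2 * i + 1] = 2 * i;
        }
        // Same result, different memory access pattern.
        System.out.println(sumObjects(pts) == sumPacked(xy));
    }
}
```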

> pause linearly related to the size of new-but-not-so-young objects

Stop and copy still has to traverse all objects that contain pointers, even ones in the nursery (otherwise you'd incorrectly free objects in the nursery that are only referenced by other objects in the nursery). So the amount of scanning you do is proportional to all objects in the nursery (also the size of your stack/root set), and only the copy is linearly related to objects to be tenured.

> The second reason is that stack memory is usually a very small percentage of the total memory for interesting programs.

A small percentage of the long-lived memory, sure. But programs spend a lot of their time working on ephemeral data, and this is precisely why the generational hypothesis holds. There's no faster nursery than the stack: allocation fits nicely into the function prolog and deallocation requires no traversal of interior pointers.

I hear this sort of claim over and over again. Yet whenever I've had to optimize Java code that does heavy math (particularly work with vectors, matrices, complex numbers, etc.), the number one optimization has been to more or less maintain my own small object stack, or to manually set aside and reuse objects within individual functions. We're talking 10x speed improvements from making sure that the garbage created per iteration is near zero, as opposed to just making a mess and letting the GC sort it out.

These are objects that have basically no-op constructors, and never stay live for more than a method call, which makes me think that the observed speed boosts must have something to do with GC. Am I missing something?
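For concreteness, the pattern described above might look like the sketch below. The `Vec3` type and the method names are hypothetical, not taken from any particular library. The first loop creates one temporary per iteration; the second writes into a caller-supplied object instead. Whether the JIT's escape analysis removes the temporaries in the first version depends on the JVM and on inlining, so the sketch makes no claim about the size of the speedup.

```java
// Allocating vs. reusing style for small math objects.
// Hypothetical types for illustration only.
final class ScratchReuseSketch {
    static final class Vec3 {
        double x, y, z;

        Vec3 set(double x, double y, double z) {
            this.x = x;
            this.y = y;
            this.z = z;
            return this;
        }
    }

    // Allocating style: returns a fresh Vec3 on every call.
    static Vec3 add(Vec3 a, Vec3 b) {
        return new Vec3().set(a.x + b.x, a.y + b.y, a.z + b.z);
    }

    // Reusing style: the caller supplies the destination, so nothing is allocated.
    // Safe when out aliases a or b, because all sums are computed before set() writes.
    static Vec3 addInto(Vec3 a, Vec3 b, Vec3 out) {
        return out.set(a.x + b.x, a.y + b.y, a.z + b.z);
    }

    static double sumAllocating(Vec3[] vs) {
        Vec3 acc = new Vec3();
        for (Vec3 v : vs) {
            acc = add(acc, v); // the previous acc becomes garbage each iteration
        }
        return acc.x + acc.y + acc.z;
    }

    static double sumReusing(Vec3[] vs) {
        Vec3 acc = new Vec3();
        for (Vec3 v : vs) {
            addInto(acc, v, acc); // acc is both input and output
        }
        return acc.x + acc.y + acc.z;
    }
}
```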

Yes, vectors are one of those cases where stack allocation has a significant advantage. This will hopefully be resolved (along with cache-related issues) when value types are introduced, possibly in Java 9.
Sadly I'm not holding my breath. For the past ten years we've been upvoting requests for value types, being told that they were not necessary because the JVM was brilliant already, then after much argument and ample proof being told that stack allocation in the JVM was a better solution, then having it delayed, finding out that the implementation didn't really work that well, etc. Sun has been pretty shit about prioritizing this issue, perhaps Oracle will be better, but it's a far shittier company than even Sun was so I'm skeptical.

It's also possible I'm just raging because this is a problem that has been well known for well over a decade but has been played down by the folks in charge of the JVM every time it comes up, even though it remains a real problem with the platform.

By having the most frequently used data grouped on the, as you note, "small" stack instead of spread all over the heap, you get much better cache use. Moreover, even if each allocation is fast, a lot of them still add up, compared with the case where nothing needs to be done at run time because everything was worked out at compile time.

If you haven't witnessed these effects I suspect it's only because you haven't had the reference to compare with.

I believe that when people complain about "lack of stack allocation" in Java they sometimes mean "lack of value/non-reference types," which in other languages (at least C#) are frequently but not always stored on the stack.