by jmreardon 5470 days ago
High object churn shouldn't be a problem for the JVM, whether it is running Scala or Java code. Allocating lots of little objects and then discarding them shortly after is a use case the JVM is well optimized for. You would have to allocate an extraordinary number of such objects to see any real slowdown.

The Oracle JVM uses generational garbage collection[1]. Glossing over some details, this garbage collector divides objects into generations depending on how long they've been around, and also keeps track of references that cross generations. Objects that are newly created go into the first generation. If the objects don't refer to objects outside of the first generation, it is very cheap for the VM to determine they are in fact garbage. When the garbage collector runs, objects that are still live get copied to another generation. Once the still-live objects are copied out, the VM doesn't need to do any cleanup to get rid of the dead objects; it simply marks the whole memory block the first generation was in as available.
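To make the churn pattern concrete, here is a minimal sketch of the kind of code being discussed. The class and method names (ChurnDemo, Point, churn) are made up for illustration, and the iteration count is arbitrary. Every loop iteration creates objects that are dead by the next iteration, which is the case a young-generation collector is meant to handle cheaply.

```java
public class ChurnDemo {
    // Hypothetical tiny immutable value type; instances become garbage almost immediately.
    static final class Point {
        final long x;
        final long y;

        Point(long x, long y) {
            this.x = x;
            this.y = y;
        }

        // Returns a fresh Point instead of mutating, so every call allocates.
        Point plus(Point other) {
            return new Point(x + other.x, y + other.y);
        }
    }

    // Allocates two short-lived Points per iteration: the argument and the new accumulator.
    static long churn(int iterations) {
        Point acc = new Point(0, 0);
        for (int i = 0; i < iterations; i++) {
            acc = acc.plus(new Point(i, 2L * i));
        }
        return acc.x + acc.y;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = churn(50_000_000); // arbitrary count, chosen only to make GC activity visible
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + " elapsedMs=" + elapsedMs);
    }
}
```

Running it with `java -verbose:gc ChurnDemo` should print the collections as they happen, though the JIT may remove some of these allocations entirely, so the log is only a rough indication.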

[1] Not that other JVMs don't, but I only know the official one.

3 comments

Oracle have put an enormous amount of work into the JRockit GC to support determinism. It's very clever: you can say to it, no GC pauses longer than X milliseconds, and it will adjust its throughput to guarantee that. So you might get lower overall performance, but it will be predictable.

      simply marks the whole memory block the first 
      generation was in as available
That doesn't make any sense.

On that GC run what if you just allocated an object, 0.01ms ago? You mean to tell me that it's either (a) deallocated before being used or (b) moved to the second generation already? Also, you need to basically stop the world for doing first-generation cleanup, as you described it, otherwise you're running into race-problems. That's insane.

Glossing over some details, no matter how efficient the GC is in regards to first generation treatment, you're still accessing the heap, which is an expensive operation, you're still potentially boxing/unboxing primitives in loops and you're also making single-dispatch virtual calls on those short-lived objects.
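As a rough illustration of the boxing point, the sketch below compares a boxed accumulator with a primitive one. The names (BoxingDemo, sumBoxed, sumPrimitive) are hypothetical. Both methods compute the same sum; the boxed version unboxes and re-boxes on every iteration, which may allocate a new Long each time unless the JIT optimizes it away.

```java
public class BoxingDemo {
    // Boxed accumulator: each += unboxes total, adds, then boxes the result into a Long.
    static long sumBoxed(int n) {
        Long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total; // unboxed on return
    }

    // Primitive accumulator: no boxing and no heap allocation in the loop.
    static long sumPrimitive(int n) {
        long total = 0L;
        for (long i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        int n = 10_000_000; // arbitrary size for a quick comparison
        System.out.println(sumBoxed(n) == sumPrimitive(n));
    }
}
```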

C++ has terrible performance when using heap-allocated values, but with stack-allocated objects C++ kicks Java's butt. And that's not the only potential problem for Scala, unless Scala adds some kind of tracing compiler that can avoid boxing/unboxing and runtime-dispatch when not necessary.

I can tell you it isn't being deallocated before being used; that would be a major bug. So, I guess it would get moved, but the GC probably won't run unless the first gen space is pretty full. I'm glossing over details because, well, the GC in the JVM is a very complicated piece of work, and I don't know many of the details, but the JVM does in fact default to a stop-the-world garbage collector[1].

I don't think anyone actually expects the JVM to beat C++ using stack allocated objects[2]. Our concern is how well a garbage collector handles lots of small objects.

[1] Details on how the JVM GC works: http://www.oracle.com/technetwork/java/gc-tuning-5-138395.ht...

[2] There is a switch, which will be on by default one day, that lets the JVM do escape analysis so it can allocate objects on the stack automatically.
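For what footnote [2] is getting at, here is a small sketch of an object that does not escape versus one that does. The names (EscapeDemo, Vec, lengthSquared, make) are invented for the example. On HotSpot the switch is, as far as I know, `-XX:+DoEscapeAnalysis`; whether the allocation is actually eliminated depends on the JIT inlining the calls, so treat this as an illustration rather than a guarantee.

```java
public class EscapeDemo {
    // Hypothetical small value type used only for this illustration.
    static final class Vec {
        final double x;
        final double y;

        Vec(double x, double y) {
            this.x = x;
            this.y = y;
        }

        double dot(Vec other) {
            return x * other.x + y * other.y;
        }
    }

    // v never leaves this method: it is not returned, stored in a field, or handed to
    // another thread, so escape analysis is free to avoid the heap allocation.
    static double lengthSquared(double x, double y) {
        Vec v = new Vec(x, y);
        return v.dot(v);
    }

    // The Vec escapes to the caller here, so it has to be a real heap object.
    static Vec make(double x, double y) {
        return new Vec(x, y);
    }

    public static void main(String[] args) {
        System.out.println(lengthSquared(3.0, 4.0));
        System.out.println(make(1.0, 2.0).x);
    }
}
```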

It all depends on the context. In code that is not performance-sensitive, a bit of object churn isn't a problem.

However, there are places where a lot of object churn can really be problematic. In particular, I find that Android applications tend to be somewhat sensitive to this. However, even in desktop or server applications, a lot of object churn in a tight loop can be bad.

That's because when people talk about the JVM's performance, they refer to Oracle's Java SE, not Android and not Apache Harmony.