Hacker News
by cogman10 63 days ago
> I still don't get why Java is the only language that needs the heap to be carefully tuned.

The only tuning you should need to do is setting the heap size and the GC algorithm (though the size alone is likely enough).
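As a rough illustration of what that tuning amounts to, here is a small sketch that prints the heap limits a running JVM actually picked up. The class name `HeapInfo` and its helper methods are made up for this example; `Runtime.maxMemory()`, `totalMemory()` and `freeMemory()` are the standard APIs.

```java
// Hypothetical helper for illustration: reports the heap limits of the running JVM.
final class HeapInfo {
    /** Converts a byte count to whole mebibytes (rounds down). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Maximum heap the JVM will attempt to use (roughly what -Xmx controls). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (MiB):          " + toMiB(rt.maxMemory()));
        System.out.println("committed heap (MiB):    " + toMiB(rt.totalMemory()));
        System.out.println("free in committed (MiB): " + toMiB(rt.freeMemory()));
    }
}
```

On Java 11 or later this can be run straight from source, e.g. `java -Xms512m -Xmx512m -XX:+UseG1GC HeapInfo.java`, where `-Xms`/`-Xmx` set the initial and maximum heap size and `-XX:+UseG1GC` picks the collector. The exact numbers printed depend on the JVM and collector in use.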

> Like it hogs some memory at start, crashes if you go above a certain amount, and doesn't return memory to the OS when GC'd. Even Python and JS don't have those problems.

Unlike Python and, I believe, most JavaScript engines, the JVM uses moving garbage collectors. That's the primary reason it hogs memory.

In those quick-to-return-memory languages, allocating or freeing something literally calls "malloc" and "free" directly. That's why memory tends to go back to the OS faster.

The JVM doesn't do that. When a GC runs in the JVM, the JVM picks up and moves live data to a new location. That means the JVM needs a minimum amount of free space to operate. The benefit is that the JVM can allocate really fast: it's just a single pointer bump and a check to ensure there's enough space. That's pretty close to the performance of stack allocation in C++.
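To make the "pointer bump and a check" concrete, here is a toy sketch of bump allocation over a byte array. `BumpArena` is a made-up name and this is not how HotSpot is actually implemented; it only shows the shape of the fast path.

```java
// Toy bump allocator: a byte array stands in for a contiguous heap region.
final class BumpArena {
    private final byte[] space; // the region being allocated from
    private int top = 0;        // offset of the next free byte

    BumpArena(int capacity) {
        space = new byte[capacity];
    }

    /**
     * Returns the offset of the new block, or -1 if it does not fit.
     * A real JVM would trigger a GC on failure instead of returning -1.
     */
    int allocate(int size) {
        if (size < 0 || size > space.length - top) {
            return -1;          // the "is there enough space" check
        }
        int addr = top;
        top += size;            // the single pointer bump
        return addr;
    }

    /** Bytes handed out so far. */
    int used() {
        return top;
    }

    /** Frees everything at once, which is what evacuating a region buys you. */
    void reset() {
        top = 0;
    }
}
```

Note that there is no per-object free operation at all; space only comes back when the whole region is reset.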

And if there's a lot of data that lives for a short period of time, the JVM frees that data very fast as well. There is little accounting that the JVM has to do to free stuff up because it's simply moving the live data.
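A minimal sketch of why that is cheap, assuming a simple copying collector (real JVM collectors are far more involved): the work done is proportional to the live objects reachable from the roots, and garbage is never visited. `ToyCopyingGc` and `Obj` are hypothetical names for this example.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy copying collector: evacuates live objects into a fresh "to-space".
final class ToyCopyingGc {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    /**
     * Copies everything reachable from roots into a new to-space and updates
     * the (mutable) roots list to point at the copies. Unreachable objects
     * are never touched.
     */
    static List<Obj> collect(List<Obj> roots) {
        List<Obj> toSpace = new ArrayList<>();
        Map<Obj, Obj> forwarding = new IdentityHashMap<>();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, copy(roots.get(i), toSpace, forwarding));
        }
        // Cheney-style scan: walk to-space, fixing references as we go.
        for (int scan = 0; scan < toSpace.size(); scan++) {
            Obj o = toSpace.get(scan);
            for (int i = 0; i < o.refs.size(); i++) {
                o.refs.set(i, copy(o.refs.get(i), toSpace, forwarding));
            }
        }
        return toSpace;
    }

    private static Obj copy(Obj old, List<Obj> toSpace, Map<Obj, Obj> forwarding) {
        Obj existing = forwarding.get(old);
        if (existing != null) {
            return existing;         // already evacuated
        }
        Obj fresh = new Obj(old.name);
        fresh.refs.addAll(old.refs); // still old references; fixed during the scan
        forwarding.put(old, fresh);
        toSpace.add(fresh);
        return fresh;
    }
}
```

If one root keeps two objects alive and a thousand others are garbage, `collect` copies exactly two objects and the old space can then be reused wholesale.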

Even for the fastest allocators that Python/JavaScript engines use, this isn't true. They have to keep track of the various allocation locations and the gaps left behind when something is freed. And a request for allocation ultimately needs to find a location in the heap with enough room.
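For contrast, here is a toy first-fit free-list allocator showing the bookkeeping being described. This is a sketch of the general malloc-style technique, not the allocator any particular Python or JavaScript engine uses; `FreeListHeap` is a made-up name, and the caller is assumed to pass the block size back on release.

```java
import java.util.ArrayList;
import java.util.List;

// Toy first-fit allocator: tracks free gaps and has to search them on every request.
final class FreeListHeap {
    private static final class Block {
        int start;
        int size;
        Block(int start, int size) { this.start = start; this.size = size; }
    }

    private final List<Block> free = new ArrayList<>(); // kept sorted by start offset

    FreeListHeap(int capacity) {
        free.add(new Block(0, capacity));
    }

    /** Returns the offset of a block of the given size, or -1 if no gap is big enough. */
    int allocate(int size) {
        if (size <= 0) return -1;
        for (int i = 0; i < free.size(); i++) { // search for the first gap that fits
            Block b = free.get(i);
            if (b.size >= size) {
                int addr = b.start;
                b.start += size;
                b.size -= size;
                if (b.size == 0) free.remove(i);
                return addr;
            }
        }
        return -1;
    }

    /** Returns a block to the free list and merges it with adjacent gaps. */
    void release(int start, int size) {
        int i = 0;
        while (i < free.size() && free.get(i).start < start) i++;
        free.add(i, new Block(start, size));
        if (i + 1 < free.size()) {              // merge with the following gap
            Block cur = free.get(i);
            Block next = free.get(i + 1);
            if (cur.start + cur.size == next.start) {
                cur.size += next.size;
                free.remove(i + 1);
            }
        }
        if (i > 0) {                            // merge with the preceding gap
            Block prev = free.get(i - 1);
            Block cur = free.get(i);
            if (prev.start + prev.size == cur.start) {
                prev.size += cur.size;
                free.remove(i);
            }
        }
    }

    int freeBlockCount() {
        return free.size();
    }
}
```

Fragmentation shows up quickly: with a 100-byte heap, three 30-byte allocations, and only the first one released, 40 bytes are free in total but a 40-byte request fails because no single gap is large enough.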

Java does have a memory issue, though: all objects in Java are pretty bulky. This will hopefully be fixed in future versions when "value" types are added.


Thanks, that's a really good explanation. And it makes sense, especially since Java objects are all constantly getting allocated/freed on the (virtual) heap rather than the stack.

No problem.

The GC strategy for Java works best when the JVM has a lot of memory to play with. That's a big reason why a lot of companies use it for their backends.

However, Java suffers when you start talking about small heaps. This has become a much bigger issue as containerized applications have risen as a primary deployment method. There are active efforts to solve this problem and make Java friendlier to smaller memory footprints and containers in general.

The Go/Python/JavaScript strategies end up working better in those situations. They have very fast startups and pretty low memory requirements. However, when you start talking about apps that need a lot of memory, they all end up suffering as their allocation strategies degrade as the memory being tracked grows, especially if there's a large amount of memory churn. The JVM has about the best strategy for very high memory churn.

Yeah, the Java way makes sense if it's the only thing running on that machine, or at least if you know ahead of time how much RAM to budget for each thing, which was often the case on servers. I'm not surprised if that performs better than Go in a way, but it seems like Go does OK. If they really wanted a custom heap on top of preallocated memory in a Go program, couldn't they just do that?

The weirder part is that Java also used to be a bigger thing client-side, back when websites commonly included Java applets.