Hacker News
by jim33442 72 days ago
I still don't get why Java is the only language that needs the heap to be carefully tuned. Like it hogs some memory at start, crashes if you go above a certain amount, and doesn't return memory to the OS when GC'd. Even Python and JS don't have those problems.

> I still don't get why Java is the only language that needs the heap to be carefully tuned.

The only tuning you should need to do is setting the heap size and the GC algorithm (though size alone is likely enough).
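As a rough illustration (a sketch, not something from the original comment), the program below prints the heap limits the JVM is actually running with. The class name `HeapInfo` and the helper `toMiB` are made up for this example; `-Xms`, `-Xmx` and `-XX:+UseG1GC` are standard HotSpot options for initial heap size, maximum heap size and collector choice.

```java
// After compiling, run with e.g.:
//   java -Xms256m -Xmx512m -XX:+UseG1GC HeapInfo
public class HeapInfo {
    // Hypothetical helper: converts bytes to whole mebibytes.
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most the JVM will try to use (roughly -Xmx).
        System.out.println("max heap:       " + toMiB(rt.maxMemory()) + " MiB");
        // totalMemory(): heap currently held by the JVM.
        System.out.println("committed heap: " + toMiB(rt.totalMemory()) + " MiB");
        // freeMemory(): unused part of the heap currently held.
        System.out.println("free in heap:   " + toMiB(rt.freeMemory()) + " MiB");
    }
}
```

Changing `-Xmx` and rerunning should move the first number, which is usually the only knob that matters.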

> Like it hogs some memory at start, crashes if you go above a certain amount, and doesn't return memory to the OS when GC'd. Even Python and JS don't have those problems.

Unlike Python and, I believe, most JavaScript engines, the JVM uses moving garbage collectors. That's the primary reason it hogs memory.

In languages that readily return memory to the OS, allocating or freeing something literally calls "malloc" or "free" directly. That's why memory tends to get returned to the OS faster.

The JVM doesn't do that. When a GC runs in the JVM, the JVM picks up and moves live data to a new location. That means the JVM needs a minimum amount of free space to operate. The benefit is that the JVM can allocate really fast: it's just a single pointer bump and a check to ensure there's enough space. That's pretty close to the performance of stack allocation in C++.
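A toy version of "pointer bump plus a space check" might look like the sketch below. This is not how HotSpot is implemented internally; `BumpArena` and its methods are hypothetical names, and a plain `byte[]` stands in for the pre-reserved region.

```java
public class BumpArena {
    private final byte[] space; // stand-in for the pre-reserved region
    private int top = 0;        // offset of the next free byte

    public BumpArena(int capacity) {
        space = new byte[capacity];
    }

    /** Returns the offset of the new block, or -1 if there is not enough room. */
    public int allocate(int size) {
        // The "check to ensure there's enough space".
        if (size < 0 || size > space.length - top) {
            return -1;
        }
        int offset = top;
        top += size; // the single pointer bump
        return offset;
    }

    public int used() {
        return top;
    }

    public static void main(String[] args) {
        BumpArena arena = new BumpArena(64);
        System.out.println(arena.allocate(16)); // prints 0
        System.out.println(arena.allocate(16)); // prints 16
        System.out.println(arena.allocate(64)); // prints -1, only 32 bytes remain
    }
}
```

Note there is no per-block bookkeeping at all, which is why the region has to be reserved up front.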

And if there's a lot of data that lives for a short period of time, the JVM frees that data very fast as well. There is little accounting that the JVM has to do to free stuff up because it's simply moving the live data.
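To see why dead data costs almost nothing, here is a deliberately simplified sketch of a copying pass. It assumes a made-up `Node` type and a list of names standing in for the new space; a real collector copies raw memory and fixes up pointers, which this does not attempt.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class CopyingSketch {
    /** Hypothetical heap object: a name plus outgoing references. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Walks only what is reachable from the roots and "copies" it to a new space. */
    static List<String> evacuate(List<Node> roots) {
        List<String> toSpace = new ArrayList<>();
        Set<Node> copied = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.poll();
            if (!copied.add(n)) {
                continue; // already moved
            }
            toSpace.add(n.name);
            work.addAll(n.refs);
        }
        return toSpace; // unreachable objects were never visited at all
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        Node garbage = new Node("garbage");
        a.refs.add(b);
        garbage.refs.add(b); // garbage points at live data, but nothing points at it
        System.out.println(evacuate(List.of(a))); // prints [a, b]
    }
}
```

The loop never touches `garbage`, so the work scales with what survives rather than with what was allocated.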

Even for the fastest allocators that Python/JavaScript engines use, this isn't true. They have to keep track of the various allocation locations and the gaps left behind when something is freed. And a request for allocation ultimately needs to find a location in the heap with enough room.
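For contrast, a minimal first-fit free-list allocator is sketched below. It is only an illustration of the bookkeeping described above, not the allocator any particular Python or JavaScript engine uses; `FreeListSketch` is a hypothetical name, and adjacent gaps are intentionally not merged.

```java
import java.util.ArrayList;
import java.util.List;

public class FreeListSketch {
    // Each entry is {offset, length} of a free gap, kept in address order.
    private final List<int[]> gaps = new ArrayList<>();

    public FreeListSketch(int capacity) {
        gaps.add(new int[] {0, capacity});
    }

    /** First fit: scan the gaps for one that is big enough. Returns offset or -1. */
    public int allocate(int size) {
        if (size <= 0) {
            return -1;
        }
        for (int i = 0; i < gaps.size(); i++) {
            int[] g = gaps.get(i);
            if (g[1] >= size) {
                int offset = g[0];
                g[0] += size;
                g[1] -= size;
                if (g[1] == 0) {
                    gaps.remove(i); // gap fully consumed
                }
                return offset;
            }
        }
        return -1;
    }

    /** Freeing must record the hole so it can be reused (no merging in this sketch). */
    public void free(int offset, int size) {
        int i = 0;
        while (i < gaps.size() && gaps.get(i)[0] < offset) {
            i++;
        }
        gaps.add(i, new int[] {offset, size});
    }

    public int gapCount() {
        return gaps.size();
    }

    public static void main(String[] args) {
        FreeListSketch heap = new FreeListSketch(64);
        heap.allocate(16);         // occupies 0..16
        int b = heap.allocate(16); // occupies 16..32
        heap.allocate(16);         // occupies 32..48
        heap.free(b, 16);          // leaves a hole at 16..32
        System.out.println(heap.allocate(32)); // prints -1: 32 bytes are free, but not contiguous
        System.out.println(heap.allocate(8));  // prints 16: reuses the hole
    }
}
```

Both the scan in `allocate` and the list maintenance in `free` are costs a bump allocator never pays, and the failed 32-byte request shows the fragmentation problem that moving collectors avoid.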

Java does have a memory issue, though: all objects in Java are pretty bulky. This will hopefully be fixed in future versions when "value" types are added.

Thanks, that's a really good explanation. And it makes sense, especially since Java objects are all constantly getting allocated/freed on the (virtual) heap rather than the stack.

No problem.

The GC strategy for Java works best when the JVM has a lot of memory to play with. That's a big reason why a lot of companies use it for their backends.

However, Java suffers when you start talking about small heaps. This has become a much bigger issue as containerized applications have risen as a primary deployment method. There are active efforts to solve this problem and make Java friendlier to smaller memory footprints and containers in general.

The Go/Python/JavaScript strategies end up working better in those situations. They have very fast startups and pretty low memory requirements. However, when you start talking about apps that need a lot of memory, they all end up suffering as their allocation strategies degrade as the memory being tracked grows, especially if there's a large amount of memory churn. The JVM has about the best strategy for very high memory churn.

Yeah, the Java way makes sense if it's the only thing running on that machine, or at least you know ahead of time how much RAM to budget for each thing, which was often the case on servers. I wouldn't be surprised if that performs better than Go in some ways, but Go seems to do OK. If they really wanted a custom heap on top of preallocated memory in a Go program, couldn't they just do that?

The weirder part is that Java also used to be a bigger thing client-side, back when websites commonly included Java applets.

You don't have to do anything, it just works.

But eventually someone's gonna write the goddamn whole of AWS or Alibaba in the language, on machines with TBs of heap (yeah, you read that right), where most other managed languages would just give up instantly. Then you may have to add 2-3 parameters to make it actually run properly in these extreme conditions.

They do, but then people either work around them, or rewrite in Rust.

Java has all these knobs because the ultimate goal is to not need a rewrite, just fine-tuning, much like the endless command-line options for GCC, clang, MSVC, ...

It is also a matter of implementation: Android is Java (kind of), and there you don't get any knobs to push unless you are a developer talking directly to a single device over ADB.