Hacker News
by dmpk2k 1181 days ago
The post glosses over the most important part of Erlang's GC: it collects process heaps separately. This transforms a hard problem (collecting a global heap with low latency despite concurrent mutators) into a _much_ simpler problem, at the price of more copying. Compare Java's G1 with Erlang's GC; the former hurts my head.

For those problems that are amenable to Erlang's model, this is a fine solution. The only real improvement here would be making collection incremental.

7 comments

Erlang also has reference counters for things like strings that are immutable and can be shared between threads (processes in Erlang).

Overall this is a good model. Use GC for small per-green-thread heaps. Then use reference counters for shared immutable structures that cannot form cycles, and copy everything else.

Erlang only uses reference counting for binaries larger than 64 bytes, everything else is allocated on the process heap (or in heap fragments) and copied. Just that is enough to have a beneficial effect though, since large binaries are relatively common in practice, and are frequently passed around from process-to-process.
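The copy-versus-share rule described above can be sketched as a toy model. This is not how the BEAM is implemented; `Process`, `SharedBinary`, `make_binary` and `send` are hypothetical names invented for the illustration, and only the 64-byte cutoff comes from the comment itself.

```python
# Toy model only: real BEAM binaries and heaps differ in many details.
REFC_THRESHOLD = 64  # bytes; the cutoff mentioned in the comment above


class Process:
    """Stand-in for an Erlang process with its own private heap."""

    def __init__(self, name):
        self.name = name
        self.heap = []  # values owned by this process alone


class SharedBinary:
    """A large binary stored once, outside any process heap, and refcounted."""

    def __init__(self, data):
        self.data = data
        self.refcount = 0


def make_binary(data):
    """Box large payloads as shared, refcounted binaries; leave small ones plain."""
    return SharedBinary(data) if len(data) > REFC_THRESHOLD else data


def send(receiver, value):
    """Deliver a value: share large binaries by reference, copy everything else."""
    if isinstance(value, SharedBinary):
        value.refcount += 1  # only a reference lands on the receiver's heap
        receiver.heap.append(value)
    else:
        receiver.heap.append(bytes(bytearray(value)))  # explicit copy


a, b = Process("a"), Process("b")
small = make_binary(b"x" * 10)
large = make_binary(b"y" * 1000)
for proc in (a, b):
    send(proc, small)
    send(proc, large)

print(large.refcount)          # 2
print(a.heap[1] is b.heap[1])  # True: both heaps point at the same large binary
```

In this sketch the small payload is duplicated into each heap, while the large one exists once and only its reference count changes.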
And if you have any other global state you want to pass around, you can pull off a clever trick by passing it around as a binary and then unpacking it as needed within caller processes.

AFAIK this trick is why BEAM files use an IFF-derived format (easy to parse individual chunks out at runtime), and why erlang:module_info/{1,2} are the way they are: working with module metadata literally just means asking the code-server process for the (shared refcounted) module binary, and then parsing it yourself.

I thought Erlang's garbage collector was incremental by virtue of being per process. A system may have tens of thousands of processes, using a gigabyte of memory overall, but if GC occurs in a process with a 20K heap, then the collector only touches that 20K and collection time is imperceptible. With lots of small processes, you can think of this as a truly incremental collector.
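A rough way to picture this: the work done by one collection is bounded by the heap of the process being collected, not by total memory. The sketch below is a deliberately simplified model (the `Process` class, its `roots` set and the object counts are all made up for the example), not a description of the actual BEAM collector.

```python
class Process:
    """Toy process with a private heap and a set of live (rooted) object ids."""

    def __init__(self, pid):
        self.pid = pid
        self.heap = {}      # object id -> payload, private to this process
        self.roots = set()  # ids that are still reachable

    def alloc(self, obj_id, payload, live=True):
        self.heap[obj_id] = payload
        if live:
            self.roots.add(obj_id)

    def collect(self):
        """Keep only live objects; return how many heap entries were examined."""
        examined = len(self.heap)
        self.heap = {k: v for k, v in self.heap.items() if k in self.roots}
        return examined


# 10,000 small processes, 20 objects each, half of them garbage.
procs = [Process(pid) for pid in range(10_000)]
for p in procs:
    for i in range(20):
        p.alloc(i, b"payload", live=(i % 2 == 0))

total_objects = sum(len(p.heap) for p in procs)
work = procs[42].collect()

print(total_objects)  # 200000
print(work)           # 20
```

Collecting process 42 examines 20 entries and leaves the other 9,999 heaps untouched, which is the sense in which many small heaps behave like an incremental collector for the system as a whole.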

It's not incremental per process, but I'm not sure it would even matter that much in practice.

Yes, that is how it works, except (as you implicitly note) that large heaps in single processes can cause problems; allowing incremental collection per heap would flatten the latency profile further.
Large GC jobs get scheduled on dirty schedulers today (a background thread pool), since it's not OK to block a normal scheduler more than 1ms or so in Erlang. If they could be split into smaller chunks of work, perhaps it could be done on normal schedulers, making time allocation more fair.
Another point is that, due to Erlang's immutability, there cannot be pointers from the old generation into the nursery, and thus the GC does not need write barriers.
Wouldn't Erlang be much more efficient if it simply compiled to the JVM?
Almost 10 years ago, I tested Erjang [1] using a medium-sized application. Throughput was better than BEAM but latency was terrible.

[1] https://github.com/trifork/erjang/

Ten years ago was two whole technological generations ago in the implementation of OpenJDK's GCs. OpenJDK now has a maximum pause time of under 1ms for heaps up to 16TB.
I really strongly doubt that GC is a bottleneck for Erlang programs on either the BEAM or the JVM. The sophistication of the scheduler, and the way various language primitives interact with it, is where the BEAM is almost certainly gaining an edge over the JVM. That said, I'm sure there is a subset of programs that _would_ be faster on the JVM; it just depends on what metrics are being compared.
> the sophistication of the scheduler, and the way various language primitives interact with it

That was brought over to the JDK six months ago. The JDK can now spawn millions of Erlang-like processes ("virtual threads") per second.

Erlang is a great inspiration and it does incredibly well with the development resources available to it, but it's hard to compete with the level of engineering investment in the JDK and its state-of-the-art GCs, optimising JIT compilers, and low-overhead in-production tracing and profiling.

The BEAM is so much more than just a green thread runtime, and it is not straightforward at all to just "bring it over" to another virtual machine. The entire BEAM VM is designed around the scheduler, and core features such as signals (used most notably for links/monitors, but also for a number of other system features) and messaging are deeply integrated with the scheduler, as are system tasks, execution of NIFs (natively implemented functions), and more. Furthermore, the schedulers are adaptive to the amount of work in the system and to the behavior of the other schedulers, i.e. they aren't just simple work-stealing queues for green threads. Over 20 years of engineering effort have been invested in this system, and it is highly optimized for the types of workloads that Erlang is used for.

In my opinion, trying to replicate ERTS on top of another virtual machine either requires making that virtual machine more like the BEAM, or will end up always playing catch up with the BEAM itself in some aspect. That's just my two cents though.

BEAM doesn't support mutation, the JVM does. That's a big plus in the BEAM's corner.

Also, when two BEAM VMs speak to each other, it's as if they are just one machine: no manual serialization or deserialization of data, no protobufs, etc. It's all just primitives being sent between processes regardless of their location.

Great, so you guys can spend the next 30 years writing OTP to work seamlessly and flawlessly on top of the new JVM with the same (memory, security, reliability, fault tolerance, etc) guarantees OTP has on the BEAM VM today :)
As the other reply noted, I’d be shocked if it wouldn’t be much better now; seeing something like Graal being used would be really interesting. I think if Elixir could target the BEAM or the JVM it would be an amazing language for many tasks.
> it would be an amazing language

No, it wouldn't. Elixir is getting really fast computation through, e.g., Nx, and the user story is incredible (OS install to Stable Diffusion in 40 minutes, most of which is dicking around figuring out how to install CUDA). Is it easy to run Stable Diffusion on the JVM?

I think you might be taking my comment as implying the JVM is better than the BEAM, but that isn’t the argument I’m making. Having a strong JVM option means you can cut through a ton of corporate red tape. I don’t need to convince some skeptical CTO that the JVM is reasonable. It makes choosing Elixir for a project feel about as hard a change as using Clojure or Groovy.
Yeah, and not too many CTOs would choose Clojure or Groovy either, and the ones that do aren't OKing it "because it's on the JVM".
The JVM standard does not support isolates, so it won't work. Java's father Gosling wanted to get isolation into the Java spec, but he failed.

The modern GraalVM does have isolates, but it's a VM-specific feature and not a Java standard feature.

It likely would. But efficiency is only one factor. Many Erlang applications are far more concerned with consistent latency than throughput efficiency. So a switch to the JVM is a lot of cost.
Are these folks also running their software on a real-time OS?
Erlang can be run bare metal.
For more details, take a look at the interview with Francesco Cesarini (https://www.youtube.com/watch?v=-m31ag9z4VY); part of it compares the JVM with the BEAM.
Sure, on a single machine, perhaps. But once you have multiple machines, the JVM would have to do what the BEAM does today: copy messages between processes regardless of location. That's going to slow down throughput.
No.
Ha! Absolutely no
Why not? JVM has a highly optimized concurrent GC.
Erlang has a highly optimized concurrent GC as well. It's just optimized for different things. And maybe the concurrency of the GC is different; Erlang has one heap per process (aka green thread), and no concurrency within a heap.

Erlang GC is also very simple and easy to understand because language features only allow references in one direction. Much of JVM GC complexity would be wasted as there's no need to look for reference loops and such, since they're not possible.
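The one-directional reference property can be illustrated with a small sketch. It assumes a simplified model in which cells are immutable and can only be built from cells that already exist; `heap` and `alloc` are hypothetical names, and nothing here reflects the BEAM's real heap layout.

```python
heap = []  # append-only list of immutable cells; index = allocation order


def alloc(*refs):
    """Allocate an immutable cell that references already existing cells."""
    assert all(r < len(heap) for r in refs), "can only point at existing cells"
    heap.append(tuple(refs))
    return len(heap) - 1


nil = alloc()
a = alloc(nil)
b = alloc(a, nil)
c = alloc(b, a)

# Every reference goes from a newer cell to an older one, so no cycle can
# exist and an older cell can never point at a younger one.
points_backwards = all(
    ref < idx for idx, cell in enumerate(heap) for ref in cell
)
print(points_backwards)  # True
```

Because a cell cannot be changed after it is built, there is no way to make an older cell point at a newer one, which is the property a collector can rely on to skip cycle detection and old-to-young tracking in this model.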

It’s a GC tuned for imperative languages that prefer mutation over allocation, which is the exact inverse of what BEAM needs.
OpenJDK's GCs do have elaborate mechanisms to support mutation, but if they're unused they impose no extra overhead.
I think you may be missing the point. What the JVM doesn't have is elaborate optimizations to support copies, e.g. a generational GC.
This is a good point, thanks! I will extend the topic, or it may be better to write a new article as a continuation of this one, since putting everything in one article would make it longer and more difficult to read.
> the price of more copying.

More copying if you pass values between processes. Honestly it would be really cool if you could mark off certain values that you know you're going to pass around and put them in a heap like the global binary heap.

There are lots of footguns for the user with this model. Because transferring data between processes involves copying, the copying itself can become a problem. Erlang tries to optimise the handling of large binaries by using a separate reference-counted heap. However, this introduces another set of issues where memory is 'leaked', either because a smaller binary is holding a reference to a larger binary, or because processes that have not been GC'd have not decremented the ref count of large binaries in the heap that they no longer use.
You literally listed the two biggest footguns and claimed there are "lots" of footguns. That really is it.
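The first of those footguns, a small binary pinning a much larger one, has a close analogue in Python's `memoryview`, which makes it easy to see. This is an analogy rather than a model of the BEAM; in Erlang the usual remedy for a lingering sub-binary is to copy it out with `binary:copy/1`.

```python
big = bytes(10_000_000)      # stand-in for a large refcounted binary
view = memoryview(big)[:16]  # like a sub-binary: 16 bytes, no copy made

# The 16-byte slice still references the whole 10 MB buffer, so the buffer
# cannot be freed while the slice is alive.
pinned = view.obj is big
pinned_size = len(view.obj)

# Copying the slice out breaks the link to the large buffer.
header = bytes(view)
view.release()
del big  # the 10 MB buffer can now be reclaimed; `header` is independent

print(pinned, pinned_size, len(header))  # True 10000000 16
```

The second footgun, refcounts that are only dropped when the holding process next collects, has no equally direct analogue here, since CPython decrements reference counts immediately.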