Hacker News new | ask | show | jobs
by 762236 207 days ago
Why would I want to run a garbage collector and deal with it's performance penalties?
4 comments

Because about 99% of the time the garbage collect is a negligible portion of your runtime at the benefit of a huge dollop of safety.

People really need to stop acting like a garbage collector is some sort of cosmic horror that automatically takes you back to 1980s performance or something. The cases where they are unsuitable are a minority, and a rather small one at that. If you happen to live in that minority, great, but it'd be helpful if those of you in that minority would speak as if you are in the small minority and not propagate the crazy idea that garbage collection comes with massive "performance penalties" unconditionally. They come with conditions, and rather tight conditions nowadays.

I think these threads attract people that write code for performance-critical use cases which explains the "cosmic horror" over pretty benign things. I agree though: most programs aren't going to be brought to their knees over some GC sweeps every so often.
Outside of hobbyist things, performance-critical code is the only responsible use case for a non-memory safe language like C in 2025, so of course it does. (Even that window is rapidly closing, though; languages like Rust and Swift can be better than C for perf-critical things because of the immutability guarantees.)
Productivity, portability, stability, mind-share, direct access to OS APIs... there's a lot of reasons to still use C.
Only if the OS is written in C, and has its APIs exposed as C APIs to userspace.

Quite a few OSes don't fit that rule.

Could you name two of these that are important to you?
I keep hearing this, but I fail to see why "the massive, well-maintained set of critical libraries upon which UNIX is based" is not a good reason to use C in 2025.

I have never seen a language with a better ffi into C than C.

> the massive, well-maintained set of critical libraries upon which UNIX is based

What massive, maintained set is that? Base Unix is tiny, and any serious programming ecosystem has good alternatives for all of it.

> Outside of hobbyist things, performance-critical code is the only responsible use case for a non-memory safe language like C in 2025, so of course it does.

Maybe; I sometimes write non-hobbyist non-performance-critical code in C.

I'm actually planning a new product for 2026 that might be done in C (the current iteration of that product line is in Go, the previous iteration was in Python).

I've few qualms about writing the server in C.

> I've few qualms about writing the server in C.

Bad Unicode support. Lack of cross platform system libraries. Needing to deal with CMake / autotools / whatever. Poor error handling. No built in string, list or map types. No generics. Nullability. No sum types. No option, tuples or multi returns. Generally worse IDE support than a lot of languages. No good 3rd party package ecosystem. The modern idiocy of header files. Memory bugs. Debugging memory corruption bugs. …

I mean, yeah other than all those problems, C is a great little language.

> Bad Unicode support. Lack of cross platform system libraries. Needing to deal with CMake / autotools / whatever. Poor error handling. No built in string, list or map types. No generics. Nullability. No sum types. No option, tuples or multi returns. Generally worse IDE support than a lot of languages. No good 3rd party package ecosystem. The modern idiocy of header files. Memory bugs. Debugging memory corruption bugs. …

You make some good, if oft-repeated, points; but for my product:

1. Bad Unicode support - I'm not sure what I will use this for; glyphs won't be handled by a server program and storage/search of UTF8/codepoints will be handled by the data store (PostgreSQL, if you must know).

2. CMake/autotools/etc - low list of 3rd party dependencies, so a plain Makefile works.

3. Worse IDE support than a lot of languages - not sure what you mean by this. C has LSP support, like every other language. I haven't noticed C support in editors to be worse than other languages.

4. No 3rd party package ecosystem - That's fine, I'm not pulling in many 3rd party packages, so those that are pulled in can be handled with the Makefile and manual updates.

5. The modern idiocy of header files - this confuses me; there is still no good alternative to header files to support exporting to a common ABI. Functions, written in C, will be callable from any other language because header files are automatically handled by swig for FFI.[1]

6. Memory bugs + debugging them - thankfully, using valgrind, then sanitisers in my build/test step makes this a very low priority for me. Not that bugs don't slip through, but single-exit error handling using goto's and cleanups make these kinds of bugs rare. Not impossible, but rare. Having the test steps include valgrind, then various sanitisers reduces the odds even more.

For the rest, yeah, nice to have "No built in string, list or map types. No generics. Nullability. No sum types. No option, tuples or multi returns. ", but those are optional to getting a product out. If C had them I'd use them, but I'm not exactly helpless without them.

The downside of writing a product in C, in 2025, isn't in your list above.

========================================

[1] One of my two main reasons for switching to C is because the product was so useful to paying clients that they'd like more functionality, which includes "use their language of choice to interact with the product.". Thus far I've hacked in solutions depending on which client wanted what, but there's limits to the hacked-in solutions.

IOW, "easily extendable by clients using their language of choice" is a hard product requirement. If it wasn't a hard requirement they can continue using the existing product.

> I've few qualms about writing the server in C.

Why are you not worried about becoming the next Cloudbleed? Do you believe you have superhuman programming abilities?

> Why are you not worried about becoming the next Cloudbleed?

The odds are just too low.

> Do you believe you have superhuman programming abilities?

I do not believe I have superhuman abilities.

I think these threads attract people like that, but also people that want to be like that. I've seen a lot of people do "rigor theater", where things like reproduce-able builds, garbage collection, or, frankly, memory safety are just thought terminating cliches.
> Because about 99% of the time the garbage collect is a negligible portion of your runtime

In a system programming language?

Whether or not GC is a negligible portion of your runtime is a characteristic of your program, not your implementation language. For 99% of programs, probably more, yes.

I have been working in GC languages for the last 25 years. The GC has been a performance problem for me... once. The modal experience for developers is probably zero. Once or twice is not that uncommon. But you shouldn't bend your entire implementation stack choice over "once or twice a career" outcomes.

This is not the only experience for developers, and there are those whose careers are concentrated in the places where it matters... databases, 100%-utilization network code, hardware drivers. But for 99% of the programs out there, whatever language they are implemented in, GC is not an important performance consideration. For the vast bulk of those programs, there is a much larger performance consideration in it that could be turned up in 5 minutes with a profiler and nobody has even bothered to do that and squeeze out the accidentally quadratic code because even that doesn't matter to them, let alone GC delays.

This is the "system programmer's" equivalent of the web dev's "I need a web framework that can push 2,000,000 requests per second" and then choosing the framework that can push 2,001,000 rps over the one that can push 2,000,000 because fast... when the code they are actually writing for the work they are actually doing can barely push 100 rps. Even game engines nowadays have rather quite a lot of GC in them. Even in a system programming language, and even in a program that is going to experience a great deal of load, you are going to have to budget some non-trivial optimization time to your own code before GC is your biggest problem, because the odds that you wrote something slower than the GC without realizing it is pretty high.

> Whether or not GC is a negligible portion of your runtime is a characteristic of your program, not your implementation language.

Of course, but how many developers choose C _because_ it does not have a GC vs developers who choose C# but then work around it with manual memory management and unsafe pointers? ....... It's > 1000 to 1

There are even new languages like C3, Odin, Zig or Jai that have a No-GC-mindset in the design. So why you people insist that deliberately unsafe languages suddenly need a GC? There a other new languages WITH a GC in mind. Like Go. Or pick Rust - no GC but still memory safe. So what's the problem again? Just pick the language you think fits best for a project.

There's plenty of application-level C and C++ code out there that isn't performance-critical, and would benefit from the safety a garbage collector provides.
Right, does `sudo` net benefit from removal of heap corruption, out of bounds, or use after free, etc errors that GC + a few other "safeties" might provide? I think so!
Yes, plenty have been done already so since Lisp Machines, Smalltalk, Interlisp-D, Cedar, Oberon, Sing#, Modula-2+, Modula-3, D, Swift,....

It is a matter to have an open mindset.

Eventually system languages with manual memory management will be done history in agentic driven OSes.

"Agentic driven OSes"? Sounds like AI hype babble.
What does this have to do with system languages with manual memory management going away?
Swift, by design, does not have GC.
Chapter 5,

https://gchandbook.org/contents.html

It would help if all naysayers had their CS skills up to date.

RC is a GC method and the least efficient one.
It's the most predictable and has much less overhead than a moving collector.
> Because about 99% of the time the garbage collect is a negligible portion of your runtime

lol .. reality disagrees with you.

https://people.cs.umass.edu/~emery/pubs/gcvsmalloc.pdf#:~:te...

On page 3 they broadly conclude that if you use FIVE TIMES as much memory as your program would if managed manually, you get a 9% performance hit. If you only use DOUBLE, you get as much as a 70% hit.

Further on, there are comprehensive details on the tradeoffs between style of GC vs memory consumption vs performance.

---

Moving a value from DRAM into a CPU register is an expensive operation, both in terms of latency, and power consumption. Much of the code out in the "real world" is now written in garbage collected languages. Our datacenters are extremely power hungry (as much as 2% of total power in the US is consumed by datacenters), and becoming more so every day. The conclusion here is that garbage collection is fucking expensive, in real-world terms, and we need to stop perpetuating the idea that it's not.

Methodology seems kind of dubious:

> We introduce a novel experimental methodology that lets us quan- tify the performance of precise garbage collection versus explicit memory management. Our system allows us to treat unaltered Java programs as if they used explicit memory management by relying on oracles to insert calls to free. These oracles are generated from profile information gathered in earlier application runs.

What specifically seems dubious about that? I thought it was quite a clever idea.
If you dig into the paper, on page 3 they find out that their null oracle approach (ie without actually freeing the memory) increases run times erratically by 12 to 33%. They then mention that their simulated approach should handle that case but it seems unlikely to me that their stats aren't affected. Also they disable multi-threading – again for repeatability – but that will obviously have a performance impact.
For new projects, I just use Rust: there is zero reason to deal with a garbage collector today. If I'm in C, it's because I care about predictable performance, and why I'm not using Java for that particular project.
I don't understand. This is an optional part of Rust. I'm obviously not using a garbage collector in Rust.
Not really sure what relevance a third party library has when discussing the language characteristics.
The Java stop-the-world garbage collector circa the late 90s/early 2000s traumatized so many people on automated garbage collection.
IDK about Fil-C, but in Java garbage collector actually speeds up memory management compared to C++ if you measure the throughput. The cost of this is increased worst-case latency.

A CLI tool (which most POSIX tools are) would pick throughput over latency any time.

I see this claim all the time without evidence, but it's also apples and oranges. In C++ you can avoid heap allocations so they are rare and large. In java you end up with non stop small heap allocations which is exactly what you try to avoid when you want a program to be fast.

Basically java gc is a solution to a problem that shouldn't exist.

> in Java garbage collector actually speeds up memory management compared to C++ if you measure the throughput

If I had a dollar for every time somebody repeated this without real-world benchmarks to back it up...

I wish I could upvote this 100 times
Java (w/ the JIT warmed up) could possibly be faster than C++, if the C++ program were to allocate every single value on the heap.

But you're never going to encounter a C++ program that does that, since it makes no sense.

Entertaining anecdote:

I once worked on a python program that was transpiled to C++, and literally every variable was heap allocated (because that's what python does). It was still on the order of 200x faster than python IIRC.

You also pay for the increased throughput with significant memory overhead, in addition to worst-case latency.
This. The memory overhead kills you in large systems/OS-level GC. Reducing the working set size really matters in a complex system to keep things performant, and GC vastly expands the working set.

In the best cases, you’re losing a huge amount of performance vs. an equivalent non-GC system. In the worst, it affects interactive UI performance with multi-second stalls (a suitably modern GC shouldn’t do this, though).

Depending on the CLI tool you could even forego memory management completely and just rely on the OS to clean up. If your program completely reads arbitrary files into memory it's probably not the best idea, but otherwise it can be a valid option. This is likely at least partly what happens when you run a benchmark like this - the C++ one cleans everything up nicely if you use smart pointers or manual memory management, while the Java tool doesn't even get to run GC at all, or if it does it only cleans up a percentage of the objects instead of all of them.
Because C is very unsafe, but there are still many billions of lines of C in use, so making C safer is a great idea.
Easy: because in your specific use-case, it's worth trading some performance for the added safety.
If I'm in C, I'm using JNI to work around the garbage collector of Kava
Have you ever measured the performance impact of JNI? :-)