Hacker News new | ask | show | jobs
by Ygg2 293 days ago
> It isn't optional, and yet it's also not at any cost, or we'd all be programming in ATS/Idris.

In a better, saner world, we'd writing Ada++ not C++. However, we don't live in a perfect world.

> The goal isn't to write the most correct program; it's to write the most correct program under the project's budget and time constraints.

The goal of ANY software engineer worth their salt should be minimizing errors and defects in their end product.

This goal can be reached by learning to write Rust; practice makes perfect.

If GC is acceptable or you need lower compilation times, then yes, go and write your code in C#, Java, or JavaScript.

2 comments

> In a better, saner world, we'd writing Ada++ not C++.

As someone who worked on safety-critical air-traffic-control software in the nineties, I can tell you that our reasons for shifting to C++ were completely sane. Ada had some correctness advantages compared to C++, but also disadvantages. It had drastically slower build times, which meant we couldn't test the software as frequently, and the language was very complicated that we had to spend more time digging into the minutiae of the language and less time thinking about the algorithm (C++ was simpler back then than it is now). When Java became good enough, we switched to Java.

Build times and language complexity are important for correctness, and because of them, we were able to get better correctness with C++ than with Ada. I'm not saying this is universal and always the case, but the point is that correctness is impacted by many factors, and different projects may find achieving higher correctness in different ways. Trading off fewer use-after-free for longer build times and a more complex language may be a good tradeoff for the correctness of some projects, and a bad tradeoff for others.

> If GC is acceptable or you

BTW, a tracing GC - whose costs are now virtually entirely limited to a higher RAM footprint - is acceptable much more frequently than you may think. Sometimes, without being aware, languages like C, C++, Rust, or Zig may sacrifice CPU to reduce footprint, even when this tradeoff doesn't make sense. I would strongly recommend watching this talk (from the 2025 International Symposium on Memory Management), and the following Q&A about the CPU/footprint tradeoff in memory management: https://www.youtube.com/watch?v=mLNFVNXbw7I

> a tracing GC is acceptable much more frequently than you may think

Two things here, the easy one first: The RAM trade off is excellent for normal sizes but if you scale enormously the trade off eventually reverses. $100 per month for the outfit's cloud servers to have more RAM is a bargain compared to the eye-watering price of a Rust developer. But $1M per month when you own whole data centres in multiple countries makes hiring a Rust dev look attractive after all.

As I understand it this one is why Microsoft are rewriting Office backend stuff in Rust after writing it originally in C#. They need a safe language, so C++ was never an option, but at their scale higher resource costs add up quickly so a rewrite to a safe non-GC language, though not free, makes economic sense.

Second thing though: Unpredictability. GC means you can't be sure when reclamation happens. So that means both latency spikes (nasty in some applications) and weird bugs around delayed reclamation, things running out though you aren't using too many at once, etc. If your only resource is RAM you can just buy more, but otherwise you either have to compromise the language (e.g. a defer type mechanism, Python's "with" construct) or just suck it up. Java tried to fix this and gave up.

I agree that often you can pick GC. Personally I don't much like GC but it is effective and I can't disagree there. However you might have overestimated just how often, or missed more edge cases where it isn't a good choice.

> The RAM trade off is excellent for normal sizes but if you scale enormously the trade off eventually reverses

I don't think you've watched the talk. The minimal RAM-per-core is quite high, and often sits there unused even though it could be used to reduce the usage of the more expensive CPU. You pay for RAM that you could use to reduce CPU utilisation and then don't use it. What you want to aim for is a RAM/CPU usage that matches the RAM/CPU ratio on the machine, as that's what you pay for. Doubling the CPU often doubles your cost, but doubling RAM costs much less than that (5-15%).

If two implementations of an algorithm use different amounts of memory (assuming they're reasonable implementations), then the one using less memory has to use more CPU (e.g. it could be compressing the memory or freeing and reusing it more frequently). Using more CPU to save on memory that you've already paid for is just wasteful.

Another way to think about it is consider the extreme case (although it works for any interim value) where a program, say a short-running one, uses 100% of the CPU. While that program runs, no other program can use the machine, anyway, so if you don't use up to 100% of the machine's RAM to reduce the program's duration, then you're wasting it.

As the talk says, it's hard to find less than 1GB per core, so if a program uses computational resources that correspond to a full core yet uses less than 1GB, it's wasteful in the sense that it's spending more of a more expensive resource to save on a less expensive one. The same applies if it uses 50% of a core and less than 500MB of RAM.

Of course, if you're looking at kernels or drivers or VMs or some sorts of agents - things that are effectively pure overhead (rather than direct business value) - then their economics could be different.

> Second thing though: Unpredictability. GC means you can't be sure when reclamation happens.

What you say may have been true with older generations of GCs (or even something like Go's GC, which is basically Java's old CMS, recently removed after two newer GC generations). OpenJDK's current GCs, like ZGC, do zero work in stop-the-world pauses. Their work is more evenly spread out and predictable, and even their latency is more predictable than what you'd get with something like Rust's reference-counting GC. C#'s GC isn't that stellar either, but most important server-side software uses Java, anyway.

The one area where manual memory management still beats the efficiency of a modern tracing GC (although maybe not for long) is when there's a very regular memory usage pattern through the use of arenas, which is another reason why I find Zig particularly interesting - it's most powerful where modern GCs are weakest.

By design or happy accident, Zig is very focused on where the problems are: the biggest security issue for low-level languages is out-of-bounds access, and Zig focuses on that; the biggest shortcoming of modern tracing GCs is arena-like memory usage, and Zig focuses on that. When it comes to the importance of UAF, compilation times, and language complexity, I think the jury is still out, and Rust and Zig obviously make very different tradeoffs here. Zig's bottom-line impact, like that of Rust, may still be too low for widespread adoption, but at least I find it more interesting.

> As I understand it this one is why Microsoft are rewriting Office backend stuff in Rust after writing it originally in C#

The rate at which MS is doing that is nowhere near where it would be if there were some significant economic value. You can compare that to the rate of adoption of other programming languages or even techniques like unit testing or code review. With any new product, you can expect some noise and experimentation, but the adoption of products that offer a big economic value is usually very, very fast, even in programming.

> What you want to aim for is a RAM/CPU usage that matches the RAM/CPU ratio on the machine, as that's what you pay for.

This totally ignores the role of memory bandwidth, which is often the key bottleneck on multicore workloads. It turns out that using more RAM costs you more CPU, too, because the CPU time is being wasted waiting for DRAM transfers. Manual memory management (augmented with optional reference counting and "borrowed" references - not the pervasive refcounting of Swift, which performs less well than modern tracing GC) still wins unless you're dealing with the messy kind of workload where your reference graphs are totally unpredictable and spaghetti-like. That's the kind of problem that GC was really meant for. It's no coincidence that tracing GC was originally developed in combination with LISP, the language of graph-intensive GOFAI.

> It turns out that using more RAM costs you more CPU

Yes, memory bandwidth adds another layer of complication, but it doesn't matter so much once your live set is much larger than your L3 cache. I.e. a 200MB live set and a 100GB live set are likely to require the same bandwidth. Add to that the fact that tracing GCs' compaction can also help (with prefetching) and the situation isn't so clear.

> That's the kind of problem that GC was really meant for.

Given the huge strides in tracing GCs over the past ten and even five years, and their incredible performance today, I don't think it matters what those of 40+ years ago were meant for, but I agree there are still some workloads - not anything that isn't spaghetti-like, but specifically arenas - that are more efficient than tracing GCs (young-gen works a little like an arena but not quite), which is why GCs are now turning their attention to that kind of workload, too. The point remains that it's very useful to have a memory management approach that can turn the RAM you've already paid for to reduce CPU consumption.

Indeed, we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them (outside of very RAM-constrained hardware, at least).

> The point remains that it's very useful to have a memory management approach that can turn the RAM you've already paid for to reduce CPU consumption.

That approach is specifically arenas: if you can put useful bounds on the maximum size of your "dead" data, it can pay to allocate everything in an arena and free it all in one go. This saves you the memory traffic of both manual management and tracing GC. But coming up with such bounds involves manual choices, of course.

It goes without saying that memory compaction involves a whole lot of extra traffic on the memory subsystem, so it's unlikely to help when memory bandwidth is the key bottleneck. Your claim that a 200MB working set is probably the same as a 100GB working set (or, for that matter, a 500MB or 1GB working set, which is more in the ballpark of real-world comparisons) when it comes to how it's impacted by the memory bottleneck is one that I have some trouble understanding also - especially since you've been arguing for using up more memory for the exact same workload.

Your broader claim wrt. memory makes a whole lot of sense in the context of how to tune an existing tracing GC when that's a forced choice anyway (which, AIUI, is also what the talk is about!) but it just doesn't seem all that relevant to the merits of tracing GC vs. manual memory management.

> we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them

We're certainly seeing a lot of "economic value" being put on modern concurrent GC's that can at least perform tolerably well even without a lot of memory headroom. That's how the Golang GC works, after all.

> (outside of very RAM-constrained hardware, at least)

I've spent much of my career working on desktop software, especially on Windows, and especially programs that run continuously in the background. I've become convinced that it's my responsibility to treat my user's machines as RAM-constrained, and, outside of any long-running compute-heavy loops, to value RAM over CPU as long as the program has no noticeable lag. My latest desktop product was Electron-based, and I think it's pretty light as Electron apps go, but I wish I'd had the luxury of writing it all in Rust so it could be as light as possible (at least one background service is in Rust). My next planned desktop project will be in Rust.

A recent anecdote has reinforced my conviction on this. One of my employees has a PC with 16 GB of RAM, and he couldn't run a VMware VM with 4 GB of guest RAM on that machine. My old laptop, also with 16 GB of RAM, had plenty of room for two such VMs. I didn't dig into this with him, but I'm guessing that his machine is infested with crap software, much of which is probably using Electron these days, each program assuming it can use as much RAM as it wants. I want to fight that trend.

That talk is mainly a GC person assuring you and perhaps themselves that all this churn is actually desirable. While "Actually this was a terrible idea and I regret it" is a topic sometimes (e.g. Tony Hoare has done this more than once) the vast majority of such talks exist to assure us that the speaker was correct, or at worst that they made some brief mistake and have now corrected it. So there's nothing unexpected here, I would not expect something else from the GC maintainer.

The part you've mistaken for being somehow general and relevant to our conversation is about trade-offs within GC. So it's completely irrelevant rather than being, as you seemed to imagine, an important insight. It actually reminds me of the early 1940s British feedback on intelligence for the V2 rocket. British and American Scientists were quite sure that Germany could not develop such a rocket, toy rockets work but at scale this rocket cannot work.

You can buy those toys today, a model store or similar will sell you the basics so you can see for yourself, and indeed if you scale that up it won't make an effective weapon. However intelligence sources eventually revealed how the German V2 was actually fuelled. To us today having seen space rockets it's obvious, the fuel is a liquid not a solid like the toy, which makes fuel loading rather difficult but delivers enormously more thrust. The experts furiously recalculated and discovered that of course these rockets would work unlike the scaled up toy they'd been assessing before. The weapon was very real.

Anyway. The speaker is assuming that we're discussing how much RAM to use on garbage. Because they're assuming a garbage collector, because this is a talk about GC. But a language like Rust isn't using any RAM for this. "The fuel is liquid" isn't one of the options they're looking at, it's not what their talk is even about so of course they don't cover it.

> With any new product, you can expect some noise and experimentation, but the adoption of products that offer a big economic value is usually very, very fast, even in programming.

What you've got here is the perfect market fallacy. This is very, very silly. If you're young enough to actually believe it due to lack of first hand experience then you're going to have a rude awakening, but I think sadly it probably just means you don't like the reality here and are trying to explain it away.

> That talk is mainly a GC person assuring you and perhaps themselves that all this churn is actually desirable. While "Actually this was a terrible idea and I regret it" ...

The speaker is one of the world's leading experts on memory management, and the "mistake" is one of the biggest breakthroughs in software history, which is, today, the leading chosen memory management approach. Tony Hoare has done this when his mistakes became apparent; it's hard to find people who say "it was a terrible idea and I regret it" when the idea has won spectacularly and is widely recognised to be quite good.

> Anyway. The speaker is assuming that we're discussing how much RAM to use on garbage. Because they're assuming a garbage collector, because this is a talk about GC. But a language like Rust isn't using any RAM for this. "The fuel is liquid" isn't one of the options they're looking at, it's not what their talk is even about so of course they don't cover it.

Hmm, so you didn't really understand the talk, then. You can reduce the amount of garbage to zero, and the point still holds. To consume less RAM - by whatever means - you have spend CPU. After programming in C++ for about 25 years, this is obvious to me, as it should be to anyone who does low-level programming.

The point is that a tracing-moving algorithm can use otherwise unused RAM to reduce the amount of CPU spent on memory management. And yes, you're right, usually in languages like C++, Zig, or Rust we typically do not use extra RAM to reduce the amount of CPU we spend on memory management, but that doesn't mean that we couldn't or shouldn't.

> What you've got here is the perfect market fallacy.

A market failure/fallacy isn't something you can say to justify any opinion you have that isn't supported by empirical economics. A claim that the market is irrational may well be true, but it is recognised, even by those who make it, as something that requires a lot of evidence. Saying that you know what the most important thing is and that you know the best way to get it, and then use the fact that most experts and practitioners don't support your position as evidence that it is correct. That's the Galileo complex: what proves I'm right is that people say I'm wrong. Anyway, a market failure isn't something that's merely stated; it's something that's demonstrated.

BTW, one of those times Tony Hoare said he wrong? It was when he claimed the industry doesn't value correctness or won't be able to achieve it without formal proofs. One of the lesssons from that in the software correctness community is to stop claiming or believing we have found the universally best path to correctness, and that's why we stopped doing that in the nineties. Today it's well accepted there can be many effective paths to correctness, and the research is more varied as well.

I started programming in 1988-9, I think, and there's been a clear improvement in quality even since then (despite the growth in size and complexity of software). Rust makes me nostalgic because it looks and feels and behaves like a 1980s programming language - ML meets C++ - and I get its retro charm (and Zig has it, too, in its Sceme meets C sort of way), but we've learnt, what Tony Hoare has learnt, is that there are many valid approaces to correctness. Rust offers one approach, which may be attractive to some, but there are others that may be attractive to more.

> You can reduce the amount of garbage to zero, and the point still holds.

Nah, in the model you're imagining now the program takes infinite time. But we can observe that our garbage free Rust program doesn't take infinite time, it's actually very fast. That's because your model is of a GC system - where ensuring no garbage really would need infinite time and a language without GC isn't a GC with zero garbage, it's entirely different, that's the whole point.

More generally, a GC-less solution may be more CPU intensive, or it may not, and although there are rules of thumb it's difficult to have any general conclusions. If you work on Java this is irrelevant, your language requires a GC, so this isn't even a question and thus isn't worth mentioning in a talk about, again, the Java GC.

> A claim that the market is irrational may well be true

Which makes your entire thrust stupid. You depend upon the perfect market fallacy for your point there, the claim that if this was a good idea people would necessarily already be doing it - once you accept that's a fallacy you have nothing.

> The goal of ANY software engineer worth their salt should be minimizing errors and defects in their end product.

...to the extent possible within their project budget. Otherwise the product would — as GP already pointed out — not exist at all, because the project wouldn't be undertaken in the first place.

> This goal can be reached by learning to write Rust; practice makes perfect.

Pretty sure it could (at least) equally well be reached by learning to write Ada.

This one-note Rust cult is really getting rather tiresome.

> The goal of ANY software engineer worth their salt should be minimizing errors and defects in their end product. > > ...to the extent possible within their project budget.

Sure, but when other engineers discover that shit caused us many defects (e.g., asbestos as a fire insulator), they don't turn around and say, "Well, asbestos sure did cause a lot of cancer, but Cellulose Fibre doesn't shield us from neutron radiation. So it won't be preventing all cancers. Ergo, we are going back to asbestos."

And then you have team Asbestos and team Lead paint quarrelling who has more uses.

That's my biggest problem. The cyclic, Fad Driven Development that permeates software engineering.

> Pretty sure it could (at least) equally well be reached by learning to write Ada.

Not really. Ada isn't that memory safe. It mostly relies on runtime checking [1]. You need to use formal proofs with Ada SPARK to actually get memory safety on par with Rust.

> Pretty sure it could (at least) equally well be reached by learning to write Ada.

See above. You need Ada with Spark. At that point you get two files for each method like .c/.h, one with method definition and one with proof. For example:

    // increment.ads - the proof
    procedure Increment
        (X : in out Integer)
    with
      Global  => null,
      Depends => (X => X),
      Pre     => X < Integer'Last,
      Post    => X = X'Old + 1;

    // increment.adb - the program
    procedure Increment
      (X : in out Integer)
    is
    begin
      X := X + 1;
    end Increment;
But you're way past what you call programming, and are now entering proof theory.

[1] https://ajxs.me/blog/How_Does_Adas_Memory_Safety_Compare_Aga...