|
|
|
|
|
by pron
292 days ago
|
|
> It turns out that using more RAM costs you more CPU Yes, memory bandwidth adds another layer of complication, but it doesn't matter so much once your live set is much larger than your L3 cache. I.e. a 200MB live set and a 100GB live set are likely to require the same bandwidth. Add to that the fact that tracing GCs' compaction can also help (with prefetching) and the situation isn't so clear. > That's the kind of problem that GC was really meant for. Given the huge strides in tracing GCs over the past ten and even five years, and their incredible performance today, I don't think it matters what those of 40+ years ago were meant for, but I agree there are still some workloads - not anything that isn't spaghetti-like, but specifically arenas - that are more efficient than tracing GCs (young-gen works a little like an arena but not quite), which is why GCs are now turning their attention to that kind of workload, too. The point remains that it's very useful to have a memory management approach that can turn the RAM you've already paid for to reduce CPU consumption. Indeed, we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them (outside of very RAM-constrained hardware, at least). |
|
That approach is specifically arenas: if you can put useful bounds on the maximum size of your "dead" data, it can pay to allocate everything in an arena and free it all in one go. This saves you the memory traffic of both manual management and tracing GC. But coming up with such bounds involves manual choices, of course.
It goes without saying that memory compaction involves a whole lot of extra traffic on the memory subsystem, so it's unlikely to help when memory bandwidth is the key bottleneck. Your claim that a 200MB working set is probably the same as a 100GB working set (or, for that matter, a 500MB or 1GB working set, which is more in the ballpark of real-world comparisons) when it comes to how it's impacted by the memory bottleneck is one that I have some trouble understanding also - especially since you've been arguing for using up more memory for the exact same workload.
Your broader claim wrt. memory makes a whole lot of sense in the context of how to tune an existing tracing GC when that's a forced choice anyway (which, AIUI, is also what the talk is about!) but it just doesn't seem all that relevant to the merits of tracing GC vs. manual memory management.
> we're not seeing any kind of abandonment of tracing GC at a rate that is even close to suggesting some significant economic value in abandoning them
We're certainly seeing a lot of "economic value" being put on modern concurrent GC's that can at least perform tolerably well even without a lot of memory headroom. That's how the Golang GC works, after all.