|
|
|
|
|
by Doradus
4095 days ago
|
|
Well, it's hard to deny that the work done by the program is O(n) (or even Ω(n)) in the number of objects created by the program. That's almost tautological. The thing that is interesting about the asymptotic complexity of garbage collection, though, is that it gives you a peek into the future of GC overhead as heaps get larger. Consider a program with some bounded steady-state memory consumption, and a GC that takes O(1) time to free unlimited amounts of memory. In such a system, you can reduce your GC overhead arbitrarily simply by enlarging the heap. A larger heap gives a larger time interval between collections, yet doesn't make the collections any slower. Don't like 6% overhead? Fine. Make the heap 10x larger and you have 0.6% overhead. This time-space trade-off is always available to the end user of such a system. It's certainly not available with more primitive systems like malloc+free. [These are my personal opinions, not those of my employer.] |
|
So no, I don't see the advantage you say. I see that, in the best case, it's on par with non-GC'd systems, asymptotically speaking.
And yet, I see plenty of disadvantages. To start:
I have yet to see a GC that's O(1) w.r.t. the amount of memory to free. Best case? I'll grant you that, for an extremely naive copying collector. But the chance of that best case occurring is... not significant. Especially with the more sophisticated algorithms.
And again, as above, it doesn't matter how long it takes to free memory. As long as it's linear or sublinear, it'll be dominated, asymptotically speaking, by how long it takes to initialize that memory again. And in steady-state operation it will be reinitialized.
Not to mention the practical effects.
For instance: GC thrashes cache. Badly. It completely kills the concept of a working set. You can't work around that by making your heap larger - if anything it makes it worse. It makes the time period between thrashings longer, yes, but makes them worse when it does happen.
GC also doesn't scale well to large numbers of cores. In particular, your worst case "always" will be when the GC has to check with every core to see if it still has a reference to something. And given that processor numbers are where most of the performance improvements seem to be coming in... You mentioned memory size improvements but didn't mention increasing numbers of cores.
GC also (almost always) has field(s) per object. Minimum of O(1) overhead per object initialized.
Which means that, generally speaking, the overhead doesn't asymptote to zero. It asymptotes to a constant - and non-negligible, at least from what I've seen - amount of overhead.