Which is reference counting, not garbage collection. Ref counts free when count = 0. Garbage collection scans all object pointers and looks for loops / no missing pointers.
Reference counting is not tracing garbage collection. To also quote a Wikipedia Link:
„The main advantage of the reference counting over tracing garbage collection is that objects are reclaimed as soon as they can no longer be referenced, and in an incremental fashion, without long pauses for collection cycles and with clearly defined lifetime of every object.“
Of course reference counting is not tracing garbage collection. I never said it was. The comment I replied to claimed reference counting was not garbage collection at all and seemed to think tracing garbage collection was the only kind of garbage collection. Reference counting and tracing garbage collection are two different types of garbage collection.
binary-trees is almost completely dominated by the time spent in allocator code, and stresses its throughput. This benchmark showcases how big of a gap is between manual per-thread arenas, then tracing generational multi-heap GCs, then ARC and more specialized designs like Go GC. Honorable mention goes to BEAM which also showcases excellent throughput by having process-level independent GCs, in this case resembling the behavior of .NET's and OpenJDK GC implementations.
A tree is indeed a bad fit for RC; so is anything else where you have multiple references to something but know there is a single real owner.
I'd suggest keeping strong references to all the tree nodes in an array, then having everything within the tree be unowned. Basically fake arena allocation.
is a common way you see toy data structures code written, but it's inefficient (because pointer chasing is slow) and there's better patterns. If you use the arena method above, you could use indexes into the arena. If not, intrusive data structures (where the references are inside Node instead of inside Tree) are better.
Pointer chasing is irrelevant here. It takes <= 15% of the execution time, and CPUs have gotten good at it. If it takes more - it speaks more about the quality of the allocator which has poor memory locality. As noted in my previous comment, it is dominated by the time spent in the allocator/GC code.
The requirement to implement the same algorithm with the same data structure is what makes this benchmark interesting and informative. Don't tell me "allocate parent object A, allocate child objects C, D and E and assign them to A's fields, then allocate array F and G, and assign them to D's fields" isn't the bread and butter of all kinds of application logic, something that this benchmark stresses.
Some CPUs are good at it, but most aren't. (Apple's are.)
But that's not the actual issue; the issue is that pointers are big (8 bytes) and indexes are smaller, so now you can fit more in the cache. It would also help GC because it doesn't have to trace them.
Also, I don't recommend intrusive structures merely because they'd be better for language games. I think they're better in general ;)
> But that's not the actual issue; the issue is that pointers are big (8 bytes) and indexes are smaller, so now you can fit more in the cache. It would also help GC because it doesn't have to trace them.
Please read 'binary-trees' description and submission rules (#2). You are missing the point(er).