by chrisseaton 3535 days ago
I agree with you, but in this

> std::string x = "Hello, " + fullname; // cpu cycles spent on GC

you are already spending cycles on memory management (if C++ allocates character data on the heap, which I think it does). You are searching for free space in the heap, possibly synchronising to do that, and so on.

With a GC you may even use fewer cycles here! For example, a copying GC could mean that you can allocate with a simple thread-local bump pointer.
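To make the bump-pointer idea concrete, here is a minimal sketch of what such an allocation fast path can look like. `BumpArena` and `thread_arena` are hypothetical names made up for illustration, not part of any real collector, and the sketch assumes a fixed 64 KiB buffer per thread. A real copying GC would evacuate live objects when the buffer fills; this sketch just returns `nullptr`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration of bump-pointer allocation in a per-thread arena.
struct BumpArena {
    static constexpr std::size_t kSize = 64 * 1024;  // assumed arena size
    alignas(std::max_align_t) unsigned char buffer[kSize];
    std::size_t offset = 0;

    // Fast path: round the offset up, check the bound, bump.
    // `align` must be a power of two no larger than alignof(std::max_align_t).
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        std::size_t aligned = (offset + align - 1) & ~(align - 1);
        if (aligned > kSize || size > kSize - aligned) {
            return nullptr;  // a real collector would trigger a collection here
        }
        offset = aligned + size;
        return buffer + aligned;
    }

    // Discards everything at once; individual objects are never freed.
    void reset() { offset = 0; }
};

// One arena per thread, so the fast path needs no locking.
inline BumpArena& thread_arena() {
    thread_local BumpArena arena;
    return arena;
}
```

The whole fast path is an add, a mask, and a compare, with no free-list search and no synchronisation. The cost that this sketch leaves out is what happens when the arena is full.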

So in your statement you are already paying an unknown cycle cost for memory management. Why do you care if it's GC?

Your answer is probably the variance in the number of cycles - the noticeable pauses - which is a reasonable concern.

5 comments

Yes, but in this case we know when the allocations will occur, and when they will be freed. If using a GC we know when they will occur, but do not know when they will be freed. Which means that at some indeterminate point in the future there will be a large temporary slowdown due to processing the GC.

This is one of the bigger reasons people use C++, and even use techniques within it to explicitly collect such items at a known point in time (for example, marking items in an array as dead but keeping them there until the end of the frame).
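A minimal sketch of that mark-dead-then-sweep technique, assuming a simple game-style entity list. `Entity`, `EntityList`, `kill`, and `sweep` are hypothetical names chosen for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical entity type; only the `alive` flag matters for the technique.
struct Entity {
    int id;
    bool alive;
};

class EntityList {
public:
    void add(int id) { items_.push_back(Entity{id, true}); }

    // During the frame: only flip a flag. Nothing is erased or freed here.
    void kill(int id) {
        for (Entity& e : items_) {
            if (e.id == id) e.alive = false;
        }
    }

    // At a known point (e.g. end of frame): compact the array in one pass.
    // Returns the number of entries removed.
    std::size_t sweep() {
        auto first_dead = std::remove_if(
            items_.begin(), items_.end(),
            [](const Entity& e) { return !e.alive; });
        std::size_t removed =
            static_cast<std::size_t>(items_.end() - first_dead);
        items_.erase(first_dead, items_.end());
        return removed;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<Entity> items_;
};

// Usage: one simulated frame.
inline void example_frame() {
    EntityList list;
    list.add(1);
    list.add(2);
    list.add(3);
    list.kill(2);              // mid-frame: only a flag flips
    assert(list.size() == 3);  // nothing erased yet
    list.sweep();              // end of frame: dead entries compacted away
    assert(list.size() == 2);
}
```

The point is that the cleanup cost lands at a place the programmer picked, rather than wherever a collector decides to run.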

> If using a GC we know when they will occur, but do not know when they will be freed. Which means that at some indeterminate point in the future there will be a large temporary slowdown due to processing the GC.

This just isn't true anymore. Incremental collectors can achieve pause times in the single-digit millisecond range, and concurrent collectors can achieve pause times in the single-digit to tens of microseconds range, even for super-high-allocation rate programs. There are even real-time collectors suitable for audio and other hard real-time applications.

Azul GC (a high performance concurrent compacting collector): https://www.azul.com/products/zing/pgc/

Metronome (a real-time GC with guaranteed maximum pause times): http://researcher.watson.ibm.com/researcher/view_group_subpa...

GC scheduling in V8 (hides GC pause times between frames, reducing animation jank): http://queue.acm.org/detail.cfm?id=2977741

Go now ships with a very low-latency GC: https://blog.golang.org/go15gc

> This just isn't true anymore. Incremental collectors can achieve pause times in the single-digit millisecond range

Single digit milliseconds is millions of instructions, which /is/ a large slowdown in some applications.

How much do you think a page fault costs?
Page faults cost zero. On locked pages.
A millisecond is a huge amount of time, and in that time we can do so many more things more useful to the application than just collecting its trash.
Again downvotes. It'd be nice if people actually replied instead of downvoting.
This is the advantage of C++ for me. Deterministic behaviour.
But is it deterministic? Will the allocator always have a deterministic amount of work to do to find enough free space for your string characters? I'm not sure that's the case.
Which allocator? For those who care, they replace the allocator.
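For reference, swapping the allocator under a standard string can be sketched roughly like this. `CountingAllocator`, `allocation_count`, and `CountedString` are hypothetical names for illustration. This version just forwards to `malloc`/`free` and counts calls, where a real replacement would plug in a pool or arena with known worst-case behaviour.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Global counter shared by all instantiations (illustrative, not thread-safe).
inline std::size_t& allocation_count() {
    static std::size_t count = 0;
    return count;
}

// Hypothetical minimal allocator: forwards to malloc/free and counts calls.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        if (n > static_cast<std::size_t>(-1) / sizeof(T)) throw std::bad_alloc();
        ++allocation_count();
        void* p = std::malloc(n * sizeof(T));
        if (p == nullptr) throw std::bad_alloc();
        return static_cast<T*>(p);
    }

    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};

// All instances are interchangeable, so they always compare equal.
template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// A std::string look-alike that allocates through the custom allocator.
using CountedString =
    std::basic_string<char, std::char_traits<char>, CountingAllocator<char>>;

// Usage: a long string must go through our allocator at least once.
inline void example_counting() {
    std::size_t before = allocation_count();
    CountedString s(1000, 'x');
    assert(allocation_count() > before);
}
```

Whether the result is deterministic then depends entirely on what `allocate` does, which is the question being asked above.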
You can deactivate the GC momentarily and reclaim memory when you choose (end of frame, every ten frames, ...). Most of the time you can write your code to minimize allocation, or make sure memory is allocated on the stack. With some parameter tuning and a little profiling, you can get stable memory usage over time and a bounded GC time.
malloc/free/etc. will only cost when they're called; GC can be a continuous expense even when there's no collection.
It is a matter of how it is implemented.
Most GCs will not cost you a single cycle if you never allocate.
This is technically correct (though in most GCs, if you allocate and keep even a single byte, you pay for it with various barriers etc. forever). But precisely because these languages have good GCs like this, almost every GC language in use allocates all the time.

So it would be more accurate to say "Most GCs will not cost you a single cycle if you and the underlying language runtime and execution environment do not allocate".

I.e., your statement, while true in theory, makes literally no practical difference if allocations happen all the time without you knowing or controlling it.

But in most GC languages there is nothing you can do without allocating. Creating an object is already allocating it on the heap, printing a string will also allocate.
Not in GC languages that also have value types.

For example, you can do all of that in Modula-3, Active Oberon, Component Pascal and many others without allocating more than you would in C.

Lumping all GC-enabled languages together is a mistake.

I don't know why they downvote you. Even in Java, which has no value types yet, there are ways to write useful code with little or no managed heap allocation. And if you don't need ultra-low pauses, mixing these techniques (e.g. using the managed heap for the 80% of code that is not performance-critical and careful manual dynamic allocation for the remaining 20%, such as large data buffers) typically gets you all the good things at once: high throughput, low pauses, almost no GC activity when it matters, and convenience in writing most of the code.
I just recently wrote my own memory allocator for my own String. Now my String is at least 2x faster than the next fastest alternative (I have tried many alternatives for C++ strings, including std::string of course). String allocation can be made very fast with thread-local memory pools (and you just need a basic GC to free up memory if a lot of strings are allocated in one thread but destroyed on another one).
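As a rough idea of what a thread-local pool can look like (this is a generic sketch, not the allocator described above, whose details were not posted): `BlockPool` and `thread_pool` are hypothetical names, and the 64-byte block size and 256-block chunk size are arbitrary assumptions. Blocks freed on a different thread than the one that allocated them are deliberately not handled here; that is the case the comment above says needs extra machinery.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical fixed-size block pool with an intrusive free list.
class BlockPool {
public:
    static constexpr std::size_t kBlockSize = 64;        // assumed block size
    static constexpr std::size_t kBlocksPerChunk = 256;  // assumed chunk size

    // Pops a block off the free list; grabs a new chunk only when empty.
    void* allocate() {
        if (free_list_ == nullptr) refill();
        Node* node = free_list_;
        free_list_ = node->next;
        return node;
    }

    // Pushes the block back; must be called on the owning thread.
    void deallocate(void* p) {
        Node* node = static_cast<Node*>(p);
        node->next = free_list_;
        free_list_ = node;
    }

private:
    // A free block stores the next pointer in its own storage.
    union Node {
        Node* next;
        alignas(std::max_align_t) unsigned char bytes[kBlockSize];
    };

    void refill() {
        std::unique_ptr<Node[]> chunk(new Node[kBlocksPerChunk]);
        Node* raw = chunk.get();
        chunks_.push_back(std::move(chunk));  // pool keeps ownership of the chunk
        for (std::size_t i = 0; i < kBlocksPerChunk; ++i) {
            raw[i].next = free_list_;
            free_list_ = &raw[i];
        }
    }

    std::vector<std::unique_ptr<Node[]>> chunks_;
    Node* free_list_ = nullptr;
};

// One pool per thread, so allocate/deallocate need no locking.
inline BlockPool& thread_pool() {
    thread_local BlockPool pool;
    return pool;
}
```

In the common case both operations are a couple of pointer moves, which is where the speedup over a general-purpose allocator would come from.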
It is very likely that in this particular case small string optimization would keep the data inside the string object itself, on the stack, and no heap allocation would take place.
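One way to check this on a given standard library is to test whether the character data pointer falls inside the string object. This is a sketch: `stored_inline` is a hypothetical helper, and the inline capacity is implementation-specific, so the answer for a short string depends on the library in use.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Returns true when the character data lives inside the std::string object
// itself, i.e. small string optimization is in effect for this value.
inline bool stored_inline(const std::string& s) {
    const char* object_begin = reinterpret_cast<const char*>(&s);
    const char* object_end = object_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less gives a well-defined ordering even for unrelated pointers.
    return !std::less<const char*>()(data, object_begin) &&
           std::less<const char*>()(data, object_end);
}

// Usage: a string far larger than sizeof(std::string) cannot fit inline.
inline void example_sso() {
    std::string big(1000, 'x');
    assert(!stored_inline(big));
}
```

Calling `stored_inline` on the result of `"Hello, " + fullname` for a typical name would show whether the line from the original post touches the heap at all on your implementation.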
> For example a copying GC could mean that you can allocate with a simple thread local bump pointer.

This is equally true for explicit memory allocation. The point is that on some allocations under GC, it will have to collect garbage. And collecting garbage will tend to be more expensive than explicit frees, because it usually has to do work to discover what is garbage.