Hacker News
by 1000100_1000101 661 days ago
Not the strongest on C++ myself, but new[] will attempt to run a constructor on each element after calling operator new[] to get the RAM. delete[] will attempt to run a destructor on each element before calling operator delete[] to free the RAM.
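To make the constructor/destructor point concrete, here is a minimal sketch. The type Counted, the two counters, and the helper ctor_dtor_counts_match are names made up for illustration; nothing here is specific to any standard library.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counters: how many Counted objects were built and torn down.
int g_constructed = 0;
int g_destroyed = 0;

// Hypothetical element type whose constructor and destructor only count calls.
struct Counted {
    Counted() { ++g_constructed; }
    ~Counted() { ++g_destroyed; }
};

// Allocates and frees an array of n Counted objects. Returns true if exactly
// n constructors ran for new[] and exactly n destructors ran for delete[].
bool ctor_dtor_counts_match(std::size_t n) {
    g_constructed = 0;
    g_destroyed = 0;

    Counted* items = new Counted[n];  // get memory, then construct n elements
    const bool built = (g_constructed == static_cast<int>(n));

    delete[] items;                   // destroy n elements, then free memory
    return built && g_destroyed == static_cast<int>(n);
}
```

Calling ctor_dtor_counts_match(5) should report that five constructors and five destructors ran, which is work that malloc and free never do.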

In order for delete[] to work, C++ must track the allocation size somewhere. This could be co-located with the allocation (at ptr - sizeof(size_t) for example), or it could be in some other structure. Using another structure lowers the odds of it getting trampled if/when something writes to memory beyond an object, but comes with a lookup cost, and code to handle this new structure.
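The co-located option can be sketched on top of malloc/free. This is only an illustration of the idea, not how any particular runtime does it: tracked_alloc, tracked_size, tracked_free and kHeaderBytes are hypothetical names, and the header is padded to alignof(std::max_align_t) rather than exactly sizeof(size_t) so the pointer handed back stays suitably aligned.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical header size: big enough for a size_t, and a multiple of the
// strictest fundamental alignment so the user pointer remains aligned.
constexpr std::size_t kHeaderBytes = alignof(std::max_align_t);
static_assert(kHeaderBytes >= sizeof(std::size_t), "header must hold a size_t");

// Allocates `bytes` usable bytes and records the size just before them.
void* tracked_alloc(std::size_t bytes) {
    unsigned char* raw =
        static_cast<unsigned char*>(std::malloc(kHeaderBytes + bytes));
    if (raw == nullptr) return nullptr;
    std::memcpy(raw, &bytes, sizeof bytes);  // store the size in the header
    return raw + kHeaderBytes;               // hand out the address after it
}

// Reads back the size stored in front of a pointer from tracked_alloc.
std::size_t tracked_size(const void* p) {
    assert(p != nullptr);
    std::size_t bytes = 0;
    std::memcpy(&bytes, static_cast<const unsigned char*>(p) - kHeaderBytes,
                sizeof bytes);
    return bytes;
}

// Steps back over the header and frees the original malloc block.
void tracked_free(void* p) {
    if (p != nullptr) std::free(static_cast<unsigned char*>(p) - kHeaderBytes);
}
```

The trade-off mentioned above is visible here: a write that underruns the buffer by a few bytes would corrupt the stored size, but lookup is a single subtraction.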

I'm sure proper C++ libraries are doing even more, but you already get the idea: new and delete are not the same as malloc and free.

3 comments

> In order for delete[] to work, C++ must track the allocation size somewhere.

That is super-interesting, I had never considered this, but you're absolutely right. I am now incredibly curious how the standard library implementations do this. I've heard normal malloc() sometimes colocates data in similar ways, I wonder if C++ then "doubles up" on that metadata. Or maybe the standard library has its own entirely custom allocator that doesn't use malloc() at all? I can't imagine that's true, because you'd want to be able to swap system allocators with e.g. LD_PRELOAD (especially for Valgrind and stuff). They could also just be tracking it "to the side" in some hash table or something, but that seems bad for performance.

new[] and delete[] both know the type of the object. Therefore both know whether a destructor needs to be called.

When a destructor doesn't need to be called - e.g., new int[N] - operator new[] is called upon to allocate N*sizeof(T) bytes. The code stores off no metadata. The result of operator new[] is the array address.

When a destructor does need to be called - e.g., new std::string[N] - operator new[] is typically called upon to allocate sizeof(size_t)+N*sizeof(T) bytes. The code stores off the item count in the size_t, adds sizeof(size_t) to the value returned by operator new[], uses that as the address for the array, and calls T() on each item. And delete[] performs the opposite: fishes out the size_t, calls ~T() on each item (in reverse order), subtracts sizeof(size_t) from the array address, and passes that to operator delete[] to free the buffer.
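One way to observe this is to give a class its own operator new[] and record the byte count it receives. The sketch below does that; RecordingAlloc, Trivial, NonTrivial, g_last_request and extra_bytes_requested are hypothetical names, and the exact amount of overhead is implementation-specific (on common 64-bit ABIs it tends to be sizeof(size_t), possibly rounded up for alignment).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Byte count most recently passed to the class-specific operator new[].
std::size_t g_last_request = 0;

// Hypothetical base class: records each array request, then uses malloc/free.
struct RecordingAlloc {
    static void* operator new[](std::size_t bytes) {
        g_last_request = bytes;
        if (void* p = std::malloc(bytes)) return p;
        throw std::bad_alloc();
    }
    static void operator delete[](void* p) noexcept { std::free(p); }
};

// Trivial destructor: delete[] has nothing to call, so no count is needed.
struct Trivial : RecordingAlloc {
    int value;
};

// User-provided destructor: delete[] must know how many elements to destroy.
struct NonTrivial : RecordingAlloc {
    int value;
    ~NonTrivial() { value = 0; }
};

// Returns how many bytes beyond n * sizeof(T) the new[] expression asked for.
template <typename T>
std::size_t extra_bytes_requested(std::size_t n) {
    T* items = new T[n];
    const std::size_t extra = g_last_request - n * sizeof(T);
    delete[] items;
    return extra;
}
```

On the mainstream compilers I'm aware of, extra_bytes_requested<Trivial>(10) comes back as zero while extra_bytes_requested<NonTrivial>(10) is nonzero; the operator new[] itself cannot tell the two cases apart, it only sees the byte count.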

(There are also some additional things to cater for: null checks, alignment, and so on. Just details.)

Note that operator new[] is not given any information about whether a destructor needs to run, or whether there is any metadata being stored off. It just gets called with a byte count. Exercise caution when using placement new[], because the expression may ask for more than N*sizeof(T) bytes, so a preallocated buffer of exactly N*sizeof(T) may not be large enough.
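A common way to sidestep the question is to skip array placement new entirely and construct each element with single-object placement new, which carries no array bookkeeping, so a buffer of N*sizeof(T) suitably aligned bytes is enough. A minimal sketch, with construct_each and destroy_each as hypothetical helper names:

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>

// Default-constructs n objects of type T, one at a time, in a caller-supplied
// buffer that must be aligned for T and at least n * sizeof(T) bytes long.
template <typename T>
T* construct_each(void* buffer, std::size_t n) {
    assert(buffer != nullptr);
    T* first = static_cast<T*>(buffer);
    std::size_t built = 0;
    try {
        for (; built < n; ++built) {
            ::new (static_cast<void*>(first + built)) T();
        }
    } catch (...) {
        // Undo the elements that were already constructed, newest first.
        while (built > 0) first[--built].~T();
        throw;
    }
    return first;
}

// Destroys n objects in reverse order; the buffer itself is not freed.
template <typename T>
void destroy_each(T* first, std::size_t n) {
    while (n > 0) first[--n].~T();
}
```

For example, a buffer declared as alignas(std::string) unsigned char buffer[3 * sizeof(std::string)] can be passed to construct_each<std::string>(buffer, 3) and later cleaned up with destroy_each(items, 3). The caller has to remember the count, which is exactly the bookkeeping delete[] would otherwise do.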

jemalloc and tcmalloc use size classes, so if you allocate 23 bytes the allocator reserves 32 bytes of space on your behalf. Both of them can find the size class of a pointer with simple manipulation of the pointer itself, not with some global hash table. E.g. in tcmalloc the pointer belongs to a "page" and every pointer on that page has the same size.
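A toy version of both ideas, to show the shape of the arithmetic. This is not jemalloc's or tcmalloc's actual code: real allocators use finer-grained size classes than plain powers of two, and the constants and names below (kMinClass, kMaxSmall, kPageShift, size_class_for, page_index) are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical parameters for the sketch.
constexpr std::size_t kMinClass = 8;            // smallest size class in bytes
constexpr std::size_t kMaxSmall = 256 * 1024;   // largest "small" request
constexpr unsigned kPageShift = 13;             // assume 8 KiB allocator pages

// Rounds a small request up to the next power-of-two size class.
std::size_t size_class_for(std::size_t bytes) {
    assert(bytes <= kMaxSmall);
    std::size_t size_class = kMinClass;
    while (size_class < bytes) size_class *= 2;
    return size_class;
}

// Maps an address to the index of the allocator page that contains it.
// If every object on a page shares one size class, a per-page table indexed
// by this value is enough to recover the size of any pointer.
std::uintptr_t page_index(std::uintptr_t address) {
    return address >> kPageShift;
}
```

With these assumptions, size_class_for(23) is 32, and two addresses less than 8 KiB apart within the same page share a page_index.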
That doesn’t help for C++ if you allocated an array of objects with destructors. It has to know that you allocated 23 objects, so that it can call 23 destructors, not 32, of which 9 would run on uninitialized memory.
I believe the question was more around how the program knows how much memory to deallocate. The compiler generates the destructor calls the same way the compiler generates everything else in the program.
Isn't it also possible for other logic to run in a destructor, such as freeing pointers to external resources? Doesn't this cause (at the very least) the possibility for more advanced logic to be run beyond freeing the object's own memory?
Yes, it usually is. See, e.g., smart pointers.
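A small sketch of a destructor doing more than giving back the object's own memory. The "external resource" here is simulated: Handle, acquire_handle, release_handle, HandleOwner, g_open_handles and array_releases_all are all hypothetical names standing in for something like a file descriptor or socket API.

```cpp
#include <cassert>
#include <cstddef>

// Simulated external resource: a count of handles currently open.
int g_open_handles = 0;

struct Handle {
    int id;
};

// Hypothetical acquire/release pair standing in for a real resource API.
Handle* acquire_handle(int id) {
    ++g_open_handles;
    return new Handle{id};
}

void release_handle(Handle* handle) {
    --g_open_handles;
    delete handle;
}

// Owns one handle; the destructor releases it, much like a smart pointer.
class HandleOwner {
public:
    HandleOwner() : handle_(acquire_handle(0)) {}
    ~HandleOwner() { release_handle(handle_); }

    // Non-copyable so two owners never release the same handle.
    HandleOwner(const HandleOwner&) = delete;
    HandleOwner& operator=(const HandleOwner&) = delete;

private:
    Handle* handle_;
};

// Builds and deletes an array of n owners. Returns true if n handles were
// opened by new[] and delete[] left the open count where it started.
bool array_releases_all(std::size_t n) {
    const int before = g_open_handles;
    HandleOwner* owners = new HandleOwner[n];
    const bool all_open = (g_open_handles == before + static_cast<int>(n));
    delete[] owners;  // runs ~HandleOwner() n times, releasing every handle
    return all_open && g_open_handles == before;
}
```

This is why delete[] needs the element count and not just the block size: skipping even one destructor here would leak a handle, not just memory.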
realloc is the same, as the old memory needs to be copied to the new memory.