by matheusmoreira 891 days ago
Is there a rationale for a memory allocator to support zero sized allocations? Is this really just about providing a "technically" valid pointer for the pointer/size pair structure? To me it seems any address is a potentially valid pointer to a zero-sized object. Do allocators really keep track of these null allocations? That would require keeping state for every single address in the worst case...

It's very strange. I wrote my own memory allocator and I can't figure out the right way to handle this. Eliminating the need for these "technically" valid pointers that can't actually be accessed because they're zero sized seems like the better solution.

> When did that happen?

More importantly, why did that happen? People have told me that I should care about the C standards committee because they take backwards compatibility very seriously. Then they come out with breaking changes like these.

> Is there a rationale for a memory allocator to support zero sized allocations?

Mainly, that it has supported that before and programs rely on it.

Programs written to the C99 standard can resize a dynamic vector down to empty with a resize(ptr, 0). The pointer coming from that will be the same as if malloc(0) had been called.

So now, that has been taken away; those programs can now make demons fly out of your nose.

Thank you, ISO C!

> Do allocators really keep track of these null allocations? That would require keeping state for every single address in the worst case...

Implementations of malloc(0) that don't return null are required to return a unique object. To do that, all they have to do is pretend that the size is some nonzero value like 1 byte. (The application must not assume that there is any byte there that can be accessed.)
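To make the "pretend the size is 1" idea concrete, here is a minimal sketch written as a wrapper around the system malloc rather than as a real allocator. The names alloc_min1 and demo_zero_sized are made up for illustration, and the demo assumes two tiny allocations succeed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: never passes 0 to malloc, so every successful
 * call yields a distinct, non-null pointer that can later be freed.
 * The caller must still treat a zero-sized result as having no
 * accessible bytes. */
void *alloc_min1(size_t size)
{
    return malloc(size == 0 ? 1 : size);
}

/* Demo: two zero-sized requests give two different live pointers.
 * Assumes both tiny allocations succeed. */
void demo_zero_sized(void)
{
    void *a = alloc_min1(0);
    void *b = alloc_min1(0);
    assert(a != NULL && b != NULL);
    assert(a != b);
    free(a);
    free(b);
}
```

No per-address bookkeeping is needed: each zero-sized object is just an ordinary one-byte block as far as the allocator is concerned.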

> Programs written to the C99 standard can resize a dynamic vector down to empty with a resize(ptr, 0).

C99 has no resize() function. Assuming you mean realloc(), C99 does not guarantee you can use realloc() in this manner.

See also:

https://news.ycombinator.com/item?id=38850575

https://stackoverflow.com/questions/16759849/using-realloc-x...

https://wiki.sei.cmu.edu/confluence/plugins/servlet/mobile?c...

https://developers.redhat.com/articles/2023/07/26/checking-u...

> C99 does not guarantee you can use realloc() in this manner

Yes it does. It requires support for reallocing down to zero, which results in an object that is like one that comes from malloc(0).

(What some people think is that realloc(x, 0) is equivalent to free(x). It isn't. Resizing down to zero isn't freeing. It might be, if malloc(0) doesn't allocate anything and just returns null. The reason some people think realloc(x, 0) is free(x) is that they read the realloc man page from the Linux man-pages project, which says such a thing.)

realloc(ptr, 0) could fail to free ptr if allocating the zero-sized replacement object fails. In that case, null could be returned, leaving the old object valid. This is ambiguous, because null could also be the happy-case return value when the old object was freed and the zero-sized allocation deliberately produced null. Under those conditions, the cases in which there is a memory leak are indistinguishable from the ones in which there isn't.

(I'd rather suffer a memory leak in the OOM condition, than have previously defined behavior gratuitously flip to undefined.)
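One way to sidestep the ambiguity in portable code is to never pass 0 to realloc at all. The sketch below is only an illustration: resize_buf is a hypothetical helper, and representing "empty" as a null pointer is an assumption of this example, not something the standard requires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: resizes the buffer at *pp to new_size bytes.
 * Returns 0 on success and -1 on failure. On failure *pp is left
 * untouched and is still valid, so there is no ambiguity about
 * whether the old block must be freed. */
int resize_buf(void **pp, size_t new_size)
{
    if (new_size == 0) {
        /* Shrinking to empty: free explicitly instead of calling
         * realloc(ptr, 0). Empty is represented as NULL here. */
        free(*pp);
        *pp = NULL;
        return 0;
    }
    void *q = realloc(*pp, new_size);
    if (q == NULL)
        return -1;      /* genuine failure; old block still owned by caller */
    *pp = q;
    return 0;
}

/* Demo: grow from empty, then shrink back to empty.
 * Assumes the 16-byte allocation succeeds. */
void demo_resize(void)
{
    void *p = NULL;
    assert(resize_buf(&p, 16) == 0 && p != NULL);
    assert(resize_buf(&p, 0) == 0 && p == NULL);
}
```

With this shape, a null *pp after a successful call always means "empty and nothing to free", and a -1 return always means "old pointer still valid".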

I literally quoted someone from WG14.

HN really needs the ability to block trolls.

You quoted someone who stated they are not happy with C23, and deflected personal blame for that issue.

If I'm only trolling, then why, having learned about this, am I having to go into code and make defensive fixes?

They aren't defensive fixes. They're just fixes. Any C99+ code that relied on your assumed realloc(p, 0) behavior was always incorrect.
Indeed, you couldn't reliably free() the old pointer if the realloc(ptr, 0) failed.

But xrealloc(ptr, 0) (or equivalent) would still be perfectly consistent, assuming you trust your implementation to support non-null 0-size allocations in the first place. It's very common to just "leak it all and abort" on a critical error like memory exhaustion. There's a reason most non-C languages expose an infallible allocation API as the default option.

I do think that UB is an overly heavy hammer for realloc(ptr, 0), since the xrealloc(ptr, 0) use case works just as well regardless of how unspecified the values of the old pointer or errno are on failure.
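For readers who haven't seen the pattern, a rough sketch of an "infallible" xrealloc follows. This is not any particular library's implementation. It also makes one assumption the comment above does not: a size of 0 is bumped to 1, so the wrapper never depends on what realloc(ptr, 0) does.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical xrealloc: treats every null result as fatal, so callers
 * never have to reason about the state of the old pointer on failure.
 * Assumption of this sketch: size 0 is rounded up to 1, which keeps the
 * result non-null without relying on realloc(ptr, 0) semantics. */
void *xrealloc(void *ptr, size_t size)
{
    void *q = realloc(ptr, size == 0 ? 1 : size);
    if (q == NULL) {
        fputs("xrealloc: out of memory\n", stderr);
        abort();        /* "leak it all and abort" */
    }
    return q;
}

/* Demo: shrinking to zero still yields a pointer that can be freed. */
void demo_xrealloc(void)
{
    void *p = xrealloc(NULL, 8);
    p = xrealloc(p, 0);
    assert(p != NULL);
    free(p);
}
```

Passing size through unchanged, as the comment describes, would look the same minus the rounding, at the cost of trusting the implementation to return non-null for a successful 0-size request.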

Yes. If realloc(ptr, 0) returns a null pointer, you don't know whether that's due to a failure (in which case ptr is still valid) or whether it's the happy case (ptr was freed, and the zero-sized request for replacing it produced a null). Thus you don't know whether ptr is still a valid pointer. If it's valid and you treat it as invalid (hands off), that's a leak. If it's invalid and you treat it as valid (free it), that's a double free.
I'm not talking about implementations that produce a 'successful' null pointer. I'd consider that a quality-of-implementation issue, in that implementations are responsible for returning non-null on 0-size success in the same way they're responsible for not just stubbing out every single malloc() call, so just assuming that a null output indicates failure is appropriate. (Implementations transitioned ages ago toward returning non-null for 0-size requests for good reason!)

Instead, the problem is about a realloc(ptr, size) that returns null to indicate failure. If size > 0, then the data behind ptr remains unmodified and can be later freed. But if size == 0 (and the 0-size allocation fails), then the data behind ptr is unconditionally freed in many implementations.

This makes it unsafe to access the data behind ptr after a realloc() failure, unless you've checked that size > 0. But I argue that by making the whole thing UB instead of leaving it sufficiently unspecified, the xrealloc(ptr, size) use case that doesn't care about the leak on failure is made more complicated unnecessarily.

In my well-informed, expert opinion backed by decades of experience, it would have been best to add this wording:

"When size is zero, the realloc function shall free the original object, regardless of whether allocating the new object is successful, and thus regardless of the value returned."

With a footnote explaining the ambiguity that exists otherwise, and that existed historically.

A small change in some implementations here would be better than taking a wrecking ball to defined behavior.

If storing the metadata in the heap, 0 bytes often doesn't even end up a special case. You need to have a case for allocations of some arbitrary number of bytes, and 0 is an arbitrary number of bytes.

Another option is to treat them as being of size 1.

(In theory you could do endless allocations of size 0, and eventually you'd run out of space, even though you've allocated 0 bytes in total. But you end up in exactly that situation, whatever the allocation size, if you don't take bookkeeping overhead into account!)
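A toy bump allocator shows both points: with the metadata stored in the heap, size 0 takes the same code path as any other size, and zero-sized requests still exhaust the arena because each one costs a header. Everything here (header_t, toy_alloc, ARENA_UNITS) is invented for the sketch, and there is deliberately no free.

```c
#include <assert.h>
#include <stddef.h>

enum { ARENA_UNITS = 64 };

/* Per-block bookkeeping stored in the heap itself, right before the
 * payload. The max_align_t member pads the header so that payloads
 * stay suitably aligned. */
typedef union {
    size_t size;        /* payload size requested by the caller */
    max_align_t pad;
} header_t;

static header_t arena[ARENA_UNITS];
static size_t units_used;

/* Bump allocation: one header unit plus enough units for the payload.
 * There is no special case for size == 0; it simply needs 0 payload
 * units, and the header alone makes the returned pointer unique. */
void *toy_alloc(size_t size)
{
    if (size > sizeof arena)    /* also keeps the rounding below from overflowing */
        return NULL;
    size_t payload_units = (size + sizeof(header_t) - 1) / sizeof(header_t);
    size_t need = 1 + payload_units;
    if (need > ARENA_UNITS - units_used)
        return NULL;
    header_t *h = &arena[units_used];
    h->size = size;
    units_used += need;
    return h + 1;               /* payload starts just past the header */
}

/* Demo: starting from an empty arena, zero-sized allocations succeed
 * exactly once per header unit and then the arena is full. */
void demo_zero_exhaustion(void)
{
    size_t n = 0;
    while (toy_alloc(0) != NULL)
        n++;
    assert(n == ARENA_UNITS);
    assert(toy_alloc(0) == NULL);
}
```

Zero bytes of payload were handed out in total, yet the arena is exhausted purely by bookkeeping overhead.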