by __s 1172 days ago
tl;dr `realloc(p, 0)` is slated to become undefined behavior in C23. Until now it has been implementation-defined, with the recommendation being that `realloc(p, 0)` is equivalent to `free(p)`

Seems a bit tone deaf to create new undefined behavior in memory handling, especially when a sane default behavior seems to be de facto

I've used that free-on-0 behavior myself. Unfortunately, code that relies on it usually passes the size as a length variable that happens to be 0, so it's hard to grep for. Ideally musl and glibc will both stick to free as the behavior for the now-undefined case, and gcc/clang won't turn it into something to point their optimizations at

Lest we have to stop using realloc outside of a safe_realloc wrapper

  #include <stdlib.h>
  /* Make size 0 mean free(p), so realloc itself never sees a zero size. */
  static void *safe_realloc(void *p, size_t newlen)
  {
    if (newlen == 0) { free(p); return NULL; }
    return realloc(p, newlen);
  }

What got this whole thing weird is that C doesn't like zero sized objects, but implementations were allowed to return a unique pointer for a zero sized allocation. That raises a portability problem: on implementations that hand back such a pointer instead of freeing, the reserved chunk itself still has to be freed. In theory that reservation could be more efficient when code frequently reallocates between 0 and some small size. There was also ambiguity, because NULL is how realloc reports allocation failure: anyone doing a NULL check on realloc's return value also had to check that the requested size was non-zero
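The NULL-check ambiguity can be sketched from the caller's side. `resize_buf` is a hypothetical helper made up for illustration, not something from the standard or any libc. It handles size 0 before realloc is reached, so a NULL from realloc can only mean failure:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: resize *buf to newlen bytes.
 * Returns false only on allocation failure, in which case *buf is untouched.
 * newlen == 0 is handled up front, so realloc never sees a zero size. */
static bool resize_buf(void **buf, size_t newlen)
{
    if (newlen == 0) {
        free(*buf);          /* free(NULL) is a no-op */
        *buf = NULL;
        return true;         /* NULL here means "empty", not "failed" */
    }
    void *q = realloc(*buf, newlen);
    if (q == NULL)
        return false;        /* old block is still valid and still owned by the caller */
    *buf = q;
    return true;
}
```

The point of returning a separate bool is that the caller no longer has to pair every NULL check with a size check.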
3 comments

> Seems a bit tone deaf to create new undefined behavior in memory handling,

It's only tone deaf to people who understand "undefined behavior" as an epithet or as synonymous with giving a license to compilers to screw you over. The term doesn't have either of those meanings to those on the C committee. In fact, one of the explicit rationales for the proposal is that, "Classifying a call to realloc with a size of 0 as undefined behavior would allow POSIX to define the otherwise undefined behavior however they please." https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2464.pdf

> especially when a sane default behavior seems to be de facto

The above proposal, N2464, gives the behavior for AIX, zOS, BSD (unspecified), MSVC (crt unspecified), and glibc. They each have different behaviors.

Why they chose to finally make it undefined (it had been marked as obsolescent for a long time) rather than keep it implementation-defined, I don't know. Perhaps because 1) it simplifies the standard, and 2) making it undefined suggests compilers should start warning about it. In all this time no consensus has emerged among implementations about the best behavior, nor are programmers aware that the behavior actually varies widely.

EDIT: The draft SUSv5/POSIX-202x standard has indeed directly addressed this issue. See, e.g., https://www.austingroupbugs.net/view.php?id=374 The most recent draft included the following addition to RETURN VALUE:

  OB     If size is 0,
  OB CX  or either nelem or elsize is 0,
  OB     either:

  OB     * A null pointer shall be returned
  OB CX    and, if ptr is not a null pointer, errno shall be set to [EINVAL].

  OB     * A pointer to the allocated space shall be returned, and the memory object pointed to by ptr
           shall be freed. The application shall ensure that the pointer is not used to access an object.

CX marks points of divergence with C17. The first CX is because of the addition of reallocarray, absent from C17. The second is because POSIX will mandate the setting of EINVAL if NULL is returned.

> It's only tone deaf to people who understand "undefined behavior" as an epithet or as synonymous with giving a license to compilers to screw you over. The term doesn't have either of those meanings to those on the C committee.

It's unfortunate but not surprising that the C committee isn't aware of the problems with undefined behavior.

In fact, after I started reading the WG14 meeting minutes, I completely lost faith that any of the serious problems with the standard will ever get fixed.

This is not a problem with the committee and is not a problem with compiler writers. The committee is only marking certain behaviors as UB. Compilers can do what they think is more sensible in these situations. And compiler writers are not forcing you to accept these extreme optimizations. You always have the option of disabling optimizations and accepting that your code has bugs (UB). You just need to test the code you write under different compiler settings, similarly to how you test code in different environments.

"Just disable optimizations" is not a solution unless the compiler offers fine-grained enough control that the solution is something like `-ffree-zero-sized-realloc`

> It's only tone deaf to people who understand "undefined behavior" as an epithet or as synonymous with giving a license to compilers to screw you over.

Unfortunately, this is the correct understanding of UB.

Having realloc with size 0 act as free is useful in particular because it makes a function pointer to realloc a complete memory allocator: call it with a NULL pointer to get malloc, and call it with size 0 to get free.
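A minimal sketch of that single-pointer allocator idea, assuming the free-on-0 behavior comes from a wrapper rather than from realloc itself. The names `alloc_fn`, `freeing_realloc`, `struct buf`, and `buf_resize` are all made up for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator type: one realloc-shaped function pointer.
 *   fn(NULL, n)  acts as malloc(n)
 *   fn(p, n)     acts as realloc(p, n)
 *   fn(p, 0)     acts as free(p) and returns NULL */
typedef void *(*alloc_fn)(void *p, size_t n);

/* Wrapper that pins down the size-0 case instead of relying on realloc. */
static void *freeing_realloc(void *p, size_t n)
{
    if (n == 0) {
        free(p);
        return NULL;
    }
    return realloc(p, n);
}

/* Example consumer: a byte buffer that only knows about the one pointer. */
struct buf {
    alloc_fn alloc;
    unsigned char *data;
    size_t len;
};

/* Returns 0 on success, -1 on allocation failure (buffer left unchanged). */
static int buf_resize(struct buf *b, size_t newlen)
{
    void *q = b->alloc(b->data, newlen);
    if (q == NULL && newlen != 0)
        return -1;           /* NULL with a non-zero size means failure */
    b->data = q;
    b->len = newlen;
    return 0;
}
```

Creating, growing, and releasing the buffer all go through `buf_resize`, so a library taking an `alloc_fn` needs no separate malloc or free hooks.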
> What got this whole thing weird is that C doesn't like zero sized objects, but implementations were allowed to return a unique pointer for a zero sized allocation.

Some of the Windows APIs work like this, so how much of this is pressure from MS?

Same discussion from 7 months ago.

https://news.ycombinator.com/item?id=32352965

https://thephd.dev/c23-is-coming-here-is-what-is-on-the-menu...

https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2897.htm

Pattern matching RAM for variables/objects while they exist, even if zeroed or prefilled with a value, doesn't give perfect security. Random values would make it harder to work out the variable/object.