Hacker News new | ask | show | jobs
by jstimpfle 1292 days ago
This is hilarious. SIZE_MAX is at least as large as the largest string that you can put in your address space / memory anyway. Which is what the strlen() API already assumes.

That, plus you'd be a fool to store a huge string in this way anywhere (in or out of memory) in any case.

2 comments

> SIZE_MAX is at least as large as the largest string that you can put in your address space / memory anyway.

Not necessarily. A 64-bit system could give processes an address space that’s significantly larger than half the full 64-bit address space and have an allocator that allows you to allocate a block of more than SIZE_MAX bytes (malloc takes a size_t, but you can use calloc)

This doesn't make sense to me. You can't "allocate" more than SIZE_MAX bytes by definition. If you take "allocate" to mean "make it available in the process's address space", that is.
Are you sure?

The calloc() [1] function mentioned above takes two values of type size_t, and allocates their product bytes.

I'm on mobile without (!) the C99 draft spec but at least the man page gives no such restriction.

[1] https://linux.die.net/man/3/calloc

How would it be possible to allocate more address space than is addressable?

calloc returns NULL when can't satisfy the request. The idea of taking two arguments is not to allow the user to specify a larger requested size, but to protect against overflows as it can happen with e.g. malloc() where the user has to compute the size of arrays by multiplying NUM_ELEMS * SIZE_PER_ELEM. And the user will normally do so less carefully than a library function.

I read something about this recently, somewhere, maybe HN. Specifically, in calloc(), what is done and what should really be done if the multiplication overflows. As will happen, for example, if you try to calloc() two elements of size SIZE_MAX, when SIZE_MAX is the maximum representable unsigned integer value on the machine. So, I don't think calloc() is available or intended as a way to circumvent malloc()'s size restriction.
I stand corrected. Initially, I thought that, even if it calloc can’t, an OS could provide a different way to obtain a pointer to a memory region that’s larger than SIZE_MAX.

However, the standard says (https://en.cppreference.com/w/c/types/size_t):

“size_t can store the maximum size of a theoretically possible object of any type (including array).”

and (https://en.cppreference.com/w/c/language/pointer):

“Pointer is a type of an object that refers to a function or an object of another type, possibly adding qualifiers. Pointer may also refer to nothing, which is indicated by the special null pointer value.”

⇒ pointers must either be null or point to an object, and objects aren’t larger than SIZE_MAX, so I think having a pointer pointing to a block larger than SIZE_MAX violates the standard.

> pointer pointing to a block larger than SIZE_MAX violates the standard.

it's simply not possible, by definition.

size_t is unsigned, right? ssize_t is the signed version?

On a quick test on my 64-bit system, a C program doing `printf("%zu\n", SIZE_MAX);` outputs 18446744073709551615, which looks like (2^64)-1 to me.

Or is there a thing in the standard that says this isn't always the case?

No, ssize_t is not the signed version. As best as I can tell, the only things POSIX says about ssize_t is that[1] it is an integer type that can hold integer values in [-1, SSIZE_MAX], where[2] SSIZE_MAX ≥ _POSIX_SSIZE_MAX = 32767, not that it should have any particular relation to size_t. In the standard, it is used for byte counts in I/O, like the return value of read() (traditionally int), for the return value of strfmon() and strfmon_l() (OK I guess, though the C standard stuck with int for *printf()), and for the argument to swab() (wat).

Note that neither is ptrdiff_t guaranteed to be that signed version, or to hold any possible value in the domain of size_t or (strictly speaking) any possible object size. Both GCC and Clang assume the latter, though, and can miscompile[3] code that relies on (e.g.) malloc() succeeding for sizes > 2^31 on a 32-bit system.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sy...

[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/li...

[3] https://trust-in-soft.com/blog/2016/05/20/objects-larger-tha...

ssize_t is a weird one, the only negative value it is guaranteed to store is -1.

> The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}].

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sy...

size_t need only be large enough to cover the (virtual) address space. It's up to hardware and OS to decide how much addressable space you get. I believe current systems can use only the low 48 bits of 64-bit pointers. However that number is likely to be increased in the future and OSes would be unwise to define size_t as something smaller than 64 bits.
Isn't size_t defined as being able to fit the largest possible data allocation?
Indeed, you just need to forget to put a terminator to get a nice memory dump.