| HN Mirror



	by mh7 1292 days ago
	strlen() returns a size_t so you're already constrained to a maximum length of SIZE_MAX.


jstimpfle 1292 days ago

This is hilarious. SIZE_MAX is at least as large as the largest string that you can put in your address space / memory anyway. Which is what the strlen() API already assumes.

That, plus you'd be a fool to store a huge string in this way anywhere (in or out of memory) in any case.


Someone 1292 days ago

> SIZE_MAX is at least as large as the largest string that you can put in your address space / memory anyway.

Not necessarily. A 64-bit system could give processes an address space that’s significantly larger than half the full 64-bit address space and have an allocator that allows you to allocate a block of more than SIZE_MAX bytes (malloc takes a size_t, but you can use calloc)


jstimpfle 1292 days ago

This doesn't make sense to me. You can't "allocate" more than SIZE_MAX bytes by definition. If you take "allocate" to mean "make it available in the process's address space", that is.


unwind 1291 days ago

Are you sure?

The calloc() [1] function mentioned above takes two values of type size_t, and allocates their product bytes.

I'm on mobile without (!) the C99 draft spec but at least the man page gives no such restriction.

[1] https://linux.die.net/man/3/calloc


jstimpfle 1291 days ago

How would it be possible to allocate more address space than is addressable?

calloc returns NULL when it can't satisfy the request. The idea of taking two arguments is not to let the user request a larger size, but to protect against the overflow that can happen with e.g. malloc(), where the user has to compute the size of an array by multiplying NUM_ELEMS * SIZE_PER_ELEM. And the user will normally do so less carefully than a library function.
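
The overflow check described above can be sketched in C as follows. This is a minimal illustration, not how any particular libc implements calloc(); `alloc_array` is a hypothetical name, and unlike calloc() it does not zero the memory.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate room for nmemb elements of size bytes
   each, returning NULL instead of wrapping around when nmemb * size
   does not fit in size_t. */
void *alloc_array(size_t nmemb, size_t size)
{
    /* nmemb * size overflows exactly when nmemb > SIZE_MAX / size. */
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;
    return malloc(nmemb * size);
}
```

A caller that writes `malloc(nmemb * size)` directly gets a silently truncated size when the product wraps, which is the case the division check rules out.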


mek6800d2 1291 days ago

I read something about this recently, somewhere, maybe HN: specifically, what calloc() does and what it really should do if the multiplication overflows, as will happen, for example, if you try to calloc() two elements of size SIZE_MAX, when SIZE_MAX is the maximum representable unsigned integer value on the machine. So I don't think calloc() is available or intended as a way to circumvent malloc()'s size restriction.


Someone 1291 days ago

I stand corrected. Initially, I thought that, even if calloc can't, an OS could provide a different way to obtain a pointer to a memory region that's larger than SIZE_MAX.

However, cppreference, summarizing the standard, says (https://en.cppreference.com/w/c/types/size_t):

“size_t can store the maximum size of a theoretically possible object of any type (including array).”

and (https://en.cppreference.com/w/c/language/pointer):

“Pointer is a type of an object that refers to a function or an object of another type, possibly adding qualifiers. Pointer may also refer to nothing, which is indicated by the special null pointer value.”

⇒ pointers must either be null or point to an object, and objects aren’t larger than SIZE_MAX, so I think having a pointer pointing to a block larger than SIZE_MAX violates the standard.


Karellen 1291 days ago

size_t is unsigned, right? ssize_t is the signed version?

On a quick test on my 64-bit system, a C program doing `printf("%zu\n", SIZE_MAX);` outputs 18446744073709551615, which looks like (2^64)-1 to me.

Or is there a thing in the standard that says this isn't always the case?
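
A small sketch of that test, assuming a hosted C99 or later implementation. The C standard defines size_t as an unsigned integer type, so converting -1 to size_t always yields SIZE_MAX; the actual value is implementation-defined. Both function names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Returns 1 if converting -1 to size_t yields SIZE_MAX, which is
   what conversion to an unsigned type does. */
int size_t_wraps_to_max(void)
{
    return (size_t)-1 == SIZE_MAX;
}

/* Hypothetical helper: writes SIZE_MAX in decimal into buf and
   returns snprintf's result (the number of characters needed). */
int format_size_max(char *buf, size_t buflen)
{
    return snprintf(buf, buflen, "%zu", (size_t)SIZE_MAX);
}
```

On a system with a 64-bit size_t, the formatted string should match the value quoted above; on other systems it will differ.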


mananaysiempre 1289 days ago

No, ssize_t is not the signed version. As best I can tell, the only thing POSIX says about ssize_t is that[1] it is an integer type that can hold integer values in [-1, SSIZE_MAX], where[2] SSIZE_MAX ≥ _POSIX_SSIZE_MAX = 32767, not that it should have any particular relation to size_t. In the standard, it is used for byte counts in I/O, like the return value of read() (traditionally int), for the return value of strfmon() and strfmon_l() (OK I guess, though the C standard stuck with int for *printf()), and for the argument to swab() (wat).

Note that neither is ptrdiff_t guaranteed to be that signed version, or to hold any possible value in the domain of size_t or (strictly speaking) any possible object size. Both GCC and Clang assume the latter, though, and can miscompile[3] code that relies on (e.g.) malloc() succeeding for sizes > 2^31 on a 32-bit system.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sy...

[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/li...

[3] https://trust-in-soft.com/blog/2016/05/20/objects-larger-tha...
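
The ptrdiff_t point can be made concrete with a minimal sketch. `size_fits_ptrdiff` is a hypothetical helper, not something from the standard library or from either compiler; it only checks whether an object size is small enough for pointer differences within that object to be representable.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if an object of n bytes is small
   enough that the difference between any two pointers into it fits
   in ptrdiff_t, and 0 otherwise. */
int size_fits_ptrdiff(size_t n)
{
    /* Compare in uintmax_t so neither operand is truncated. */
    return (uintmax_t)n <= (uintmax_t)PTRDIFF_MAX;
}
```

On platforms where ptrdiff_t and size_t have the same width, `size_fits_ptrdiff(SIZE_MAX)` returns 0, which is the gap the comment describes.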


arcticbull 1291 days ago

ssize_t is a weird one, the only negative value it is guaranteed to store is -1.

> The type ssize_t shall be capable of storing values at least in the range [-1, {SSIZE_MAX}].

[1] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sy...


jstimpfle 1291 days ago

size_t need only be large enough to cover the (virtual) address space. It's up to hardware and OS to decide how much addressable space you get. I believe current systems can use only the low 48 bits of 64-bit pointers. However that number is likely to be increased in the future and OSes would be unwise to define size_t as something smaller than 64 bits.


ahepp 1292 days ago

Isn't size_t defined as being able to fit the largest possible data allocation?


pjmlp 1292 days ago

Indeed, you just need to forget to put a terminator to get a nice memory dump.
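
One common way to limit that failure mode is to bound the scan by the buffer size. The sketch below uses the standard memchr(); `bounded_len` is a hypothetical name, and it assumes the caller knows how many bytes of the buffer are valid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the string in s, but never looking
   at more than max bytes. Returns max if no terminator is found. */
size_t bounded_len(const char *s, size_t max)
{
    const char *end = memchr(s, '\0', max);
    return end ? (size_t)(end - s) : max;
}
```

A result equal to `max` tells the caller the buffer was not terminated within the bound, so it can be rejected instead of read past.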


jcelerier 1292 days ago

If you use a different data structure you would maybe use a different API for accessing it too
