by simiones 22 days ago
> No, if we are using the definition of an array that is like int c[] = ..., that is always going to be on the stack. Heap continuous memory =/= array. You can use the [] operator to access it like an array, but fundamentally, as far as structures in C language are concerned, those 2 are different, because they get treated by compiler differently.

Well, not necessarily. For one thing, if we have a function foo(int c[]), it's debatable if c is an array variable or a pointer variable. However, what's not debatable is that you can allocate a struct on the heap, and that struct can have an array member - e.g. `struct foo { int a[10]; }; [...] struct foo *x = malloc(sizeof(struct foo));` would allocate an array on the heap as part of the struct.
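A minimal sketch of the struct case, assuming nothing beyond the `struct foo` quoted above. The helper names `make_foo` and `foo_len` are made up for illustration. The point is only that `x->a` is a real array object with a known size, even though its storage came from malloc():

```c
#include <stddef.h>
#include <stdlib.h>

struct foo { int a[10]; };

/* Hypothetical helper: allocates a struct foo on the heap and fills its
   array member. The caller is responsible for free()ing the result. */
struct foo *make_foo(void)
{
    struct foo *x = malloc(sizeof(struct foo));
    if (x == NULL)
        return NULL;
    /* sizeof x->a is the size of the whole array, not of a pointer */
    for (size_t i = 0; i < sizeof x->a / sizeof x->a[0]; i++)
        x->a[i] = (int)i;
    return x;
}

/* Hypothetical helper: element count derived from the array type itself. */
size_t foo_len(const struct foo *x)
{
    return sizeof x->a / sizeof x->a[0];
}
```

Calling `foo_len(make_foo())` should give 10 on any conforming implementation, because the array type travels with the struct definition.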

> That would only be true if each element in the array was a char.

That's why I said that it depends on what exactly you mean by the size of the array. It's also true that in today's world at least, malloc() will often allocate more memory than you actually ask for, to optimize against fragmentation - and then the internally stored size is the size of the actual allocation, not the logical size that you requested - which may not even fit into a whole number of array elements. So, I was being a little overly simplistic (lying) for dramatic effect.
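To make the "size in bytes vs. size in elements" distinction concrete, here is a small sketch. The function names are hypothetical, and the block size passed in is just an assumed number standing in for whatever an allocator might really hand back:

```c
#include <stddef.h>

/* Bytes you would request from malloc() for n ints. */
size_t bytes_for_ints(size_t n)
{
    return n * sizeof(int);
}

/* How many whole elements of elem_size fit in a block of block_bytes. */
size_t whole_elements(size_t block_bytes, size_t elem_size)
{
    return block_bytes / elem_size;
}

/* Bytes left over that do not form a complete element. */
size_t leftover_bytes(size_t block_bytes, size_t elem_size)
{
    return block_bytes % elem_size;
}
```

For example, if an allocator rounded a request up to 24 bytes and the element size were 5, that block would hold 4 whole elements with 4 bytes left over, so the stored allocation size would not map cleanly to an element count.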

> For example, a really good practice in C coding that basically solves any double free is a mempool that allocates all the memory up front.

While this is a very valid technique for certain purposes, especially when dynamic allocation is needed in very high performance code, it's very much not a valid solution for memory safety - quite the contrary, it's a terrible practice for that. In particular, this is almost exactly the issue that caused the infamous HeartBleed vulnerability in OpenSSL to stay hidden for so long: the use of a memory pool for the buffers used to store TLS packets was hiding the buffer overflow from UBSan and valgrind and similar tools, since the reads were perfectly valid from a language perspective (they were never reading from free()d/unallocated memory, only from memory that had been released to the memory pool).
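A toy sketch of the effect, to be clear not OpenSSL's actual code: a hypothetical one-slot pool (`pool_acquire` / `pool_release` are invented names) whose buffer is reused without being cleared. The over-read in `demo_leak` stays inside the pool's buffer, so as far as the language and memory checkers are concerned every access is valid:

```c
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 64

/* Hypothetical one-slot pool: the buffer is never handed back to free(). */
static unsigned char pool_buf[BUF_SIZE];
static int pool_in_use = 0;

unsigned char *pool_acquire(void)
{
    if (pool_in_use)
        return NULL;
    pool_in_use = 1;
    return pool_buf;   /* previous contents are still in place */
}

void pool_release(unsigned char *p)
{
    (void)p;
    pool_in_use = 0;   /* no zeroing and no free(), so tools see nothing */
}

/* Returns 1 if bytes from an earlier request leak into a later reply. */
int demo_leak(void)
{
    unsigned char out[BUF_SIZE];

    /* First request stores something sensitive, then releases the buffer. */
    unsigned char *b = pool_acquire();
    memcpy(b, "PRIVATE-KEY-1234", 16);
    pool_release(b);

    /* Second request has a 4-byte payload but claims a length of 16. */
    b = pool_acquire();
    memcpy(b, "ping", 4);
    size_t claimed_len = 16;        /* attacker-controlled, never checked */
    memcpy(out, b, claimed_len);    /* logical over-read, but inside pool_buf */
    pool_release(b);

    /* Bytes 4..15 of the reply are leftovers from the first request. */
    return memcmp(out + 4, "ATE-KEY-1234", 12) == 0;
}
```

If each request instead used its own exactly-sized malloc() block, the same over-read would run past the end of the allocation, which is the kind of access valgrind-style tools are built to flag.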

1 comments

It's all about what the compiler sees.

A struct is a defined type, which means its layout (and therefore its total size) has to be known, and an array member declared with a size is necessarily part of that struct type. So any time that struct is used, the compiler needs to see its definition, and thus can safely infer the size. That's pretty much the whole reason structs are a thing - they are the very basic type that lets you pass a data format around during compilation.

Arrays are not defined as types in C; they are really at most just a syntactic convenience. So if another function takes an array as a parameter, and it gets compiled as part of a separate file, there is no way for the compiler to automatically infer the size of what would get passed into it.
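A short sketch of what the compiler sees for an array parameter. `param_size` and `sum_ints` are hypothetical names. The declaration uses `int c[]` and the definition uses `int *c` on purpose: the two are the same type to the compiler, which is why the length has to be passed separately:

```c
#include <stddef.h>

/* Declared with array syntax ... */
size_t param_size(int c[]);

/* ... and defined with pointer syntax. These are compatible declarations,
   because an array parameter is adjusted to a pointer. */
size_t param_size(int *c)
{
    return sizeof c;   /* size of a pointer, not of the caller's array */
}

/* The callee cannot recover the length, so the caller passes it in. */
long sum_ints(const int *c, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += c[i];
    return total;
}
```

In the caller, with `int arr[10]` in scope, `sizeof arr / sizeof arr[0]` still gives 10, so a typical call looks like `sum_ints(arr, sizeof arr / sizeof arr[0])`.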

Char allocation usually involves +1 byte for null-terminated strings; the terminator is used as the signal for where the string ends. So strlen(char *) is accurate.

> quite the contrary, it's a terrible practice for that. In particular, this is almost exactly the issue that caused the infamous HeartBleed vulnerability in OpenSSL to stay hidden for so long: the use of a memory pool for the buffers used to store TLS packets

The Heartbleed vulnerability was not due to the mempool. It was due to a combination of a lack of bounds checking and not zeroing out the memory containing secret keys when it's deallocated. Even if it didn't use a mempool, leaks would still be possible.

Even for char*, it's very possible that malloc() will allocate more memory than strictly required. But you're right, `char x[] = "abc"` will require a minimum of 4 bytes wherever x gets allocated (stack or global segment).
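For reference, a tiny sketch of the 4-byte claim (function names are hypothetical). The array holds the three characters plus the terminating '\0', while strlen() counts only up to the terminator:

```c
#include <stddef.h>
#include <string.h>

/* Storage occupied by the array, including the terminating '\0'. */
size_t abc_storage(void)
{
    char x[] = "abc";
    return sizeof x;
}

/* Length as strlen() reports it, which excludes the terminator. */
size_t abc_length(void)
{
    char x[] = "abc";
    return strlen(x);
}
```

Here `abc_storage()` is 4 and `abc_length()` is 3, which is the "+1" discussed above.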

> The Heartbleed vulnerability was not due to the mempool. It was due to a combination of a lack of bounds checking and not zeroing out the memory containing secret keys when it's deallocated. Even if it didn't use a mempool, leaks would still be possible.

I didn't say that the bug was caused by the mempool, I said that the bug was very hard to find by regular tools such as valgrind and UBSan because it used mempools instead of regular allocations - so that all of the logical out of bounds accesses were not actually UB nor were they accessing unallocated memory, which those tools could have caught.