by shawabawa3 1168 days ago
My guess is it won't always have to be zeroed

e.g. if your code is doing

    ptr = malloc(n);
    memcpy(ptr, mydata, n);
You can presumably optimise out the zeroing of the memory

As far as I know, the Linux kernel never inspects the userspace thread to adjust behavior based on what the thread is going to do next. This would be a very brittle sort of optimization.

More importantly, it's not safe. Another thread in the same process can see ptr between the malloc and the memcpy!

Edit: also, of course, malloc and memcpy are C runtime functions, not syscalls, so checking what happens after malloc() would require the kernel to have much more sophisticated analysis than just looking a few instructions ahead of the calling thread's %eip/%rip. While handling malloc()'s mmap() or brk() allocation, the kernel would need to be able to look one or two call frames up the call stack, past the metadata accounting that malloc is doing to keep track of the newly acquired memory, perhaps look at a few conditional branches, trace through the GOT and PLT entries to see where the memcpy call is actually going, and do so in a way that is robust to changes in the C runtime implementation. (Of course, in practice, most C compilers will inline a memcpy implementation, so in the common case, it wouldn't have to chase the GOT and PLT entries, but even then, it's way too complicated for the kernel to figure out if anything non-trivial is happening between mmap()/brk() and the memory being overwritten.)

Edit 2: To be robust in the completely general case, even if it were trivial to identify the inlined memcpy implementation and "something non-trivial happens" were clearly defined, determining whether "something non-trivial happens" between mmap()/brk() and memcpy() would involve solving the halting problem. (Impossible in the general case.)

I'm NOT an expert here, but offhand:

  malloc() == 'reservation' (but not paged in!) memory
  // If touched / updated THEN the memory's paged in
A copy _might_ not even be a real copy if the kernel is smart enough / able to set up a hardware trap that forces a copy on the first write to that area (copy-on-write). Until that write, the two distinct logical memory regions would share the same physical memory; after it, they would be backed by separate pages and could differ.
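For what it's worth, the one place I'm sure Linux does copy-on-write like this is fork(), not a userspace memcpy. Here is a minimal sketch of the visible effect, assuming a POSIX system; `parent_byte_after_child_write` is a made-up helper name. It only shows that the parent's data is untouched after the child writes, not the page sharing underneath.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fill a buffer with 'A', fork, let the child overwrite its view with 'B',
 * and return the first byte the parent sees afterwards (-1 on error). */
int parent_byte_after_child_write(size_t len)
{
    assert(len > 0);

    unsigned char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'A', len);

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: this write lands on the child's private copy of the pages. */
        memset(buf, 'B', len);
        _exit(buf[0] == 'B' ? 0 : 1);
    }

    /* Parent: wait for the child, then look at our own view of the buffer. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        free(buf);
        return -1;
    }

    int seen = buf[0];
    free(buf);
    return seen;
}
```

Calling `parent_byte_after_child_write(4096)` should give back `'A'`: the child's write never reaches the parent's memory.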
That's a good point that Linux doesn't actually allocate the pages until they're faulted in by a read or write. So, if it were doing some kind of thread inspection optimization, it would presumably just need to check if the faulting thread is currently in a loop that will overwrite at least the full page.

However, that wouldn't solve the problem of other threads in the same process being able to see the page before it's fully overwritten, or debugging processes, or using a signal handler to invisibly jump out of the initialization loop in the middle, etc. There are workarounds to all of these issues, but they all have performance and complexity costs.
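If anyone wants to poke at the lazy fault-in behaviour, here is a rough sketch, assuming Linux with glibc. It maps a few anonymous pages and asks mincore() how many are resident before and after writing a single byte. `resident_pages` and `touch_and_count` are names I made up, and the 64-page cap is arbitrary.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAX_PAGES 64

/* Number of pages in the range that mincore() reports as resident,
 * or -1 if the call fails. */
static long resident_pages(void *addr, size_t npages, size_t pagesz)
{
    unsigned char vec[MAX_PAGES];
    assert(npages <= MAX_PAGES);

    if (mincore(addr, npages * pagesz, vec) != 0)
        return -1;

    long count = 0;
    for (size_t i = 0; i < npages; i++)
        count += vec[i] & 1;   /* low bit set means the page is resident */
    return count;
}

/* Map npages anonymous pages, record residency before and after writing
 * one byte to the first page. Returns 0 on success, -1 on error. */
int touch_and_count(size_t npages, long *before, long *after)
{
    long ps = sysconf(_SC_PAGESIZE);
    if (ps <= 0 || npages == 0 || npages > MAX_PAGES)
        return -1;

    size_t pagesz = (size_t)ps;
    size_t len = npages * pagesz;
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *before = resident_pages(p, npages, pagesz);
    p[0] = 1;                  /* the first write faults the page in */
    *after = resident_pages(p, npages, pagesz);

    munmap(p, len);
    return (*before < 0 || *after < 0) ? -1 : 0;
}
```

On a stock Linux kernel I'd normally expect the "before" count to be 0 and the "after" count to be at least 1, but sandboxes and transparent huge pages can change the numbers, so treat them as a hint rather than a guarantee.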

malloc gets memory from the heap, which may or may not already be paged in or reused. That means you may get back previously used memory with stale contents (whether you do is up to the CRT).

If you want to make sure it is zero, you will want calloc. If you know you are going to copy something in on the next step, like in your example, you can probably skip calloc and just use malloc. calloc is nice when you are doing things like linked lists/trees/buffers and do not want extra steps to clean out the pointers or data.
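A quick sketch of both cases, in case it helps. `struct node`, `node_new` and `dup_bytes` are names I made up for illustration. It also assumes an all-zero-bytes pointer compares equal to NULL, which holds on every mainstream platform but is not strictly promised by the C standard.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct node {
    int value;
    struct node *next;
};

/* calloc case: the node starts as all-zero bytes, so value is 0 and
 * (on mainstream platforms) next is NULL without any extra cleanup. */
struct node *node_new(void)
{
    return calloc(1, sizeof(struct node));
}

/* malloc case: every byte is overwritten by memcpy before anything reads
 * it, so zeroing first would be wasted work. Returns NULL on failure. */
void *dup_bytes(const void *src, size_t n)
{
    assert(src != NULL && n > 0);

    void *dst = malloc(n);
    if (dst != NULL)
        memcpy(dst, src, n);
    return dst;
}
```

Both helpers hand ownership to the caller, so each result needs a matching free().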