That question doesn't make any sense. They don't do the same thing, so you can't compare their relative performance. What's faster: an Intel i7 or a BMW i8?
malloc is potentially troublesome for two reasons. First, its performance is unpredictable. It depends on the state of the heap at the time of the call, which you can't know in advance except in some very rare situations. Second, it can fail entirely, and that is likewise unpredictable.
memcpy and memmove, ultimately being byte-copying loops, don't suffer from these problems. Their cost scales predictably with the length you pass, and they always succeed if your pointers and lengths are valid (and, for memcpy, the source and destination don't overlap).
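To make the difference in failure modes concrete, here is a rough sketch in C. The names copy_to_heap, copy_to_fixed and MSG_MAX are made up for illustration and don't come from any code discussed in this thread. The heap version has a run-time failure path the caller has to handle; the fixed-buffer version can only fail on a length that can be checked up front.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MSG_MAX 64 /* hypothetical upper bound, chosen for this example */

/* Heap version: malloc may return NULL depending on run-time heap state,
 * so every caller needs an error path. */
char *copy_to_heap(const char *src, size_t len)
{
    assert(src != NULL);
    if (len == SIZE_MAX)        /* len + 1 would wrap around */
        return NULL;
    char *dst = malloc(len + 1);
    if (dst == NULL)
        return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}

/* Fixed-buffer version: no allocator involved. The only way to fail is a
 * length that does not fit, which is checked before any bytes are copied. */
int copy_to_fixed(char dst[MSG_MAX], const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (len >= MSG_MAX)         /* leave room for the terminator */
        return -1;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return 0;
}
```

The caller of copy_to_heap also has to remember to free the result, which is another thing the fixed-buffer version doesn't ask of you.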
On PCs these days, the troubles of malloc don't matter much. You have so much performance margin that occasional slow calls don't matter, and virtual memory with a big address space means that it almost never fails. If it does fail, it's OK if the program crashes and you have to restart it. But many systems are much more constrained.
Stack usage is also much, much, much easier to characterize. In systems where stack depth is well-controlled (i.e. most embedded systems that don't have dynamic process/thread creation), very simple analysis will suffice to identify places where you blow your stack.
Not using the heap != everything is allocated on the stack. In situations where you want to avoid dynamic allocation, memory for most things that would otherwise have been dynamically allocated ends up being statically allocated at compile time.
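A minimal sketch of what "statically allocated at compile time" tends to look like in C. SAMPLE_CAPACITY, sample_push and sample_len are hypothetical names for this example. The capacity is a worst-case bound picked at build time, and running out of room becomes an ordinary return code instead of an allocator failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SAMPLE_CAPACITY 16 /* hypothetical worst case, fixed at build time */

/* Static storage duration: the space is reserved when the program is
 * linked (typically in .bss), so it uses neither the heap nor the stack. */
static int32_t samples[SAMPLE_CAPACITY];
static size_t sample_count;

/* Append a value; returns 0 on success, -1 if the table is full. */
int sample_push(int32_t value)
{
    if (sample_count == SAMPLE_CAPACITY)
        return -1;
    samples[sample_count++] = value;
    return 0;
}

/* Number of values currently stored. */
size_t sample_len(void)
{
    return sample_count;
}
```

The total memory footprint is visible in the linker map, which is a big part of why this style is popular on constrained systems.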
No, but you're a lot more likely to be able to overwrite the return address via a stack-based buffer overflow, and that is generally a much more serious kind of attack.
And moves and copies won't have the potential for other memory management bugs? You'll still have memory management overhead, and now additional complexity.
At a high level, isn't this like implementing your own "malloc" and "free" that just pulls from your process's own memory pool instead of the OS? Or is there more to it than that?
No, it's just placing the appropriate structs and buffers on the stack (when not provided by the caller).
It does eliminate a couple of classes of errors, and makes some others less likely.
I didn't read all the code, but I don't think it's using alloca or the like. So the stack allocation sizes are known at compile time, and bounded unless there's some recursion going on (which is unlikely).
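For illustration only, here is roughly what the "struct on the stack unless the caller provides one" pattern can look like. This is not taken from the code under discussion; sum_ctx and byte_sum are invented names. The point is that the local has a size the compiler knows, unlike an alloca call or a variable-length array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical working state for some computation. */
struct sum_ctx {
    uint32_t sum;   /* running total of all bytes seen */
    size_t count;   /* number of bytes processed */
};

/* Adds up the bytes in data. If caller_ctx is NULL, the working state
 * lives in a local whose size is fixed at compile time, so the stack
 * cost of this function is a constant. */
uint32_t byte_sum(const uint8_t *data, size_t len, struct sum_ctx *caller_ctx)
{
    struct sum_ctx local = {0, 0};
    struct sum_ctx *ctx = caller_ctx ? caller_ctx : &local;

    for (size_t i = 0; i < len; i++) {
        ctx->sum += data[i];
        ctx->count++;
    }
    return ctx->sum;
}
```

Passing your own sum_ctx lets the state persist across calls; passing NULL gives you a throwaway one that disappears when the function returns.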
Many real time systems and applications disallow heap usage, because they have formal verification requirements that can't be met with dynamic memory that may "run out" depending on run time state.