Okay. How do you tell the kernel that? Sure, the kernel will have put a guard page or more at the end of the stack, so that if you regularly push onto the stack, you will eventually hit a guard page and things will blow up appropriately.
But what if the length of your variable length array is, say, gigabytes, you've blown way past the guard pages, and your pointer is now in non-stack kernel land.
You'd have to check the stack pointer all the time to be sure, that's prohibitive performance-wise. Ironically, x86 kind of had that in hardware back when segmentation was still used.
I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation. There's no need to check the stack pointer all the time.
But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.
> I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation.
What exactly are you doing there, in kernel code?
> But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.
Do what exactly? Just reject stack allocations that are larger than the cluster of guard pages? And keep book of past allocations? A lot of that needs to happen at runtime, since the compiler doesn't know the size with VLAs.
It's not impossible and mitigations exist, but it is pretty "extra". gcc has -fstack-check that (I think) does something there.
> What exactly are you doing there, in kernel code?
In kernel code?
What you're doing is triggering the guard page over and over if the stack is pushing into new territory.
> Do what exactly? Just reject stack allocations that are larger than the cluster of guard pages? And keep book of past allocations? A lot of that needs to happen at runtime, since the compiler doesn't know the size with VLAs.
Just hit the guard pages. You don't need to know the stack size or have any bookkeeping to do that, you just prod a byte every page_size. And you only need to do that for allocations that are very big. In normal code it's just a single not-taken branch for each VLA.
Because it's on by default in MSVC [0], and we all know that whatever technical decisions MS makes, they're superior to whatever technical decision the GNU people make. /s
An attacker would first trigger a large VLA-allocation that puts the stack pointer within a few bytes of the guard page. Then they would just have the kernel put a return address or two on the stack and that would be enough to cause a page fault. The only way to guard against that would be to check that every CALL instruction has enough stack space which is infeasible.
But that's the entire point of the guard page, it causes a page fault. That's not corruption.
Denial of service by trying to allocate something too big for the stack is obvious. I'm asking about how corruption is supposed to happen on a reasonable platform.
An attacker could trigger a large VLA allocation that jumps over the guard page, and a write to that allocation. That write would start _below_ the guard page, so damage would be done before the page fault occurs (ideally, that write wouldn’t touch the guard page and there wouldn’t be a page fault but that typically is harder to do; the VLA memory allocation typically is done to be fully used)
Triggering use of the injected code may require another call timed precisely to hit the changed code before the page fault occurs.
Of course, the compiler could and should check for stack allocations that may jump over guard pages and abort the program (or, if in a syscall, the OS) or grow the stack when needed. Also, VLAs aren’t needed for this. If the programmer creates a multi-megabyte local array, this happens, too (and that can happen accidentally, for example when increasing a #define and recompiling)
The lesson is, though, that guard pages alone don’t fully protect against such attacks. The compiler must check total stack space allocated by a function, and, if it can’t determine that that’s under the size of your guard page, insert code to do additional runtime checks.
I don’t see that as a reason to outright ban VLAs, though.
I think this is a common misunderstanding about UB. It's not that anything can happen, just that the standard doesn't specify what happens, meaning whatever happens is compiler/architecture/OS dependent. So you can't depend on UB in portable code. But something definite will happen, given the current state of the system. After all, if it didn't, these things wouldn't be exploitable either.
> But something definite will happen, given the current state of the system.
This is only true in the very loose and more or less useless sense that the compiler is definitely going to emit some machine code. What does that machine code do in the UB case? It might be absolutely anything.
One direction you could go here is you insist that surely the machine code has a defined meaning for all possible machine states, but that's involving a lot of state you aren't aware of as the programmer, and it's certainly nothing you can plan for or anticipate so it's essentially the same thing as "anything can happen".
Another is you could say, no, I'm sure the compiler is obliged to put out specific machine code, and you'd just be wrong about that, Undefined Behaviour is distinct from Unspecified Behaviour or merely Platform Dependant behaviour.
Many C and C++ programmers have the mistaken expectation that if their program is incorrect it can't do anything really crazy, like if I never launch_missiles() surely the program can't just launch_missiles() because I made a tiny mistake that created Undefined Behaviour? Yes, it can, and in some cases it absolutely will do that.
I'm aware you can get some pretty crazy behaviours, say if you end up overwriting a return address and your code begins to jump around like crazy. Even that could reproduce the same behaviour consistently though.
I once had a bug like that in a piece of AVR C code where the stack corruption would happen in the same place every time and the code would pathologically jump to the same places in the same order every time. It's worth noting though that when there's an OS, usually what will happen is just a SIGABRT. See the OpenBSD libc allocator for a masterclass in making misbehaving programs crash.
I was never advocating to rely on UB, btw. But yes, UB can be understood in many cases.
You are confusing the C standard and actual platforms/C implementations. A lot of things are UB in the standard but perfectly well defined on your platform. Standards don’t compile code, real compilers do. The standard doesn’t provide standard library implementations, the actual platform does.
Targeting the standard is nice, but if all of your target platforms guarantee certain behaviors, you might consider using those. A lot of UB in the C standard is perfectly defined and consistent across MSVC, GCC, Clang, and ICC.
> A lot of UB in the C standard is perfectly defined and consistent across MSVC, GCC, Clang, and ICC.
Do you have examples of this "a lot of UB in the C standard" which is in fact guaranteed to be "perfectly defined and consistent" across all the platforms you listed ? You may need to link the guarantees you're relying on.
Okay so take the two most complained about UBs, improper aliasing and signed integer overflow. Every compiler I’ve ever used lets you turn both into defined behavior.
Oh really? Then why does every compiler I use have a parameter to turn off strict aliasing?
You cite to a source that contradicts you. In the llvm blog post: "It is also worth pointing out that both Clang and GCC nail down a few behaviors that the C standard leaves undefined."
But what if the length of your variable length array is, say, gigabytes, you've blown way past the guard pages, and your pointer is now in non-stack kernel land.
You'd have to check the stack pointer all the time to be sure, that's prohibitive performance-wise. Ironically, x86 kind of had that in hardware back when segmentation was still used.