Hacker News new | ask | show | jobs
by planede 1415 days ago
What's wrong with VLAs is their syntax. It really shouldn't use the same syntax as regular C arrays, otherwise they would be fine, maybe with a scary enough keyword. They are more generic than alloca too, alloca being scoped to the function, while VLAs being scoped to the innermost block scope that contains them.
1 comments

Syntax, no protection against stack corruption,...
You can corrupt the stack without VLAs just fine. What else?
VLAs make it a lot easier to corrupt the stack by accident. Unless you're quite a careful coder, stuff like:

  f (size_t n)
  {
    char str[n];
leads to a possible exploit where the input is manipulated so n is large, causing a DoS attack (at best) or full exploit at worse. I'm not saying that banning VLAs solves every problem though.

However the main reason we forbid VLAs in all our code is because thread stacks (particularly on 32 bit or in kernel) are quite limited in depth and so you want to be careful with stack frame size. VLAs make it harder to compute and thus check stack frame sizes at compile time, making the -Wstack-usage warning less effective. Large arrays get allocated on the heap instead.

> stuff like ... leads to a possible exploit where the input is manipulated so n is large

The same is true for most recursive calls, should recursion be also banned in programming languages?

When writing secure C? In most cases, absolutely.
That's not really a fair comparison though. Recursion is strictly necessary to implement several algorithms. Even if "banned" from the language, you would have to simulate it using a heap allocated stack or something to do certain things.

None of this applies to VLA arguments.

It's not strictly necessary precisely because all recursions can be "simulated" with a heap allocated stack. And in fact, the "simulated" approach is almost always better, from both a performance and a maintenance perspective.
MISRA C bans recursion for instance.
Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap? You shouldn't be giving arbitrary user-supplied ints to malloc either.
> Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap?

DoS Risk? No one cares too much about that - the problem with VLAs is stack smashing, which then allows aribtrary user-supplied code to be executed.

You cannot do that with malloc() and friends.

VLAs don’t smash the stack.
How does a huge VLA corrupt the stack? If there's not enough space but code keeps going then isn't that a massive bug with your compiler or runtime?
Okay. How do you tell the kernel that? Sure, the kernel will have put a guard page or more at the end of the stack, so that if you regularly push onto the stack, you will eventually hit a guard page and things will blow up appropriately.

But what if the length of your variable length array is, say, gigabytes, you've blown way past the guard pages, and your pointer is now in non-stack kernel land.

You'd have to check the stack pointer all the time to be sure, that's prohibitive performance-wise. Ironically, x86 kind of had that in hardware back when segmentation was still used.

I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation. There's no need to check the stack pointer all the time.

But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.

Welcome to the world of undefined behavior. Anything can happen....
I think this is a common misunderstanding about UB. It's not that anything can happen, just that the standard doesn't specify what happens, meaning whatever happens is compiler/architecture/OS dependent. So you can't depend on UB in portable code. But something definite will happen, given the current state of the system. After all, if it didn't, these things wouldn't be exploitable either.
What is undefined about a large VLA? It shouldn't be undefined.

According to wikipedia "C11 does not explicitly name a size-limit for VLAs"

The C standard has no mentions of a program stack. This isn’t undefined behavior.
You shouldn't be writing C if you're not a careful coder.
Yeah, right.

https://msrc-blog.microsoft.com/2019/07/16/a-proactive-appro...

https://research.google/pubs/pub46800/

https://support.apple.com/guide/security/memory-safe-iboot-i...

Maybe you could give an helping hand to Microsoft, Apple and Google, they are in need of carefull C coders.

I'm not sure if you intentionally missed my point. Everything in C requires careful usage. VLAs aren't special: they're just yet another feature which must be used carefully, if used at all.

Personally, I don't use them, but I don't find "they're unsafe" to be a convincing reason for why they shouldn't be included in the already-unsafe language. Saying they're unnecessary might be a better reason.

And if you're a careful coder writing C, you should give the VLA the stink eye unless it's proving its worth.
Hint, that means nobody should be writing C.
Where is the lie?
Too bad we have all that legacy C code that won't just reappear by itself on a safer language.

That means there are a lot of not careful enough developers (AKA, human ones) that will write a lot of C just because they need some change here or there.

With VLAs:

1. The stack-smashing pattern is simple, straightforward and sure to be used often. Other ways to smash the stack require some more "effort"...

2. It's not just _you_ who can smash the stack. It's the fact that anyone who calls your function will smash the stack if they pass some large numeric value.

They can overflow the stack. They cannot smash the stack.
Fair enough; I had the mistaken idea that the two terms are interchangeable, but apparently stack smashing is only used for the attack involving the stack:

https://en.wikipedia.org/wiki/Stack_buffer_overflow

so, pretend I said "overflow" instead of "smash" in my post.

Useless semantic pedantry at best, but arguable wrong as there isn't some sort of ISO standard on dumb hacking terms.
Overflowing the stack gives you a segfault. Smashing the stack lets hackers pop a shell on your computer. They are incredibly different. VLAs can crash your program, but they do not give attackers the ability to scribble all over the stack.
What about not adding even more ways how we should avoid using C?
> What about not adding even more ways how we should avoid using C?

That's a mute point for C's target audience because they already understand that they need to be mindful of what the language does.

What the heck. It's "moot", not "mute".
I'm curious, are there accents in which those two words are homophones? Given the US tendency to pronounce new/due/tune as noo/doo/toon I can imagine some might say mute as moot but I can't find anything authoritative online.
That is like saying if sushi knifes are already sharp enough, there is no issue cutting fish with a samurai sword instead, except at least with the knife maybe the damage isn't as bad.
The difference between the largest sushi knives and a katana is more about who wields them than the blade involved.
> That is like saying if sushi knifes are already sharp enough (...)

No, it's like saying that professional people understand the need to learn what their tools of the trade do beyond random stackoverflow search on how to print text to stdout.

It seems you have an irrational dislike of C. That's perfectly ok. No need to come up with excuses though.

I like your comparison of a C programmer with a samurai.
VLAs are no more unsafe than standard C is for stack corruption.
Just one additional attack vector more to add to the list, who's still counting them?
It’s not an additional attack vector.

    int A[100000000];
Also has no protection.