| HN Mirror

Hacker News


	by pjmlp 1415 days ago
	Syntax, no protection against stack corruption,...

3 comments

planede 1415 days ago

You can corrupt the stack without VLAs just fine. What else?

link

rwmj 1415 days ago

VLAs make it a lot easier to corrupt the stack by accident. Unless you're quite a careful coder, stuff like:

  void f (size_t n)
  {
    char str[n];   /* n comes straight from the input */
    /* ... */
  }

leads to a possible exploit where the input is manipulated so n is large, causing a DoS attack (at best) or a full exploit (at worst). I'm not saying that banning VLAs solves every problem though.

However the main reason we forbid VLAs in all our code is because thread stacks (particularly on 32 bit or in kernel) are quite limited in depth and so you want to be careful with stack frame size. VLAs make it harder to compute and thus check stack frame sizes at compile time, making the -Wstack-usage warning less effective. Large arrays get allocated on the heap instead.
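
A minimal sketch of the "large arrays get allocated on the heap instead" pattern. The function name `sum_bytes` and the 256-byte threshold are hypothetical, chosen only for illustration; the point is that the stack frame has a fixed size the compiler can check, and an oversized request fails in a way the caller can see.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical threshold: requests up to this size use a fixed stack buffer. */
#define SMALL_BUF_MAX 256

/* Copies n bytes of src into scratch storage and returns their byte sum.
 * Returns -1 if the heap allocation fails. The stack frame size is known
 * at compile time because no VLA is involved. */
long sum_bytes(const unsigned char *src, size_t n)
{
    unsigned char small[SMALL_BUF_MAX];   /* fixed size, no VLA */
    unsigned char *buf = small;

    if (n > SMALL_BUF_MAX) {
        buf = malloc(n);                  /* large requests go to the heap */
        if (buf == NULL)
            return -1;                    /* failure is detectable here */
    }

    memcpy(buf, src, n);
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += buf[i];

    if (buf != small)
        free(buf);
    return total;
}
```

With a VLA in place of `small`, the same function would have no way to report that `n` was too large.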

link

zajio1am 1415 days ago

> stuff like ... leads to a possible exploit where the input is manipulated so n is large

The same is true for most recursive calls, should recursion be also banned in programming languages?

link

rhexs 1415 days ago

When writing secure C? In most cases, absolutely.

link

mtlmtlmtlmtl 1415 days ago

That's not really a fair comparison though. Recursion is strictly necessary to implement several algorithms. Even if "banned" from the language, you would have to simulate it using a heap allocated stack or something to do certain things.

None of this applies to VLA arguments.

link

10000truths 1415 days ago

It's not strictly necessary precisely because all recursions can be "simulated" with a heap allocated stack. And in fact, the "simulated" approach is almost always better, from both a performance and a maintenance perspective.
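
For readers who have not seen the technique, here is a small sketch of recursion "simulated" with a heap-allocated stack: summing a binary tree iteratively. The `struct node` layout and the name `tree_sum` are assumptions made up for this example. Whether this reads better than the recursive version is exactly what is being argued in this thread; the sketch only shows that running out of memory becomes a return value rather than a stack overflow.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical binary tree node, used only for illustration. */
struct node {
    int value;
    struct node *left, *right;
};

/* Sums all values in the tree without recursion.
 * Returns 0 on success, -1 if the explicit stack cannot be allocated or grown. */
int tree_sum(const struct node *root, long *out)
{
    size_t cap = 16, top = 0;
    const struct node **stack = malloc(cap * sizeof *stack);
    if (stack == NULL)
        return -1;

    long total = 0;
    if (root != NULL)
        stack[top++] = root;

    while (top > 0) {
        const struct node *n = stack[--top];
        total += n->value;

        /* Make room for up to two children before pushing. */
        if (top + 2 > cap) {
            size_t new_cap = cap * 2;
            const struct node **grown = realloc(stack, new_cap * sizeof *stack);
            if (grown == NULL) {
                free(stack);
                return -1;   /* out of memory is reported to the caller */
            }
            stack = grown;
            cap = new_cap;
        }
        if (n->left != NULL)
            stack[top++] = n->left;
        if (n->right != NULL)
            stack[top++] = n->right;
    }

    free(stack);
    *out = total;
    return 0;
}
```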

link

mtlmtlmtlmtl 1415 days ago

This is simply nonsense. In cases with highly complex recursive algorithms, "unrecursing" would make the code a completely unmaintainable mess, requiring an immensely complicated state machine, which is why something like Stockfish doesn't do that in its recursive search function even though the code base is extremely optimised. And yes, some algorithms are inherently recursive, and don't gain any meaningful performance from the heap stack + state machine approach.

link

msla 1415 days ago

> It's not strictly necessary precisely because all recursions can be "simulated" with a heap allocated stack.

This just moves the problem from a stack blowout to a heap blowout.

> And in fact, the "simulated" approach is almost always better, from both a performance and a maintenance perspective.

I am unsure about the performance, but turning recursive code implementing a recursive procedure into iterative code which has to maintain a stack by hand cannot possibly improve readability unless the programmers involved are pathologically afraid of seeing recursive code.

link

saagarjha 1415 days ago

Definitely not.

link

Leherenn 1415 days ago

MISRA C bans recursion for instance.

link

jasonhansel 1415 days ago

Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap? You shouldn't be giving arbitrary user-supplied ints to malloc either.
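
A sketch of what "don't give arbitrary user-supplied ints to malloc" could look like in practice. The 1 MiB cap and the name `alloc_untrusted` are hypothetical; a real limit depends on the application.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical application limit: 1 MiB per request. */
#define MAX_REQUEST_BYTES ((size_t)1 << 20)

/* Allocates count elements of elem_size bytes for an untrusted request.
 * Returns NULL if the request is empty, exceeds the limit (which also
 * covers multiplication overflow), or the allocation itself fails. */
void *alloc_untrusted(size_t count, size_t elem_size)
{
    if (count == 0 || elem_size == 0)
        return NULL;
    if (count > MAX_REQUEST_BYTES / elem_size)   /* also rules out overflow */
        return NULL;
    return malloc(count * elem_size);
}
```

The same bound check works in front of a VLA, but only the heap version can also report that the allocation itself failed.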

link

lelanthran 1415 days ago

> Doesn't a similar DoS risk (from allowing users to allocate arbitrarily large amounts of memory) also apply to the heap?

DoS risk? No one cares too much about that - the problem with VLAs is stack smashing, which then allows arbitrary user-supplied code to be executed.

You cannot do that with malloc() and friends.

link

saagarjha 1415 days ago

VLAs don’t smash the stack.

link

pjmlp 1415 days ago

Depends on the number you put inside, and the linker settings for stack size.

link

Dylan16807 1415 days ago

How does a huge VLA corrupt the stack? If there's not enough space but code keeps going then isn't that a massive bug with your compiler or runtime?

link

anyfoo 1415 days ago

Okay. How do you tell the kernel that? Sure, the kernel will have put a guard page or more at the end of the stack, so that if you regularly push onto the stack, you will eventually hit a guard page and things will blow up appropriately.

But what if the length of your variable length array is, say, gigabytes, you've blown way past the guard pages, and your pointer is now in non-stack kernel land?

You'd have to check the stack pointer all the time to be sure, and that's prohibitive performance-wise. Ironically, x86 kind of had that in hardware back when segmentation was still used.

link

Dylan16807 1415 days ago

I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation. There's no need to check the stack pointer all the time.
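
To illustrate the idea of a probe "every page or so", here is a plain-C sketch that touches one byte per page of a region, from the high address down. It is an illustration on an ordinary buffer, not what a compiler literally emits in a function prologue; the 4096-byte page size and the name `probe_region` are assumptions.

```c
#include <stddef.h>

/* Assumed page size for illustration; real code would query the platform. */
#define PAGE_SIZE 4096

/* Writes one byte in every page that overlaps [base, base + len), walking
 * from the high address down, the direction in which a typical stack grows.
 * Returns the number of probes performed. */
size_t probe_region(volatile unsigned char *base, size_t len)
{
    size_t probes = 0;
    if (len == 0)
        return 0;

    size_t off = len - 1;
    for (;;) {
        base[off] = 0;        /* a guard page here would fault immediately */
        probes++;
        if (off < PAGE_SIZE)
            break;
        off -= PAGE_SIZE;     /* never skip more than one page at a time */
    }
    if (off != 0) {
        base[0] = 0;          /* cover the lowest, possibly partial, page */
        probes++;
    }
    return probes;
}
```

Because consecutive probes are at most one page apart, a guard page anywhere inside the region gets hit before the allocation is used, no matter how large `len` is.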

But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.

link

anyfoo 1415 days ago

> I think the normal pattern is a stack probe every page or so when there's a sufficiently large allocation.

What exactly are you doing there, in kernel code?

> But that's not my point. If the compiler/runtime knows it will blow up if you have an allocation over 4KB or so, then it needs to do something to mitigate or reject allocations like that.

Do what exactly? Just reject stack allocations that are larger than the cluster of guard pages? And keep book of past allocations? A lot of that needs to happen at runtime, since the compiler doesn't know the size with VLAs.

It's not impossible and mitigations exist, but it is pretty "extra". gcc has -fstack-check that (I think) does something there.

link

petters 1415 days ago

Welcome to the world of undefined behavior. Anything can happen....

link

mtlmtlmtlmtl 1415 days ago

I think this is a common misunderstanding about UB. It's not that anything can happen, just that the standard doesn't specify what happens, meaning whatever happens is compiler/architecture/OS dependent. So you can't depend on UB in portable code. But something definite will happen, given the current state of the system. After all, if it didn't, these things wouldn't be exploitable either.

link

tialaramex 1415 days ago

> But something definite will happen, given the current state of the system.

This is only true in the very loose and more or less useless sense that the compiler is definitely going to emit some machine code. What does that machine code do in the UB case? It might be absolutely anything.

One direction you could go here is you insist that surely the machine code has a defined meaning for all possible machine states, but that's involving a lot of state you aren't aware of as the programmer, and it's certainly nothing you can plan for or anticipate so it's essentially the same thing as "anything can happen".

Another is you could say, no, I'm sure the compiler is obliged to put out specific machine code, and you'd just be wrong about that. Undefined Behaviour is distinct from Unspecified Behaviour or merely Platform Dependent behaviour.

Many C and C++ programmers have the mistaken expectation that if their program is incorrect it can't do anything really crazy, like if I never launch_missiles() surely the program can't just launch_missiles() because I made a tiny mistake that created Undefined Behaviour? Yes, it can, and in some cases it absolutely will do that.

link

protomolecule 1415 days ago

What you are describing is unspecified and implementation-defined behavior [0].

Avoiding UB (edit: in general) doesn't have anything to do with the code being portable and everything with the code not being buggy [1][2].

[0] https://en.cppreference.com/w/c/language/behavior

[1] https://blog.regehr.org/archives/213

[2] http://blog.llvm.org/2011/05/what-every-c-programmer-should-...

link

Dylan16807 1415 days ago

What is undefined about a large VLA? It shouldn't be undefined.

According to wikipedia "C11 does not explicitly name a size-limit for VLAs"

link

saagarjha 1415 days ago

The C standard has no mentions of a program stack. This isn’t undefined behavior.

link

chjj 1415 days ago

You shouldn't be writing C if you're not a careful coder.

link

pjmlp 1415 days ago

Yeah, right.

https://msrc-blog.microsoft.com/2019/07/16/a-proactive-appro...

https://research.google/pubs/pub46800/

https://support.apple.com/guide/security/memory-safe-iboot-i...

Maybe you could give a helping hand to Microsoft, Apple and Google, they are in need of careful C coders.

link

chjj 1415 days ago

I'm not sure if you intentionally missed my point. Everything in C requires careful usage. VLAs aren't special: they're just yet another feature which must be used carefully, if used at all.

Personally, I don't use them, but I don't find "they're unsafe" to be a convincing reason for why they shouldn't be included in the already-unsafe language. Saying they're unnecessary might be a better reason.

link

pjmlp 1415 days ago

The goal should be to reduce the amount of sharp edges, not increase them even further.

link

_0w8t 1415 days ago

VLAs are unsafe in the worst kind of way, as it is not possible to query when it is safe to use them. alloca(), at least in theory, can return null on stack overflow, but there is no such provision with VLAs.

link

samatman 1415 days ago

And if you're a careful coder writing C, you should give the VLA the stink eye unless it's proving its worth.

link

kllrnohj 1415 days ago

Hint, that means nobody should be writing C.

link

krallja 1415 days ago

Where is the lie?

link

marcosdumay 1415 days ago

Too bad we have all that legacy C code that won't just reappear by itself on a safer language.

That means there are a lot of not careful enough developers (AKA, human ones) that will write a lot of C just because they need some change here or there.

link

einpoklum 1415 days ago

With VLAs:

1. The stack-smashing pattern is simple, straightforward and sure to be used often. Other ways to smash the stack require some more "effort"...

2. It's not just _you_ who can smash the stack. It's the fact that anyone who calls your function will smash the stack if they pass some large numeric value.

link

saagarjha 1415 days ago

They can overflow the stack. They cannot smash the stack.

link

einpoklum 1414 days ago

Fair enough; I had the mistaken idea that the two terms are interchangeable, but apparently stack smashing is only used for the attack involving the stack:

https://en.wikipedia.org/wiki/Stack_buffer_overflow

so, pretend I said "overflow" instead of "smash" in my post.

link

rhexs 1415 days ago

Useless semantic pedantry at best, but arguably wrong as there isn't some sort of ISO standard on dumb hacking terms.

link

saagarjha 1415 days ago

Overflowing the stack gives you a segfault. Smashing the stack lets hackers pop a shell on your computer. They are incredibly different. VLAs can crash your program, but they do not give attackers the ability to scribble all over the stack.

link

bjourne 1414 days ago

> Overflowing the stack gives you a segfault.

Maybe. If the architecture supports protected memory and the compiler has placed an appropriately sized guard page below the stack. If it doesn't then overflowing the stack via a VLA gives you easy read and write access to any byte in program memory.

link

pjmlp 1414 days ago

Unless they happen to be enjoying kernel space.

link

pjmlp 1415 days ago

What about not adding even more ways how we should avoid using C?

link

arinlen 1415 days ago

> What about not adding even more ways how we should avoid using C?

That's a mute point for C's target audience because they already understand that they need to be mindful of what the language does.

link

mjcohen 1415 days ago

What the heck. It's "moot", not "mute".

link

wizofaus 1415 days ago

I'm curious, are there accents in which those two words are homophones? Given the US tendency to pronounce new/due/tune as noo/doo/toon I can imagine some might say mute as moot but I can't find anything authoritative online.

link

jcranmer 1415 days ago

According to Wikipedia, East Anglia does universal yod-dropping, so mute/moot would be homophonic. (See https://en.wiktionary.org/wiki/Appendix:English_dialect-depe...).

Personally, I haven't come across anyone who pronounces 'mute' without the /j/.

link

danuker 1415 days ago

They are not perfect homophones. There is a slight i (IPA j) in "mute".

https://en.wiktionary.org/wiki/mute

https://en.wiktionary.org/wiki/moot

link

pjmlp 1415 days ago

That is like saying if sushi knifes are already sharp enough, there is no issue cutting fish with a samurai sword instead, except at least with the knife maybe the damage isn't as bad.

link

samatman 1415 days ago

The difference between the largest sushi knives and a katana is more about who wields them than the blade involved.

link

pjmlp 1415 days ago

One ends up cutting quite a few pieces either way.

link

naasking 1414 days ago

When you really need a samurai, nothing less will do. Arguably most of us need sushi chefs these days.

link

arinlen 1415 days ago

> That is like saying if sushi knifes are already sharp enough (...)

No, it's like saying that professional people understand the need to learn what their tools of the trade do beyond random stackoverflow search on how to print text to stdout.

It seems you have an irrational dislike of C. That's perfectly ok. No need to come up with excuses though.

link

pjmlp 1415 days ago

It doesn't seem, I do.

Ever since I got my hands on Turbo C++ 1.0, back in 1993, I have seen no reason why we should downgrade ourselves to C.

At least C++ gives us the tools to be a bit more secure, even if tainted with C's copy-paste compatibility.

You will find posts from me on Usenet, standing on the C++ frontline of C vs C++ flamewars.

No one is making excuses, it should be nuked, unfortunately it will outlive all of us.

link

Koshkin 1415 days ago

I like your comparison of a C programmer with a samurai.

link

phibz 1415 days ago

It's more like the C programmer is a sushi master. They can make a delicious, beautifully crafted snack. But if the wrong ingredients are used you'll get very sick.

link

pjmlp 1415 days ago

Including that most of them end up doing Seppuku on their applications.

link

saagarjha 1415 days ago

VLAs are no more unsafe than standard C is for stack corruption.

link

pjmlp 1414 days ago

Just one more attack vector to add to the list, who's still counting them?

link

saagarjha 1414 days ago

It’s not an additional attack vector.

link

tstanisl 1414 days ago

    int A[100000000];

Also has no protection.
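
For scale: with a 4-byte int that array needs 400,000,000 bytes, while a common default stack limit on Linux is around 8 MiB. A POSIX-only sketch for comparing a requested size against the process's soft stack limit follows; the helper name `below_stack_limit` is made up for this example, and being below the limit does not by itself mean the allocation is safe, since part of the stack is already in use.

```c
#include <stddef.h>
#include <sys/resource.h>

/* Returns 1 if bytes is below the current soft stack limit of the process,
 * 0 if it is not, and -1 if the limit cannot be queried.
 * POSIX only; the limit governs the main thread's stack. */
int below_stack_limit(size_t bytes)
{
    struct rlimit rl;
    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;             /* no soft limit configured */
    return bytes < rl.rlim_cur;
}
```

A call such as `below_stack_limit(100000000 * sizeof(int))` lets a program decide to use the heap before it ever declares the array.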

link