by pwdisswordfishz 292 days ago
Why bother with this category? The code exercising it is buggy either way, regardless of whether the behaviour is specified or not, and having a fixed definition may constrain implementations and negatively impact performance. And if that "erroneous" definition comes with stability guarantees, then you might not even bother calling it "erroneous" at all, because then it's going to be as toothless as declaring any given JavaScript syntax to be "improper": https://github.com/twbs/bootstrap/issues/3057#issuecomment-5...

Ignoring performance considerations, which I'm happy to address separately if people care, the options were:

#1 Require initialization (like many modern languages). Makes sense. But now your existing C++ doesn't even compile so that's a hard "No" from the committee.

#2 Status quo: evaluating uninitialized variables is Undefined Behaviour. We cannot diagnose this reliably; any attempt will be Best Effort. Several vendors already supply such diagnostics, but when they don't catch you, arbitrary nonsense happens.

#3 Zero init. Now not initializing has defined behaviour, so all the diagnostic tools we saw in #2 are invalidated and must be removed. But did you actually mean zero? Awful bugs still occur and now our best tools to solve them are crippled. Ouch.

#4 Erroneous Behaviour. Unlike #3, we do not invalidate those diagnostic tools from #2, because we've said the tool was correct. However, we do avoid Undefined Behaviour: something bad might happen, but at least it's something you can reason about, and it is clearly stated that it's your fault.

The distinction between #3 and #4 does not matter in practice: "I have learned to use variable initializers, that's why there isn't one present". As soon as you make it reliable, people will start to rely on it and it will become entrenched.

> Why bother with this category? The code exercising it is buggy either way

Because it is an actual security vulnerability if you cross privilege boundaries (infoleaks/(K)ASLR bypass, etc.), and one people often miss at that.

Say you write:

    struct { long long a; char b; } foo; foo.a = 0; foo.b = 1; return foo;
You end up leaking 7 stack bytes here (the padding after `b`, on a typical 64-bit ABI).
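To check the padding figure on a given platform, here is a minimal sketch. The struct name `Foo` and the helper names are made up for illustration; the original snippet uses an anonymous struct. It only measures the layout and does not read any uninitialized memory.

```cpp
#include <cstddef>

// Same members as the anonymous struct in the comment above.
// The name `Foo` is hypothetical, chosen for this sketch.
struct Foo {
    long long a;
    char b;
};

// Number of bytes in Foo that belong to no member. If they are never
// written, a by-value copy of Foo can carry old stack contents with it.
constexpr std::size_t padding_bytes() {
    return sizeof(Foo) - (sizeof(long long) + sizeof(char));
}

// Offset at which the padding starts: directly after `b`.
constexpr std::size_t padding_offset() {
    return offsetof(Foo, b) + sizeof(char);
}
```

On an ABI where `long long` is 8 bytes with 8-byte alignment (x86-64 Linux, for example), `padding_bytes()` comes out as 7, matching the number above. Other ABIs may give a different count.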

GCC's `-ftrivial-auto-var-init=pattern` currently initializes all unknown-value stack variables with 0xFEFEFEFE(...). This is usually an implausible fp value (a huge negative number), an invalid offset and an invalid virtual address, so bugs tend to crash instead of going unnoticed. This is a good thing.
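To see what an all-0xFE fill looks like through different types, here is a rough sketch that emulates the pattern by hand with `memset`. It does not use the compiler flag itself. The helper `filled_with_fe` is hypothetical, and the sketch assumes IEEE 754 64-bit doubles.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>
#include <type_traits>

static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 doubles");
static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes 64-bit doubles");

// Hypothetical helper: build a T whose every byte is 0xFE, imitating by
// hand what the pattern option is described as doing to stack variables.
template <typename T>
T filled_with_fe() {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only meaningful for trivially copyable types");
    T value;
    std::memset(&value, 0xFE, sizeof value);
    return value;
}

// As a 64-bit integer or pointer-sized value: 0xFEFEFEFEFEFEFEFE.
inline std::uint64_t pattern_as_u64() { return filled_with_fe<std::uint64_t>(); }

// As a signed 32-bit index: a large negative number, far out of bounds.
inline std::int32_t pattern_as_i32() { return filled_with_fe<std::int32_t>(); }

// As a double: sign bit set, exponent field 0x7EF (not all ones),
// so a very large negative finite value rather than a NaN.
inline double pattern_as_double() { return filled_with_fe<double>(); }
```

Under these assumptions the integer views are 0xFEFEFEFEFEFEFEFE and -16843010, and the double is finite with magnitude above 1e300. On x86-64 the 64-bit value is also not a canonical address, so dereferencing it as a pointer faults.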

Regarding performance, there is an attribute to opt out, both for the standard C++26 feature (`[[indeterminate]]`) and for the GCC option that is a subset of it (`__attribute__((uninitialized))`).

How does it prevent security vulnerabilities when instead of being undefined entirely, the behaviour is defined to be wrong? This is the "chug along at all costs" mentality that PHP has been slowly and painfully growing out of.

`-ftrivial-auto-var-init=pattern` doesn't need "erroneous behaviour" in the standard at all. In fact, it may outright conflict with it, if for example the standard defines that the compiler must initialize variables to zero instead of your chosen pattern in case of "erroneous behaviour".

"Erroneous behaviour" is a superfluous concept that exists only to allow the committee to pat themselves on the back and say "See? We no longer have undefined behaviour!".