| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by Aardwolf 34 days ago

What's the reason that C didn't define the order of this?

The horrible undefined behavior of signed integer overflow at least can be explained by the fact that multiple CPU architectures handling those differently existed (though the fact that C even 'attracts' its ill-defined signed integers when you're using unsigned ones by returning a signed int when left shifting an uint16_t by an uint16_t for example is not as forgivable imho)

But this here is something that could be completely defined at the language level, there's nothing CPU dependent here, they could have simply stated in the language specification that e.g. the order of execution of statements is from left to right (and/or other rules like post increment happens after the full statement is finished for example, my point is not whether the rule I type here is complete enough or not but that the language designers could have made it completely defined).

10 comments

jcranmer 34 days ago

The short answer is because C was designed to give leeway to really dumb compilers on really diverse hardware.

This isn't quite the same case, but it's a good illustration of the effect: on gcc, if you have an expression f(a(), b()), the order that a and b get evaluated is [1] dependent on the architecture and calling-convention of f. If the calling convention wants you to push arguments from right to left, then b is evaluated first; otherwise, a is evaluated first. If you evaluate arguments in the right order, then after calling the function, you can immediately push the argument on the stack; in the wrong order, the result is now a live variable that needs to be carried over another function call, which is a couple more instructions. I don't have a specific example for increment/decrement instructions, but considering extremely register-poor machines and hardware instruction support for increment/decrement addressing modes, it's not hard to imagine that there are similar cases where forcing the compiler to insert the increment at the 'wrong' point is similarly expensive.

Now, with modern compilers using cross-architecture IRs as their main avenue of optimization, the benefit from this kind of flexibility is very limited, especially since the penalties on modern architectures for the 'wrong' order of things can be reduced to nothing with a bit more cleverness. But compiler developers tend to be loath to change observable behavior, and the standards committee unwilling to mandate that compiler developers have to modify their code, so the fact that some compilers have chosen to implement it in different manners means it's going to remain that way essentially forever. If you were making a new language from scratch, you could easily mandate a particular order of evaluation, and I imagine that every new language in the past several decades has in fact done that.

[1] Or at least was 20 years ago, when I was asked to look into this. GCC may have changed since then.