Hacker News
by brudgers 650 days ago
A consensus standard happens by multiple stakeholders sitting down and agreeing on what everyone will do the same way. And agreeing on what they won't all do the same way. The things they agree to do differently don't become part of the standard.

With compilers, different companies usually do things differently. That was the case with C89. The things they talked about but could not or would not agree to do the same way are listed as undefined behaviors. The things everyone agreed to do the same way are the standard.

The consensus process reflects stakeholder interests. Stakeholders can afford to rewrite some parts of their compilers to comply with the standards and cannot afford to rewrite other parts to comply with the standards because their customers rely on the existing implementation and/or because of core design decisions.

3 comments

the main stakeholders are c programmers and the users of their programs, not c compiler vendors. the stake held by c compiler vendors is quite small by comparison. however, the c standards committee consists entirely of c compiler vendors, as you implicitly acknowledge by referring to 'their compilers' and 'their customers'. this largely happens through the same process through which drug regulations are written by drug companies and energy regulations are written by oil companies: the c compiler vendors have much deeper knowledge of the subject matter; the standard is put into practice by what the c compiler vendors choose to do; and, although the c compiler vendors' interests in the c standard are vastly less significant than those of c programmers and users of c programs, they are also vastly more concentrated

consequently, the consensus process systematically and reproducibly fails to reflect stakeholder interests

Specifically, undefined behavior is when the compiler vendors couldn't agree whether a particular bit of code should legitimately compile to something or be considered erroneous. Ex.: null pointer access. Clearly an error in user-space programs running on a sophisticated operating system, but in kernel or embedded code sometimes you do want to read or write to memory location 0. So the standards committee just shrugged and said "it's undefined". Could be an error, could not be. It depends on your compiler, OS, and environment. Check your local docs for details.
Behaviour that merely differs based on implementation is either unspecified behaviour or implementation-defined behaviour. That and undefined behaviour are different things in the C++ standard.
I think for implementation-defined behavior the code has to do something sensible, but the standard doesn't specify what; the distinction for undefined behavior is that it could be erroneous (meaning it triggers an exception or just goes completely bonkers) but it could also do something sensible and expected, again depending on environment.
C++ is a separate language with a separate standard.
It’s also cases where some compilation targets could, for example, raise an interrupt on signed overflow, and that kind of behavior would be completely out of the scope of the C standard, because it would be highly hardware-specific.
i think the issue is more that in embedded code you can't depend on the hardware to detect an access to any given memory location (the standard does permit the bit pattern of a null pointer to be different from all zeroes, which is what you are supposed to do if you want memory location 0 to be referenceable with a pointer)
As I recall the standard also mandates that ((void *)0) is a null pointer, even if it gets converted behind the scenes to some other bit pattern that represents null for that architecture. So it's all a wash.
the standard does mandate that, yes. as i understand it, you could, for example, take the bitwise not of a pointer value to be its intptr_t value, and then use the all-ones bit pattern for your null pointer. probably a lot of existing c programs would fail to work on such an implementation (because they assume that memset with 0 will create null pointers, for example), but permitting such things was an intentional feature of the standard

usually there is some memory address that you can sacrifice for null pointers

No compiler I'm aware of implements non-zero null pointers on systems where address 0 is valid (e.g. armv7), so it ends up being kind of a moot point.
yeah, i'm not aware of any instances of it!
> Stakeholders can afford to rewrite some parts of their compilers to comply with the standards and cannot afford to rewrite other parts to comply with the standards because their customers rely on the existing implementation and/or because of core design decisions.

I was nodding along until here. Wouldn’t one, given the option, always choose, if possible, a compiler that doesn’t differ from the standard? And if that isn’t an option, wouldn’t it be up to said stakeholders to own the inconsistency?

Tough problem to solve for sure.