Hacker News new | ask | show | jobs
by mjevans 3410 days ago
I don't think #2 has been fully beaten to death yet.

Assuming a platform where you don't segfault (say that 'page 0' variables are valid) and thus runtime does proceed; I still can't think of any /valid/ reason to eliminate the if that follows (focus line 2 in the comments).

Under what set of logic does being able to de-reference a pointer confer that it's value is not 0 (which is what the test equates to)?

In my opinion that is an, often working but, incorrect optimization.

5 comments

C programmers expect dead code removal. Especially when the compiler also inlines functions (and, of course, inlining makes the biggest impact on short functions; and one way to get short functions is to have aggressive dead code removal). And macros can expand into very weird, but valid, code; so the statement that "nobody would ever write code like that" isn't relevant. The compiler may well have to handle unnatural looking code.

As others have stated, compilers generally don't actually have special case code to create unintuitive behavior if it looks like the programmer goofed.

It's possible and desirable for a compiler to remove branches of "if" statements that it knows at compile time won't ever be true. And, of course, one special case of statically known "if" statements are checks for NULL or not-NULL pointers in cases where the compiler knows that a pointer will never be NULL (e.g., it points to the stack) or will always be NULL (e.g., it was initialized to NULL and passed to a function or macro).

So the standard allows the compiler to say "this pointer cannot be NULL at this point because it was already dereferenced." Either the compiler is right because the pointer couldn't be NULL, or dereferencing the pointer already triggered undefined behavior, in which case unexpected behavior is perfectly acceptable. Some programmers will complain because the compiler won't act sensibly in this case, but C doesn't have any sensible option for what the compiler should do when you dereference a NULL pointer (yes, your operating system may give you a SEGFAULT, but the rules are written by a committee that can't guarantee that there will be an operating system).

OK, I think I see what SHOULD happen here.

It's been forever since I've actually compiled a C program not part of some package I was installing, BUT.

* There should be a warning flag, and it should be ON BY DEFAULT, that //all// code removal is stated, and the logic behind it given.

* It should be possible to elevate that condition to an error (and that too might even be the default).

I have no idea if it's true, but compiler implementers swear that their compilers perform so many optimizations, and that those optimizations allow additional optimizations, that this kind of approach would bury you in messages.
C programmers should be able to expect that "optimizations" will not transform program meaning. And because C is so low level, certain types of optimizations may be more difficult or impossible. If the pointer was explicitly set to NULL, the compiler can justifiably deduce the branch will not be taken but the deduction "if the programmer dereferenced the pointer it must not be NULL" is not based on a sound rule. In fact, the whole concept that the compiler can make any transformation it wants in the presence of UB is wacky. Optimization should always be secondary to correctness.
> C programmers should be able to expect that "optimizations" will not transform program meaning.

If x is null, the program in #2 has no meaning. The only way to preserve its meaning is to assume x is not null.

> Optimization should always be secondary to correctness.

If x is null, the program in #2 has no correctness. The only way to salvage its correctness is to assume x is not null.

Exactly. You are assuming that a poorly written standard is correct and engineering practice of working programs is incorrect.
Ok, so you want compilers to generate a translation for program #2 that works "correctly" to your mind when x is null.

Please explain to the class the meaning of the construct int y = *x; when x is null, so that all conforming C compilers can be updated to generate code for this case "correctly".

Something like

    assert(x);
    int y = *x;
is probably closer to the intended meaning. Checking if(!x) afterwards is still dead code, but at least the program is guaranteed to fail in a defined manner.

Of course, if implemented this way, every dereference would have to be paired with an assert call, bringing the performance down to the level of Java. (While bringing the memory safety up to the level of Java.)

> In fact, the whole concept that the compiler can make any transformation it wants in the presence of UB is wacky.

That's the way it's often explained but it's not really what happens--the compiler doesn't scan for undefined behavior and then replace it with random operations. Instead, it's applying a series of transformations that preserve the program's semantics if the program stays "in bounds", avoiding invoking undefined behaviors.

I agree that equating "if the programmer dereferenced the pointer" and "the pointer must not be NULL" betrays a....touching naivety about the quality of a lot of code, but if you start from the premise that the programmer shouldn't be doing that, the results aren't totally insane.

"touching naivete"is a good way of putting it.
> C programmers should be able to expect that "optimizations" will not transform program meaning.

That's the official rule, but it's "program meaning as defined by the standard." It's not perfect, but nobody's come up with a better alternative. We get bugs because programmers expect some meaning that's not in the standard. But compilers are written according to the standard, not according to some folklore about what reasonable or experienced programmers expect.

*

Again, the idea isn't that the compiler found a mistake and will do its best to make you regret it. Derefencing a pointer is a clear statement that the programmer believes the pointer isn't NULL. The standard allows the compiler to believe that statement. Partly because the language doesn't define what to do if the statement is false.

> But compilers are written according to the standard.

Written to the writers _interpretation_ of the standard. I bet money that every compiler written from a text standard hasn't followed said standard. It would be nice if a standard included code fragments used to show/test the validity of what is stated.

There are examples in the standard.
Actually that's not correct. The standard says the behavior is up to the compiler. The compiler author took that as a license to produce a non truth preserving transformation of the code. The actual current clang behavior also satisfies the standard as written.
> The standard says the behavior is up to the compiler.

I think this statement is correct, but it's the kind of thing people say when they confuse implementation defined behavior and undefined behavior. And that distinction is key.

Implementation defined behavior means the compiler gets to choose what it will do, document the choice, and then stick to it.

Undefined behavior means that the program is invalid, but the compiler isn't expected to notice the error. Whatever the compiler spits out is acceptable by definition. The compiler can generate a program that doesn't follow the rules of C; or that only does something weird when undefined behavior is triggered, but the weird behavior doesn't take place on the same line as the undefined behavior; etc.

It's certainly true that "the compiler isn't expected to notice the error" doesn't prohibit a compiler from noticing the error. A compiler can notice, but it's standard conforming even if it doesn't.

I should probably mention that when I say "the standard" I mean the C language standard; the POSIX standard may add requirements such that something undefined according to the language is well defined on POSIX.

So the standard does not require the compiler to e.g. remove the "redundant" check for null or assume that signed integers don't cycle values on overflow, but _permits_ the compiler to do so. Thus, we have two problems: a poorly thought out standard which permits dangerous compiler behavior and poorly thought out compiler that jumps at the chance.
> Under what set of logic does being able to de-reference a pointer confer that it's value is not 0 (which is what the test equates to)?

You're conflating null and zero (which C encourages you to do for various terrible reasons). The test does not test that x is not zero; it tests that x is not null (null, like zero, is falsey, but again, null is not to be mistaken for zero), which in C is sometimes represented by the character '0' but which legally can be totally distinct from the bit-pattern zero and which should be thought of as totally distinct. Zero can be a valid address in memory. Null is never a valid address in memory. The integral type with value zero when cast to a pointer is guaranteed to be the null pointer (which may have a different bit pattern than zero!). Casting non-integral types that happen to have the value zero to a pointer is not guaranteed to produce a null pointer. Confused yet?

The compiler isn't. It knows that you're testing that a pointer is not null.

Since x has already been derefrenced, and since derefrencing x has no translatable meaning if x is null, it follows that we can only produce a meaningful translation of this program iff x is not null.

It therefore follows that x must not be null in the test, since x has not changed.

I'm sorry, but that is not what the current standard, no matter its weaknesses, requires. The standard says that dereferencing a null pointer results in undefined behavior. The compiler writer chose to (a) deduce that the pointer dereference was not an oversight and (b) then produce a pointless "optimization" that compounded the error. Compiling the dereference to a move instruction is perfectly compatible with the standard.
> Under what set of logic does being able to de-reference a pointer confer that it's value is not 0 (which is what the test equates to)?

Simple: undefined behavior makes all physically possible behaviors permissible.

In reality though, such an elimination would only be correct if the compiler was able to prove that the function is ever called with NULL, and if the compiler is smart enough to do that, hopefully the compiler writers are not A-holes and will warn instead of playing silly-buggers.

This is a kind of Department of Motor Vehicles Bureaucrat thinking. For example, there as many thousands of lines of C code that reference *0, which is a perfectly good address in some environments. One should be able to depend on compilers following expressed intentions of the programmer and not making silly deductions based on counter-factual assumptions.
> This is a kind of Department of Motor Vehicles Bureaucrat thinking.

Sorry, but modern compilers are basically automatic theorem provers. They'll use whatever means necessary to get every last drop of performance. If you play cowboy with them you'll just get hurt.

> For example, there as many thousands of lines of C code that reference *0, which is a perfectly good address in some environments.

It's permissible for a particular platform to define behaviors that the standard has left undefined. If you try to take that code and run it elsewhere, that's your problem.

If you are going to do theorem proving, you should try to only prove true theorems.
If you want to phrase it like that, a compiler tries to prove conjectures and performs actions (usually code and data elimination) based on whether it can prove them or their negatives. Sometimes it can't do either.

I don't see where you're going, though.

It's easy to see the problem in the null pointer case. The compiler deduction is that the null test is redundant, but it's not actually redundant. Therefore the compiler "proves" a false theorem. That the standards rules permit the compiler to deduce false things would be, in normal engineering analysis, considered to show a failure in the rules, but the bureaucratic mindset holds that the rules are always, by definition, correct, so the failure is the fault of the miscreant who neglected to staple cover sheet properly.

If the compiler is unable to prove a transformation preserves correctness, it should not do the transformation.

To your point below: The compiler is definitely not "forced" to assume that the pointer is not null - that is a choice made by the compiler writers. Even the ridiculous standard does not require the compiler writer to make that assumption. The compiler can simply compile the code as written - or, if it is smart enough to see the problem - it can produce a warning.

This battle has been fought and lost. If you require sensible behaviour, just move on and use a language that offers it. C compilers will do what makes them look good on benchmarks, and various "friendly C" efforts have been tried and failed.
Au contraire - gcc and clang both appear to do the right thing now.
You can't though. It's always possible to re-link the objects.
You can if, for example, the function is static. Also there could be a link-time optimization pass. The linker can see all calls to that function, unless the function is exported.
You're thinking about it in the context of actual computers. The C standard says absolutely nothing about what NULL has to be, besides that the integer value 0 is considered to be the NULL address and that dereferencing it is considered invalid. The NULL address does not have to be all 0 bits. Architectures are generally free to define it to any invalid address they want to be NULL, 0 just happens to be a common and easy one. The catch you're pointing out is that on x86 there are technically no 'invalid' addresses, so we just use 0 and assume you won't ever attempt to use the stuff there (Which in practice on x86, nobody puts anything there).
>The catch you're pointing out is that on x86 there are technically no 'invalid' addresses

Depending on what you mean precisely by "x86," there is such a thing as an invalid address: the IA32e architecture (or whatever you want to call Intel's flavour of 64-bit "x86") requires that the <n> high bits of an address match, where <n> is machine-dependent.

That's a fair point - though I think you could debate whether or not the current 48-bit address space of x86-64 is part of the architecture or just an implementation detail. But in the end I don't really think it matters which you consider it (And I'd be happy to consider it to be either). All that said, you're completely right that with current x86-64, there are 64-bit values that could never be a valid address.

  Under what set of logic does being able to de-reference a pointer confer that it's value is not 0 (which is what the test equates to)?
Normal deductive logic?

  * No NULL pointer can be dereferenced. 
  * x is dereferenced.
  * Therefore, x is not a NULL pointer.
Of course, the compiler is presuming that your code is correct. That's a reasonable presumption when dealing with computer programming languages. Programming languages would be rather hard to interpret and translate--not to mention impossible to optimize--if you couldn't apply basic deductive logic to their statements.

Imagine the routine had this code, instead:

  void foo (int *x) {
    if (*x != *x) {
    {
      return;
    }
    bar();
    return;
  }
wouldn't you expect the compiler to apply the same optimizations? Or would you be upset that eliding the check broke some code that depended on a race condition somewhere else in your program?

Also, pointing out that the "value is not 0 (which is what the test equates to)" is a non-sequitur. During compilation the literal 0 can behave as a NULL pointer constant. But the machine representation of a NULL pointer does not need to be all-bits 0, and such machine still exist today. And usually, as in this case, the distinction is irrelevant. It doesn't matter that the 0th page is mappable on your hardware. What matters is that the C specification says that a NULL pointer cannot be dereferenced; that dereferencing a NULL pointer is non-sense code.

There's an argument that compilers should be careful about the optimizations they make. Not all programs are correct, and taking that presumption too far can be detrimental. But it's not always trivial to implement an optimizing compiler to "do what I say, not what I mean". Optimizations depend on the soundness of being able to apply deductive logic to a program--that is, being able to string together a series of simple predicates to reach a conclusion about program behavior. You often have to add _more_ complexity to a compiler to _not_ optimize certain syntactic constructs. Recognizing the larger construct, especially only the subset that are pathological, without optimizing the ones everybody expects to actually be optimized, can be more difficult than simply applying a series of very basic deductive rules. So it's no wonder that most compiler implementations, especially high-performance compilers, tend to push back on this front.

What would be nice is for compilers to attempt to generate diagnostics when they elide code like that. An optimizer needs to be 100% correct all the time, every time. A diagnostic can be wrong some amount of time, which means it's easier to implement and the implementation of a particular check doesn't ripple through the entire code base.

GCC and clang implement many good diagnostics. But with -Wall -Wextra they also tend to generate alot of noise. Nothing is more annoying than GCC or clang complaining about perfectly compliant code for which there's no chance of it hiding a bug. For example, I used to often wrote initializer macros like:

  #define OPTIONS_INIT(...) { .foo = 1, .bar = 3, __VA_ARGS__ }
  struct options {
    int foo;
    int bar;
  };
allowing applications to have code like:

  struct options opts = OPTIONS_INIT(.bar = 0);
But with -Wall GCC and clang will complain about the second .bar definition overriding the first. (Because the macro expands to { .foo = 1, .bar = 3, .bar = 0 }). The C specification guarantees in no unambiguous terms that the last definition of .bar wins. And presumably they guarantee that precisely to make writing such macros feasible. I've never once had a problem with unintentionally redefine a struct field in an initializer list. Yet GCC and clang are adamant about complaining. It's so annoying especially because 1) there's absolutely nothing wrong with the code and 2) disabling the warning requires a different flag for clang than for GCC.

(I realize that for such option types you usually want define the semantics so that the default, most common value is 0. But it's not always desirable, and certainly not always practical, to be able to stick that mode. And that's just one example of that construct.)

* No NULL pointer can be dereferenced.

NULL pointers CAN be dereferenced, it all depends on the environment you run on.

In standard C NULL pointers cannot be dereferenced. Full stop, there is nothing to argue about it.

There are environments that either lack memory protection and do not allow invalid pointer derereferences to be caught, which means that you can't rely on the MMU to catch mistakes. In this case either deal with silent errors or get a compiler that is able (at a cost) to sanitize any pointer access.

There are other systems in which memory address 0 is a perfectly valid address. The ABI of these systems should pick a different bit pattern for NULL pointers, but often don't so compilers sometime offer as a conforming extension extension the option to allow null pointers to be treated as valid (in effect not having null pointers at all).

There are indeed such systems.

Embedded systems (which are frequently programmed in assembly or 'C' code).

Such systems very often map a small bit of high-speed (on chip) RAM to the first few bytes of address space. I very distinctly recall such an embedded system in a college course.