Hacker News new | ask | show | jobs
by gpderetta 1306 days ago
But again, the compiler did not find UB through static analysis. The compiler inferred that the pointer could not be null and removed a redundant check.

For example you would you not expect a compiler to remove a redundant bound check if it can infer that an index can't be out of range?

1 comments

The compiler made a dangerous assumption that the standard permits ("the author surely has guaranteed, through means I can't analyze, that this pointer will never be null").

Then it encountered evidence explicitly contradicting that assumption (a meaningless null check), and it handled it not by changing its assumption, but by quietly removing the evidence.

> For example you would you not expect a compiler to remove a redundant bound check if it can infer that an index can't be out of range?

If it can infer it from actually good evidence, sure. But using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence that only the C standard could come up with.

> using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence

Do you think the compiler would be right to remove the second check here?

   if (!x) std::abort();
   if (!x) return;
   ... = *x;
What about changing std::abort with the following?

   [[noreturn]] void my_abort();
How's that different form a check after dereferencing a pointer? In both cases the check can be removed because dataflow or control flow analysis.

What if my_abort returns instead? Or another thread changes x after the fact?

If I had written the above code, I had clearly done something wrong. I would not want the compiler to remove the second check. I'd want it to (at the very least) warn me about an unreachable return statement, so that I could remove the actual meaningless code.

It's been long enough since I wrote C that I'm not familiar with that noreturn syntax or the contract I guess it implies, but control flow analysis which can prove the code will never be run, should all ideally warn me about it so that I can remove it in the source code, not quietly remove it from the object code.

I'm not demanding that it should happen in every case, but the cases where it's undecidable whether a statement is reachable or not, obviously it's undecidable for purposes of optimizing away the statement too.

The first check might be in a completely different function in another module (for example a postcondition check before a return). Removing dead code is completely normal and desirable, warning every time it happens would be completely pointless and wrong.

   if (!x) std::abort();
   if (!x) return;
In this case, the compiler should warn that the second statement will never be executed, instead of just silently removing it.

   int *x = libX_foo();
   if (!x) {
      return;
   }
   ...
libX_foo from libX gets at some point updated to abort if the return value would be null. After interprocedural analysis (possibly during LTO) the compiler infers that the if statement is redundant.

Should the compiler complain? Should you remove the check?

Consider that libX_foo returning not-null might not be part of the contract and just an implementation detail of this version.

> Should the compiler complain? Should you remove the check?

Yes and yes.

> Consider that libX_foo returning not-null might not be part of the contract and just an implementation detail of this version.

How is it an “implementation detail” whether a procedure can return null? That's always an important part of its interface.

> How is it an “implementation detail” whether a procedure can return null? That's always an important part of its interface.

In gpderetta's example, the interface contract for that function says "it can return null" (which is why the calling code has to check for null). The implementation for this particular version of the libX code, however, never returns null. That is, when the calling code is linked together with that particular version of the libX interface, and the compiler can see both the caller and the implementation (due to link-time optimization or similar), it can remove the null check in the caller. But it shouldn't complain, because the null check is correct, and will be used when the program is linked with a different version of the libX code which happens to be able to return null.

For a more concrete example: libX_foo is a function which does some calculations, and allocates temporary memory for these calculations, and this temporary allocation can fail. A later version of libX_foo changes the code so it no longer needs a temporary memory allocation, so it no longer can fail.

And LTO is not even necessary. It could be an inline function defined in a header coming from libX (this kind of thing is very common in C++ with template-heavy code). The program still cannot assume a particular version of libX, so it still needs the null check, even though in some versions of libX the compiler will remove it.

The contract is that libX_foo can return null. But a specific implementation might not. Now you need to remove the caller side check to shut up the compiler which will leave you exposed to a future update making full use of the contract.

Also consider code that call libX_foo via a pointer. After specialization the compiler might see that the check is redundant, but you can't remove the check because the function might still be called with other function pointers making full use of the contract.