Hacker News
by Animats 646 days ago
Well, where have we had trouble in C in the past? Usually, with de-referencing null pointers. The classic is

   char* p = 0;
   char c = *p;
   if (p) {
      ...
   }
Some compilers will observe that de-referencing p implies that p is non-null. Therefore, the test for (p) is unnecessary and can be optimized out. The if-clause is then executed unconditionally, leading to trouble.

The program is wrong. De-referencing a null pointer is undefined behavior in C, so the compiler is allowed to assume it never happens. On some hardware, you can't de-reference address 0 and the program will abort at "*p". But many machines (e.g. x86) let you de-reference 0 without a trap. This one has caught the Linux kernel devs at least once.
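For comparison, here is a minimal sketch of the same read with the test placed before the de-reference. The name first_char_or and its fallback parameter are made up for illustration. Because nothing touches *p ahead of the test, the compiler has no earlier de-reference from which to infer that p is non-null.

```c
#include <stddef.h>

/* Hypothetical helper: returns the first byte of p, or fallback when p is
   null. The null test comes before any de-reference, so the check cannot
   be dropped on the grounds that p was already used. */
char first_char_or(const char *p, char fallback)
{
    if (p == NULL) {
        return fallback;
    }
    return *p;
}
```

Calling first_char_or(NULL, '?') takes the fallback path instead of reading address 0.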

From a compiler point of view, inferring that some pointers are valid is useful as an optimization. C lacks a notation for non-null pointers. In theory, C++ references should never be null, but there are some people who think they're cool and force a null into a reference.

Rust, of course, has

    Option<&Foo>
with unambiguous semantics. This is often implemented with a zero pointer indicating None, but the user doesn't see that.

So, what else? Use after free? In C++, the compiler knows that "delete" should make the memory go away. But that doesn't kill the variable in that scope. It's still possible to reference a gone object. This is common in some old C code, where something is accessed after "free". This is CWE-416, "Use After Free", in the Common Weakness Enumeration.[1]

Not a problem in Rust, or any GC language.
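In C, one common partial mitigation is to null the pointer at the moment it is freed. A rough sketch, assuming a helper called free_and_null (a made-up name, not a standard function):

```c
#include <stdlib.h>

/* Hypothetical helper, not part of the C library: frees *pp and nulls the
   caller's pointer, so a later "if (p)" test sees that the object is gone. */
void free_and_null(char **pp)
{
    if (pp != NULL) {
        free(*pp);   /* free(NULL) is a no-op, so a repeated call is harmless */
        *pp = NULL;
    }
}
```

This only clears the one variable you pass in. Any other copy of the pointer still dangles, which is the part Rust's ownership rules or a GC take care of.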

Over-optimization in benchmarks can be amusing.

   for (i=0; i<100000000; i++) {}
will be removed by many compilers today. If the loop body is identical every time, it might only be done once. This is usually not a cause of bad program behavior. The program isn't wrong, just pointless.
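If you actually want the loop to run in a benchmark, one usual trick is to route the work through a volatile object. A small sketch, with spin and sink as made-up names:

```c
/* Hypothetical benchmark helper: counts to n through a volatile variable.
   Every access to a volatile object is an observable side effect, so the
   compiler has to perform each increment instead of deleting the loop. */
unsigned long spin(unsigned long n)
{
    volatile unsigned long sink = 0;
    for (unsigned long i = 0; i < n; i++) {
        sink = sink + 1;
    }
    return sink;
}
```

The volatile accesses add their own cost, so this measures something a bit different from the bare loop. It just guarantees the loop isn't optimized away.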

What else is a legit problem?

[1] https://cwe.mitre.org/data/definitions/416.html

2 comments

I can't see what is 'undefined' here. I would expect the program to read the first byte of memory and test if it is 0 or not. If I was writing this in assembly for an MCU, I would write exactly the same code in the target instructions.

There may be many environments where this would be invalid, but why would the compiler optimise this out based on, say, the operating system, if it is valid code?

> This is often implemented with a zero pointer indicating None, but the user doesn't see that.

The Guaranteed Niche Optimisation is, as its name suggests, guaranteed by the Rust language. That is, Option<&T> is guaranteed to be the same size as &T. The choice for the niche to be the all-zero bit representation is in some sense arbitrary but I believe it is a written promise too.