by makomk 3897 days ago
The problem is that it's often perfectly clear, reasonable code on all the systems it was intended to run on. For example, on all Unix-like systems, pointer arithmetic is simply arithmetic and behaves like it. (C's predecessor didn't even have separate pointer and integer types.) So prior to compiler optimisations, this series of operations is safe and well-behaved on all architectures Linux supports even if a is NULL:

  int *b = &a->something; // pointer arithmetic, doesn't dereference a.
  if(a == NULL) return 0;
  else something_critical = a->somethingelse;
However, some non-Unix address models that Linux doesn't support don't permit pointer arithmetic on NULL pointers. So the ANSI C standards committee declared it undefined. Which means that gcc can - and eventually did - eliminate the NULL pointer check. This has resulted in privilege escalation vulnerabilities in Linux that didn't exist until gcc decided to optimise the code, some of them quite well-hidden.
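To make that concrete, here is a minimal sketch of the same logic with the NULL check moved ahead of any use of a, so there is nothing left for the compiler to reason away. The struct layout and the names widget and read_checked are made up for illustration; they are not taken from any real kernel code.

```c
#include <stddef.h>

/* Hypothetical struct standing in for whatever `a` points to above. */
struct widget {
    int something;
    int somethingelse;
};

/*
 * Returns 0 if a is NULL, otherwise copies a->somethingelse into *out
 * and returns 1. The member address is only formed after the check,
 * so no operation is ever performed on a NULL pointer.
 */
int read_checked(struct widget *a, int *out)
{
    if (a == NULL)
        return 0;

    int *b = &a->something;  /* safe: a is known to be non-NULL here */
    (void)b;                 /* unused in this sketch */

    *out = a->somethingelse;
    return 1;
}
```

For code that can't easily be reordered, gcc also has -fno-delete-null-pointer-checks, which turns off this particular optimisation; the Linux kernel build passes it.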

I understand the problem; I'm saying that it's not GCC's problem. If you don't want undefined behavior, don't put undefined behavior in your code. The code you wrote isn't clear or reasonable, because it relies on undefined behavior. It's a valid criticism that this code appears to be straightforward when it isn't, but that's not a criticism of GCC, it's a criticism of ANSI C. If you don't like it, use a better language. C was designed four decades ago, and its designers can't possibly have foreseen every problem that we've discovered in that time.
ANSI C didn't really do anything wrong here, though - they created a least-common-denominator spec of what you could reasonably expect from C across all platforms. Pointer arithmetic on NULL pointers had to be considered undefined (not just unspecified) in ANSI C, because on certain commercially-important proprietary systems it generated a hardware trap that caused the OS to kill your process. The problem is that the gcc developers insisted on actually making that code behave as undefined even though it didn't make sense to.

Also, I should note that a lot of code - particularly the Linux kernel - isn't actually using ANSI C anyway. They're using a superset of it with gcc extensions and they have a whole bunch of architecture-specific code too.

> If you don't want undefined behavior, don't put undefined behavior in your code.

I'd quip that this is statistically impossible for a sufficiently large codebase.

> it's a criticism of ANSI C. If you don't like it, use a better language.

This is my basic stance. However, if I'm e.g. in a situation where I have a C or C++ codebase I can't afford to rewrite from scratch, I'd like to use a "Better C" compiler, where "Better C" is a slightly less bad version of "ANSI C" - some undefined behavior removed, for example.

As shorthand, I'll generally refer to compilers for "Better C" as "Good C Compilers".

GCC is not trying to be a Good C Compiler. They've decided these things aren't their problem. Which is... fair. That's their choice. I do not for one minute pretend to understand that choice, however, and it gives me yet one more reason to switch to a Good C Compiler.

Good luck with that. I suspect the only good C is not C.
I don't disagree - but there's value in harm reduction, no?