by CJefferson 4815 days ago
How many problems are actually caused by null (in particular, how many billions of dollars?)

While pointers that point to the wrong place for various reasons (off the end of an array, previously freed memory) cause horrible issues to this day, I can't personally remember ever having a serious issue with a null pointer (they tend to crash quickly and loudly, because dereferencing NULL segfaults on all modern OSes).

5 comments

The problem isn't just NULL in C. This post is talking about the entire Null/Nil reference problem across all languages that use a null-type value.

This is especially a problem in dynamic languages that sling nils around, like any major modern scripting language. Checking whether a value is nil before proceeding apes what a language like Haskell does when it pattern matches against Maybe (Just a, Nothing), albeit in a bad, after-the-fact way. Granted, a language that doesn't care about types before runtime can't really reflect a Maybe value in its types.
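For a rough picture of the difference, here is a minimal Java sketch that uses java.util.Optional as a stand-in for Maybe. The EMAILS table and the method names are made up for illustration, and this is only an analogue of the Haskell pattern match, not a faithful port of it.

```java
import java.util.Map;
import java.util.Optional;

class MaybeSketch {
    // Hypothetical lookup table, used only for illustration.
    static final Map<String, String> EMAILS = Map.of("alice", "alice@example.com");

    // Null-returning style: nothing in the signature tells the caller to check.
    static String emailOrNull(String user) {
        return EMAILS.get(user);
    }

    // Optional-returning style: absence is visible in the return type.
    static Optional<String> email(String user) {
        return Optional.ofNullable(EMAILS.get(user));
    }

    // The caller has to say what happens when there is no value.
    static String domain(String user) {
        return email(user)
                .map(e -> e.substring(e.indexOf('@') + 1))
                .orElse("unknown");
    }
}
```

With the Optional version, forgetting the "nothing there" case is a compile-time type mismatch rather than a runtime surprise, which is roughly the property being argued for.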

That's the main problem with calling Maybe 'better'. If you don't have static type checking then either you implement it just for Maybe or you end up with the same sorts of problems as you had before.

Making types non-Nullable by default is nice, though. Even in a dynamic language you can have a syntactic distinction to make interfaces more explicit.

True. I think it's also a covert argument for static type checking, though ;-).
Give me a statically typed Python and a type system that can handle external hardware poking around in its memory and I'm in ;)

I'm sort of a fan of Haskell already, to tell you the truth.

> Give me a statically typed Python

Not sure that's so far from Haskell, really... :-P

> and a type system that can handle external hardware poking around in its memory

Can it be in isolated places? That's doable...

> Not sure that's so far from Haskell, really... :-P

Maybe it's my lack of experience talking. Haskell feels heavier in a lot of places. I think the focus on compilation in the backend is almost a downside here, too.

> Can it be in isolated places? That's doable...

Depends on the application. There's a lot of hardware out there that does really weird stuff. There's some interesting work in Haskell-space (e.g. Atom), but I don't know how mature it is.

I actually started designing statically typed Python, since it seemed a fun exercise. Halfway in I realised I was effectively redesigning Haskell with slightly different syntax and stopped.

> they tend to crash quickly and loudly

In some scenarios, this is a serious issue all by itself. My day-to-day work is mostly on Android, and eliminating nullable references altogether would eliminate some crashes, which are highly visible to the user.

I've worked on a C++ code base that would accumulate significant amounts of

    if (argumentX == nullptr)
        return nullptr;
at the top of function bodies. It was just defensive programming. Maybe argumentX couldn't be null, but it would take time to figure that out (sometimes I did that, though). More code is harder to read and maintain, and that costs dollars.

This would also be contagious: If a piece of code checks if X is null, you'll assume that X can be null, whether or not that's true.

I'd certainly prefer to be able to reason about the code with the safe assumption that certain things cannot be null.

> I'd certainly prefer to be able to reason about the code with the safe assumption that certain things cannot be null.

You can do this, but if you want it to be maintainable, you'll also want to document this technical debt in the function comments. If you don't, someone else will come along, see your sweet method (looking only at the comments), and use it where the input can be null.

Ex:

    /**
     * This does some stuff.
     * @param entry Does something with this
     * DEBT: Assumes the input entry is not null.
     */
    void doSomething(SomeObject entry) { }

Why just put it in the documentation? Documentation is liable to drift from implementation, and AFAIK no compiler or runtime verifies the accuracy of comments. I'd feel much better about adding asserts to the original code, leaving them in for a few generations of testing and exposure, and then eventually removing the conditionals. The assert calls then function as executable documentation.

Depending on your user-base (i.e. if you distribute headers to other developers with precompiled code), the documentation may be necessary on its own... but it's much weaker than an assert.
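A minimal sketch of the "executable documentation" idea in Java, assuming a hypothetical entryLength method. It swaps in Objects.requireNonNull for a plain assert, since Java assert statements do nothing unless the JVM is started with -ea.

```java
import java.util.Objects;

class AssertSketch {
    /**
     * Returns the length of the entry.
     * @param entry must not be null (enforced below, not just documented)
     */
    static int entryLength(String entry) {
        // Fails right here with a clear message if the contract is broken.
        Objects.requireNonNull(entry, "entry must not be null");
        return entry.length();
    }
}
```

The comment can still drift, but the check cannot: a caller passing null gets a NullPointerException at the call boundary, naming the parameter.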

Java has supported @Nullable and @Nonnull for a while. It's pretty standard in good Java code these days, and the IDEs will perform static analysis to make sure you use these annotations consistently (e.g. warn you if you are checking a @Nonnull against null or if you are forgetting to test a @Nullable against null before dereferencing it).

Is that C++? Just use a reference if you want a pointer which asserts it can't be null.

It's Java ;P I haven't used C++ in a long time. But you are correct, a reference does exactly what he wants.

For anyone who needs to look it up like me: http://en.wikipedia.org/wiki/Reference_(C%2B%2B)

Except the reference can itself be null, which is still problematic.

Why do you consider this a debt? It's pretty much the only sane thing to do.

If people want to pass lazy NullPointerExceptions, it's their fault.

I've worked on a lot of legacy Java code where these kinds of null checks cause problems. They end up confusing things in corner cases.

You should verify arguments at a top level, then let the code underneath blow up with a NullPointerException if something unexpected happened. Your stack trace will point at where the problem lies.

I find for some reason that a lot of people want to use null instead of empty lists where you'll end up with this ugliness:

    if (list != null) {
        for (Thing t: list) {
            processThing(t);
        }
    }
instead of just passing in Collections.emptyList().
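A small Java sketch of the empty-list convention, assuming hypothetical findThings and totalLength methods. The point is only that the consumer needs no null check when the producer never returns null.

```java
import java.util.Collections;
import java.util.List;

class EmptyListSketch {
    // Hypothetical producer: returns an empty list, never null, when there is nothing.
    static List<String> findThings(boolean any) {
        return any ? List.of("a", "bc") : Collections.emptyList();
    }

    // Consumer: the loop simply runs zero times for an empty list.
    static int totalLength(List<String> things) {
        int total = 0;
        for (String t : things) {
            total += t.length();
        }
        return total;
    }
}
```

Collections.emptyList() returns an immutable list, so this works best when callers only read the result.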
I agree with one exception: if you're storing the value passed down into a structure which will survive the call, then the value should be checked for null before being placed in the structure. This is because the eventual null pointer exception may occur long after it was put in the structure, obscuring how it got there.
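As a sketch of that exception, here is a hypothetical Registry class in Java that checks a value at the moment it is stored, so the stack trace points at the caller that supplied the null rather than at whoever reads the structure later.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class Registry {
    private final List<String> names = new ArrayList<>();

    // Reject null when the value enters the structure, not when it is read back.
    void register(String name) {
        names.add(Objects.requireNonNull(name, "name must not be null"));
    }

    // Safe to dereference every element, because register() never stores null.
    int totalLength() {
        int total = 0;
        for (String n : names) {
            total += n.length();
        }
        return total;
    }
}
```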
Even if it can't be null now, the calling code may change in the future. And did you check all the possible error conditions (e.g. failed allocation)? If you are relying on a non-null value, it's best to check every time.

I don't know about it in terms of money loss, but generally having missed checks caught at compile time rather than having the program crash is a good thing.

Not to mention lost time spent hunting for a hidden nil value.
There used to be a whole class of Linux vulns that involved mmap()ing memory at virtual address 0x0, filling it with fake kernel data structures containing some data value val and some pointer ptr, and then triggering a NULL dereference in kernel code that was known to parse this structure and copy val to the address pointed to by ptr.

They had to "fix" it by blocking userspace memory mappings at the 0th page.