Hacker News new | ask | show | jobs
by jerf 1025 days ago
SQL NULL has to be the quintessential example of why programming can't just blindly follow math. The NULL truth table makes all the sense in the world in a mathematical context.. and it is a total pain to program with. Letting NULL == NULL be "true" and NULL == anything else be "false" might be slightly less principled, but it would be soooo much safer and have prevent so very many bugs.

(And it's not like that's some sort of mathematical nonsense either. It makes perfect sense to not create a special value that represent "outside my universe of values" and then refuse to equate any two of them. It makes perfect sense to declare a distinguished "outside my universe of values" and let it be equal to any other such value on the grounds that it is mathematically impossible to witness any differences between them anyhow so within the system they are indeed equal.)

1 comments

Let's say we're in a situation like this:

    create table people (
        name text primary key,
        age integer
    );
    insert into people (name, age) values ('Bob', 50), ('Alice', null), ('Mallory', null);
We know Bob is 50, but we don't know the age of Alice or Mallory.

Are Bob and Alice the same age? We don't know.

Are Alice and Mallory the same age? We don't know.

If `null` were equal to `null,` we couldn't express missing information like this. Naive queries would still have bugs and conclude that Alice and Mallory were the same age.

Null is a property of living in an uncertain universe. When you start taking measurements of the real world (which every production business process is), you run into the many faces of null.

For example, I used to take data in a lab. Often we'd measure three replica samples. Sometimes I'd spill one of them. Scientific ethics demand I still write down the two measurements I did collect, even if I go get three more samples. For the third, I'd write the pen and paper equivalent of null (striking out that cell in the table).

Were any two of those missing measurements equal? Does a dog have Buddha nature?

I think the mistake here was the name chosen. The NaN concept from floating point has an identical definition and behavior, so calling it something like NaV (not a value) or undefined (similar but not identical behavior in JS) instead might have caused less confusion vs NULL typically having a definition of an implicit Option<T>
I addressed all of that in advance of your posting. We live with the equivalent `null == null` in plenty of other contexts, and not only does it not blow up, it in fact works better and has fewer bugs. It isn't even as if math demands this; it is merely one choice and one that is observably a mistake.
I disagree that it's an arbitrary choice, it's asserting statements as true which we don't know anything about. I also disagree that there is such a distinction between a "mathematical context" and a "programming context" here - what we're talking about is formal logic (rather than proper math), and programming is an application of formal logic. Indeed, that's why we're here in the first place, to discuss how any() and all() arise from set theory and predicate logic.

I don't know what systems you're referring to, but like I mentioned you will get bugs if you naively use either approach. Eg in Rust, Option<T>::None will compare equal, because Rust doesn't have a null concept and that's how Rust enums work. But if you were using None to represent missing information, you'd want to use pattern matching rather than the built in equality. (Missing information is a much less salient concept in Rust than SQL, and Rust very deliberately wanted to avoid having a language-level null concept, so I think both of them made the right choices for their problem domains. Though I agree with this comment[1].)

But I'm getting the vibe that we're not going to convince one another, which is fine, this is an area where reasonable people disagree.

[1] https://news.ycombinator.com/item?id=37271039

> and that's how Rust enums work

It shows that I haven't written Rust in a few months, that's not how Rust enums work, that's how the derive macro for PartialEq/Eq works. D'oh!