Let's say we're in a situation like this: create table people (
name text primary key,
age integer
);
insert into people (name, age) values ('Bob', 50), ('Alice', null), ('Mallory', null);
We know Bob is 50, but we don't know the age of Alice or Mallory.Are Bob and Alice the same age? We don't know. Are Alice and Mallory the same age? We don't know. If `null` were equal to `null,` we couldn't express missing information like this. Naive queries would still have bugs and conclude that Alice and Mallory were the same age. Null is a property of living in an uncertain universe. When you start taking measurements of the real world (which every production business process is), you run into the many faces of null. For example, I used to take data in a lab. Often we'd measure three replica samples. Sometimes I'd spill one of them. Scientific ethics demand I still write down the two measurements I did collect, even if I go get three more samples. For the third, I'd write the pen and paper equivalent of null (striking out that cell in the table). Were any two of those missing measurements equal? Does a dog have Buddha nature? |