Hacker News new | ask | show | jobs
by rawgabbit 523 days ago
Nulls are not necessarily distinct.

I believe this confusion is confusing the tool with the thing being measured. For simplicity, I will use the analogy of a record (stored as a row in the database) as an observation in a scientific experiment. If the tool was able to record a value, I enter a value like 579.13. If the tool was not able to record a value, the tool will enter NULL. I make a total of one hundred observations. Of one hundred rows, some have values and some are NULL.

Are NULLs distinct values? No, they are simply a failure in measurement; it is like asking if all errors are distinct or the same. Are NULLS part of the same dataset? Yes, because they are all observations for the same scientific experiment. What does it mean when "select distinct ... " returns several rows for known/measurable values and but only one row for NULL? If this is confusing, the scientist can update the rows and substitute "UNKNOWN/ERROR" for every NULL. When you do "select distinct ...", you will get the same thing. It will return several rows for known/measurable values and but only one row for "UNKNOWN/ERROR".