Just to clarify, I'm not advocating to introduce a new `unknown` keyword. I'm saying that the existing `null` in SQL was not named properly and that the name `unknown` would have been more fitting. SQL's `null` already has the semantics of `unknown` as explained in the part of the article that I quoted.
SQL NULL is not "exactly equivalent" to unknown. E.g. in an outer join, there's nothing unknown about the result that is missing a row from one side, yet SQL semantics is to fill it with nulls.
In practice, it behaves as "unknown" in some contexts, as "missing value" in other contexts, and sometimes it's just plain WTF like SUM() returning NULL rather than 0 if there are no rows.
The term "null" comes from C. A. R. Hoare, who has proposed in November 1965 various additions to the programming language ALGOL 60, including pointers (called references by Hoare, "pointer" comes from IBM PL/I, in July 1966), and including a special value "null" for references a.k.a. pointers, for denoting unknown or not applicable values.
C. A. R. Hoare, 1965-11:
"In order to enable references to represent partial functional relationships, i.e. ones which do not necessarily yield a value, a special reference value null is introduced.
This value fails to refer to a record, and any attempt to use it to refer to a record leads to an undefined result."
In his proposal, Hoare has followed the COBOL 60 terminology for some of the additions to ALGOL, i.e. "record" instead of the word "structure", introduced by IBM PL/I in 1964-12, and "record class" instead of "structure type", hence the "class" of SIMULA 67, from where the word "class" has spread into all OOP languages.
In Hoare's proposal, references a.k.a. pointers could point only to values belonging to user-defined types, i.e. records a.k.a. structures, not to primitive types.
I wouldn't necessarily define `null` as "unknown" -- it's just "no value" -- which is really the same thing and also somewhat equivalent to "unset". But null pointers aren't unset as pointers aren't initialized to null in C and you can explicitly set a pointer to null.
E.F. Codd added nulls to relational model in 1970 so that does pre-date C. The concept is even older than that I imagine.
In nth normal form, you can't have 'no value'. That would mean your model is wrong. In academic relational data books, null does mean "unknown". There is a value, we just don't know what it is (yet).
If there might actually not be such a value, you're supposed to change your schema to reflect that.
> There is a value, we just don't know what it is (yet).
In all my years, I've never used null for that. If I don't have a value yet then generally I'm not writing any part of the record. I only ever use nulls for the absence of a value. Creating a new table for every potential optional column is the "academic relational way" and also insane. :)
What happens if your data is produced by some automated process such as a sensor reading and occasionally the sensor fails to return a value? NULL seems exactly the appropriate value to use.
You'd also have to ask when NULL came into common use in C (to which I do not know the answer). AFAIK NULL was not considered to be standard until C89. As far as I'm aware, all C compilers implement it as a #define in a standard header somewhere; it's not actually part of the core language itself.
I wonder who first added that macro? Was it there from the beginning?