Hacker News
by inopinatus 3234 days ago
The more experienced a developer I become, the more strongly (and negatively) I feel about nils and nulls and their ilk. I have sympathy for C. A. R. Hoare who in 2009 apologised for the apparent invention of null references in ALGOL W (1965), calling them a "billion-dollar mistake". I've come to regard them as a data singularity, and when I design data structures and interfaces today I am deliberately avoiding/outlawing them; all my relational fields are NOT NULL and I choose either meaningful defaults, or EAV or equivalents instead; in method parameters I would rather something not exist than for it to accept a null reference or value. And I believe that the resulting code is more modular, more easily refactored and more reusable as a result, errors are better handled, and the resulting data structures and calling arguments are more easily interpreted, more readily queried and destructured, and are (so far) proving generally better fitted to real-world domains.
3 comments

Funny, I'm the opposite. The more experienced I've become, the more I've found that nil-punning is ultimately what I actually wanted.

And I'm all for the idea that relational fields should be NOT NULL. I also fear that this doesn't really work for backwards compatible thinking. If I serialized some data down to disk before a field existed, I don't expect it to be there when I check it later.

You can be tempted to think it should just be the zero value of the type you are using. Or you can add some extra boilerplate around accessing. I think either works. Just make sure you aren't getting carried away. And try to handle anything that cares about the absence or presence of something at the layer where you get that something. Don't punt the decision down your codebase.

(That is, Optionals are great at that layer; don't pass them as parameters to inner code, though. Obviously, YMMV. And, quite frankly, yours probably will go further than mine.)
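
To make the "decide at the layer where you get it" idea concrete, here is a minimal Python sketch. The record shape, the `nickname` field, and the empty-string default are all made up for illustration; the point is only that the loading function is the one place that sees a possibly absent field, and the inner code receives plain values.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_NICKNAME = ""  # hypothetical default for records written before the field existed


@dataclass
class User:
    name: str
    nickname: str  # never None once past the boundary


def load_user(raw: dict) -> User:
    """Boundary layer: the only place that knows the field may be absent."""
    nickname: Optional[str] = raw.get("nickname")  # missing in old records
    if nickname is None:
        nickname = DEFAULT_NICKNAME
    return User(name=raw["name"], nickname=nickname)


def greeting(user: User) -> str:
    """Inner code: takes plain values and never checks for None."""
    return f"Hello, {user.nickname or user.name}!"
```

An old record such as `{"name": "Ada"}` loads fine, and `greeting` never has to know the field was once absent.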

Agree completely. Google removed "required" fields from proto3 because they cause problems for compatibility and version skew. And even in proto2, which had "required" fields, people quickly learned to avoid them. Anything that goes on the disk or wire should have only "optional" and "repeated" fields (as a bonus, "optional" is encoded the same as "repeated" with zero or one values).
I'm all for the idea that relational fields should be NOT NULL

What if the data is actually missing? How else do you record that information?

You use the default empty value, and have an extra field for missingness. Then you have real type safety.
Real type safety is sum types. If I need to express something that is present or missing, I should use a Maybe monad.

Having an extra field for "missingness" is less safe because the type system won't enforce that it is either missing or set, you could have it set to a value but marked as missing which is still ambiguous.
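
A rough Python sketch of the difference, with hypothetical names throughout. Python only approximates sum types (it needs a static checker such as mypy to enforce the union), so treat this as an illustration of the shape rather than a guarantee.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class FlaggedAge:
    """Flag-plus-value encoding: four combinations, only two intended."""
    is_missing: bool
    value: int


@dataclass(frozen=True)
class Missing:
    """The 'absent' case carries no value at all."""


@dataclass(frozen=True)
class Present:
    """The 'present' case always carries a value."""
    value: int


Age = Union[Missing, Present]  # a poor man's Maybe


def describe(age: Age) -> str:
    if isinstance(age, Present):
        return f"age is {age.value}"
    return "age is unknown"


# Nothing stops this contradictory record from being built:
contradictory = FlaggedAge(is_missing=True, value=42)
```

With `Missing`/`Present` there is simply no way to write "missing, but also 42"; with the flag there is.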

That's a "cure" worse than the disease.
If missing data is a valid value, pick a valid way to encode it. Null might work, but realize you could have two reasons for the value: actually missing, or just not collected or recorded.

I concede there may be no difference in those meanings.

My problem with null is that it doesn't nest. For example, if I do a DB search for a particular column of a particular row and get null, does that mean the row doesn't exist, or it does but the column is empty? With optionals, you can distinguish between these with e.g. `Nothing` vs `Just Nothing`.
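
The nesting problem can be sketched in Python, assuming a toy in-memory "table" and a hypothetical `Just` wrapper standing in for a real optional type. Here `None` plays the role of `Nothing` and `Just(None)` the role of `Just Nothing`.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Just(Generic[T]):
    """Explicit 'present' wrapper, so that presence can nest."""
    value: T


# Hypothetical table: row id -> value of one nullable column.
rows = {1: "alice", 2: None}


def lookup_flat(row_id: int) -> Optional[str]:
    # None means "no such row" OR "row exists, column is null".
    return rows.get(row_id)


def lookup_nested(row_id: int) -> Optional[Just[Optional[str]]]:
    # None means "no such row"; Just(None) means "row exists, column is null".
    if row_id not in rows:
        return None
    return Just(rows[row_id])
```

`lookup_flat(2)` and `lookup_flat(3)` are indistinguishable, while `lookup_nested` keeps the two cases apart.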
In databases NULL does exist; it is an explicit statement of having no contained value. (There is a container here, but its contents were not specified. That is a distinct statement from /knowing/ the contents to be empty: zero, zero-length string, etc.)

Conceptually NULL or nil is an appropriate concept for results that have no meaning, such as if an error occurred or if a passed value is not required or valid. (Though some structures can contain data that is 'incomplete' or 'not checked' and thus while a valid structure might not be 'validated' in the sense of conforming to a more specific set of expectations.)

kind of like having

field_is_set = true, field = "123"

field_is_set = false, field = 0 (some zero value or uninitialized value)

Sort of... and it seems to me that Go wants you to think this way about values within a struct (i.e. the "zero" values).

But isn't that a really clunky way of checking whether field is set? You can't just check field because the "zero value" could be a legit value, e.g. zero. So you have to first check field_is_set -- and now you have to make sure that's always correct and that nobody ever sets field by itself.

Or worse having to inspect a specific field value (or worse compare the whole thing to a reference object) to determine if the result is actually valid or not.

The question I ask in these circumstances: Will a method/function //always// return valid work if the program continues to run?

How about a 'find' function of some type? Find the nth thing, find matches of X, etc.

That's one type of function that might return no answer.

If a list or set of some sort is expected I'm happy with a zero-length list in this case. However lists aren't the only time this happens. The most recent example to come to my mind is finding the Nth item in an arbitrary sequence. That item might be out of bounds (not exist). Nil is appropriate for that case.
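
For what it's worth, a tiny Python sketch of that "Nth item or nil" shape. The function name `nth` is my own, and it assumes a non-negative index (`islice` rejects negative ones).

```python
from itertools import islice
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def nth(items: Iterable[T], n: int) -> Optional[T]:
    """Return the item at zero-based position n, or None if out of bounds."""
    # islice skips the first n items; next() falls back to None when exhausted.
    return next(islice(items, n, None), None)
```

One caveat: if the sequence itself can contain `None`, the caller can no longer tell "out of bounds" from "found a None".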

Can't that be determined by looking at the row count? A row count of zero means it doesn't exist. A row count of 1 with a null in the selected column means the row exists and the column is null.
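
Here is how that looks with Python's built-in `sqlite3` module, using a made-up `people` table. At the driver level, "no row" and "row with a NULL column" do come back as different shapes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, nickname TEXT)")
conn.execute("INSERT INTO people (id, nickname) VALUES (1, NULL)")


def fetch_nickname(person_id: int):
    """Return the raw fetchone() result for one row's nickname column."""
    cur = conn.execute(
        "SELECT nickname FROM people WHERE id = ?", (person_id,)
    )
    return cur.fetchone()


existing = fetch_nickname(1)   # row exists, column is NULL -> (None,)
absent = fetch_nickname(99)    # no such row -> None
```

The distinction survives only as long as the caller keeps the row tuple; unwrap it to a bare value too early and both cases collapse to `None`.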

I avoid the word "empty" when referring to anything SQL related, as it is ambiguous in three-valued logic.

SQL syntax, at least what I'm familiar with offhand, makes you explicitly say things such as WHERE (a.cola = b.colb OR a.cola IS NULL OR b.colb IS NULL) or similar syntax but with distinct variations for joining 'left' and 'right' tables on an expression (which, BTW, can be noticeably slower than the WHERE version, depending on which database you're using).
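
A quick way to see why the explicit IS NULL checks are needed, sketched with Python's `sqlite3` (other databases should behave the same way for these two expressions, but I have only assumed SQLite here): comparing NULL with `=` yields NULL rather than true.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# "NULL = NULL" is unknown (NULL), so a WHERE clause would reject the row;
# "NULL IS NULL" is true, which SQLite reports as the integer 1.
equals_result, is_null_result = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL"
).fetchone()
```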
I think they're referring to outer-joined tables.

SELECT a.id, b.name FROM a LEFT JOIN b ON a.id = b.id

If you get a NULL in the name field, you don't know if that's because there's no record in b for that id, or if there is a record in b for that id but it has a NULL name value. Sometimes that difference will be important.

While admittedly this could be seen as a mistake in SQL, you can differentiate by looking at whether b.id is NULL or not.
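
A small runnable sketch of that trick using Python's `sqlite3`, with toy tables `a` and `b` matching the query above (the data and the `b_row_exists` alias are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER PRIMARY KEY);
    CREATE TABLE b (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO a (id) VALUES (1), (2), (3);
    INSERT INTO b (id, name) VALUES (1, 'one'), (2, NULL);
""")

# b.id is the join key, so it is NULL only when no row in b matched.
rows = conn.execute("""
    SELECT a.id, b.name, b.id IS NOT NULL AS b_row_exists
    FROM a LEFT JOIN b ON a.id = b.id
    ORDER BY a.id
""").fetchall()
```

Ids 2 and 3 both come back with a NULL name, but only id 2 has `b_row_exists` set.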
true, but that's not the point the OP was making, I think :)
NOT NULL bugs me too, but not so much because nulls are possible, more so that I think it should be inverted since that's the common case (at least for me).
Right! It should have been called NULLABLE and default to false.