Hacker News new | ask | show | jobs
by turboponyy 1487 days ago
You should disallow nulls as much as possible, but "" is just as valid of a string value as "John Smith"; if the field is actually semantically nullable, reflect that in the type - don't use arbitrary "blank" values to denote nulls.
1 comments

A string being empty oftentimes isn’t semantically different from the value being absent. I’d argue that, this being the case, the type should also be able to reflect whether the string can be empty or not (or all-whitespace or not, etc.).
I completely agree, but conventional database systems don't support dependent typing unfortunately. Given the constraints, the best you can do in this situation whilst remaining semantically consistent in the type signatures is to leave the field nullable and enforce value checks (e.g. can't be blank).
Database systems support constraint checks. Given that SQL isn’t statically typed, that’s almost the best you can do. The only lack is that you usually can’t bind constraints to a user-defined type name, so you have to repeat the constraint definition for each relevant column/table. But on the programming language side, dependent types (or something equivalent) would be appropriate, and I expect will become common practice at some point in the future.
I think we may have been talking past each other - I did mean SQL's CHECK() et al with "value checks". I agree with everything you've said.
Right, I missed “remaining semantically consistent in the type signatures”. Assuming you mean correspondence between the DDL schema definition and the PL type signature, I see no need to do so, as long as both can be generated from the same definition, or one from the other. That is, the scenario were dependent types are used on the PL side and constraint checks on the database side seems perfectly fine to me (given the limitations of SQL). I’m assuming that you meant value checks on the PL side if you didn’t mean checks on the DB side.
By "type definition" I did actually mean the database schema. I don't particularly care what happens at the boundaries, and I think translating a database schema into suitable types at the programming language level makes sense (e.g. dependent types). I was just describing how it would make most sense in my mind to design the database schema in isolation with other concerns in order to remain most semantically correct.
This is almost always true but somehow often overlooked. There's almost always no reason to rely on "null" to mean "empty". A carefully chosen default value often does the job.