| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by davidverhasselt 1880 days ago

There is a downside which I've experienced: if you want a triple-state boolean (null, false, true) then having a boolean column allows for that while a timestamp-as-boolean column does not (you lose the "null" value because that equals `false` in timestamp-as-boolean).

Having a distinction between `null` and `false` can be handy for values that are optional or have a dynamic default. If it's `null` you know it is not explicitly set and could use a fallback value. If it's false you know it's explicitly set as false.

A simple use-case for this is when a user can leave the field blank. This is impossible to model with only a timestamp-as-boolean.

Another use-case is dynamic defaults or fallbacks, e.g. `hidden` of a folder where if `hidden` is nil, you fall back to the parent folder's value.

TL;DR a boolean column actually has 3 states, a timestamp only has 2. Article makes a big deal about there not being any nuance about the fact that a timestamp is superior. I disagree, because you go from 3 states to 2 states, there are cases where you'd want a boolean instead of a timestamp. Ironically OP missed this nuance (or they'll pull a no-true-scotsman).

16 comments

alpaca128 1880 days ago

What you describe calls for an Enum, not a nullable boolean which is just another way of passing a hardcoded magic value carrying an implicit meaning.

> a boolean column actually has 3 states, a timestamp only has 2.

Going with your logic a timestamp has billions of states, you just have to arbitrarily assign special meanings to certain dates that won't ever be used. Just like using null as another state I wouldn't call it a good idea, though.

hallway_monitor 1880 days ago

Your statement is false. Many use cases call for optional bools. The great thing is there is no implicit meaning - it's True, False or Unset. It is true that nulls can be abused but in this case, far more elegant than an enum.

alpaca128 1880 days ago

There is no way for you to make sure that `null` was "set" intentionally and not due to a bug or other kind of failure.

Meanwhile if you see a `None` value you know for a fact that it was set by your software and if you actually encounter a `null` you know that something went horribly wrong.

Strong type systems and a few overheads in favor of better bug detection/prevention are popular for a reason.

samhh 1880 days ago

There are many application states in which `null` could mean "not got the data yet" and/or "there is no data".

Just use an enum, it's much more expressive.

Dah00n 1880 days ago

Could you supply an example that an enum wouldn't do as well? How is a bool null, false, true different than an enum with null, false, true?

Edit: grammar

samhh 1879 days ago

Logically they're the same, they're isomorphic. It's just about ergonomics, readability, etc.

ThinkBeat 1880 days ago

If you go with nullable Boolean in the database schema, do you give it a type of nullable boolean in the code?

In which case null would either mean the value on the recordin the database is null, or that that for whatever reasons that parameter was not read from the database.

You will need two types of null.

All of this goes away with an enum or similar solution.

You can decompose the nullable boolean in the database when you read it , but again your task would be easier to treat it like an enum.

swaggyBoatswain 1877 days ago

I use nulls or undefined in javascript as the initial state for most values for this reason, it's explicit in that it's unset

Ditto for database design too

andix 1880 days ago

Tri-state booleans are also very ugly. What happens if you suddenly need a 4th state?

Use enums (or any equivalent) for states that are non-boolean.

jimktrains2 1880 days ago

Null is a signifier that a value isn't know. Yes, if you're using null to mean something else you should use another type, but nullable-booleans (and nullable fields in general) can be extremely useful. For instance, if you're collecting information, but wont know everything at creation time, the unknown values are null.

chriswarbo 1880 days ago

> For instance, if you're collecting information, but wont know everything at creation time, the unknown values are null.

I find the idea of 'creating unknown values' to be self-contradictory. I find it much more logical to define a PartialFoo containing only the parts we know up-front, and a fill in the rest later using a function PartialFoo -> Foo.

If we need multiple steps, we can put some Optional fields in the PartialFoo, to avoid lots of intermediate types. This is still annoying, but it keeps all of the 'intialisation headaches' separate from the Foo type itself; so code dealing with Foo doesn't have to care about null checks, missing fields, etc.

jimktrains2 1880 days ago

> If we need multiple steps, we can put some Optional fields in the PartialFoo, to avoid lots of intermediate types.

You juat described the purpose of SQL's NULL.

hallway_monitor 1880 days ago

If you are using a nullable bool, you do not need a fourth state. True, False, Unset. Very useful in a lot of cases. It's actually not ugly but very elegant.

m_mueller 1880 days ago

I don't think it's elegant at all. Especially in SQL dialects where you need to check null state separately from false. So much bug potential...

Spivak 1880 days ago

With an enum you’d have to check the unset state separate as well.

crocarneiro 1880 days ago

He meant that in some SQL Dialects you cannot check FIELD = NULL, you have to use FIELD IS (NOT) NULL (like Oracle). Therefore you cannot use FIELD inside an IN clause either.

andix 1879 days ago

But with an (NOT NULL) enum/integer/string or however you are representing it, every state is a regular case. Not a special case. Null nearly always needs some special treatment.

cdirkx 1880 days ago

The obvious solution is to encode it like this: `null` is `null, `yes` is the timestamp when it was set and `false` is a timestamp somewhere in the future. Now t < now() maps nicely to the value of your boolean.

You just have to pick your `false` timestamp somewhere far into the future, let's say something arbitrary like 03:14:07 on Tuesday, 19 January 2038. The software won't be around for that long anyway, so it will never be a problem...

davidverhasselt 1880 days ago

Hehe, that's creative. But I'd say - if you're making it that complex, just use a boolean column and a timestamp column separately instead, if you really need the timestamp, and make it explicit and simple to understand.

josho 1880 days ago

And this is why legacy code is such a nightmare. Compound unnecessary optimizations like this over a decade and it’s impossible to understand the data in the DB or how to safely refactor the code without breaking some weird corner case that was handled by using magic values instead of structured data.

gpvos 1880 days ago

Better use something so far in the past that the software certainly didn't exist yet, like 3 January 1970. In fact, you could use all of, say, the '70s to encode lots of flags... gets excited

Supermancho 1880 days ago

Store your timestamps in a long int or string.

Positive = timestamp

-1 = false

0 = null

thirdlamp 1880 days ago

I can’t tell if you’re being ironic or not

cdirkx 1880 days ago

Yeah it's mostly a reference to Y2K and Y2038 [1], using representations that seem clever and work now, but will lead to bugs in the future because nobody thinks their software will be around for that long.

[1] https://en.wikipedia.org/wiki/Year_2038_problem

> The latest time since 1 January 1970 that can be stored using a signed 32-bit integer is 03:14:07 on Tuesday, 19 January 2038

> MySQL database's built-in functions like UNIX_TIMESTAMP() will return 0 after 03:14:07 UTC on 19 January 2038

notatoad 1880 days ago

OP missed that nuance, and so will a lot of other developers you work with. Relying on a difference between null and false is one of those things that seems clever when you first do it, but then three years later your project is scattered with comments like "// be very careful checking this, it must be false here null means something else" after you or some other dev you're working with confuses falsy for false

cratermoon 1880 days ago

I've been in software development for 27 years, and I've never seen a case where a boolean column or variable being null was anything but a terrible, confusing, fragile, hack.

Remember that in some languages the behavior of null is weird, and can be false, or can be treated as 0.

    $ node
    Welcome to Node.js v14.16.0.
    Type ".help" for more information.
    > null + null
    0
    >
    > 0 == false
    true
    > 0 === false
    false
    >

secondcoming 1880 days ago

What has the ridiculousness of Javascript got to do with a database schema?

cratermoon 1880 days ago

What good is a database schema if it doesn't have data and isn't used by any applications?

cutemonster 1880 days ago

There are languages with nice optional types, which maps nicely to SQL null. Not Javascript though and I wouldn't want to use it server side.

doodpants 1880 days ago

Long time readers of The Daily WTF know that the canonical tri-state boolean is { True, False, FileNotFound } [1]

[1] https://thedailywtf.com/articles/What_Is_Truth_0x3f_

timzaman 1880 days ago

"triple state boolean", what's next, a double precision fp32?

tuukkah 1880 days ago

Also known as *bool, "Maybe Bool", "Boolean?", "Optional<Boolean>" etc.

isbvhodnvemrwvn 1880 days ago

Well, in browsers there's `HTMLMediaElement.canPlayType()` which returns one of the following strings:

- "probably"

- "maybe"

- ""

Reference: https://developer.mozilla.org/en-US/docs/Web/API/HTMLMediaEl...

skytreader 1880 days ago

My goodness. I remember having to explain this to my manager. They really wanted recording + playback in our platform. For as many browsers as possible. Even mobile (iOS, I'm looking daggers at you). In 2015.

AFAIR I just said fuck it and made playback as permissive as possible (i.e., only prevent media playback if canPlayType returned ""). I don't know how advisable that is but the bug got off my back anyway.

I dunno what makes this so difficult, why we can't get at least a definite "yes" even in 2021.

PebblesRox 1880 days ago

That's like my washing machine which has a Normal setting, a Standard setting, a Wash setting...

themulticaster 1880 days ago

Do you mean as in "If the washing machine has a Wash setting, what does it do if it is not set to Wash?"

isbvhodnvemrwvn 1880 days ago

And what's the difference between normal and standard?

ExtraE 1880 days ago

Wat

watt 1880 days ago

yeah, but if you really need to store 3 states, why not make 1 first-class concept instead of awkwardly combining two?

Spivak 1880 days ago

Because a nullable bool has literally the same semantics as Maybe<Bool>. Why invent it again but in your app this time?

hibbelig 1880 days ago

Long Boolean! Yes, no, don’t know, weekdays only.

pezezin 1880 days ago

If boolean comes from George Boole, then maybe we should call triple state values "Łukasiewiczan", it would make for a funny typing exercise :P https://en.wikipedia.org/wiki/Jan_%C5%81ukasiewicz

chadlavi 1880 days ago

The use cases the blog post is discussing aren't tri-state bools though. Either the user did or did not sign in. Either the version was or was not published. Null and false are equivalent here.

BiteCode_dev 1880 days ago

If the timestamp is a creation date after your system went on, which it is here, you can always store 0 for false.

kijin 1880 days ago

-1 can be useful, too, especially if you've already assigned a certain meaning to 0.

Javascript developers are used to certain functions returning -1 if there's no match, so -1 shouldn't feel strange as long as it's well documented.

dorgo 1880 days ago

I wonder why they decided to use -1. Why not undefined or NaN or even null? The -1 produces bugs which go unnoticed for a long time.

themulticaster 1880 days ago

On the other hand, there's strcmp()/strncmp(). I can never wrap my head around the fact that strcmp(s1, s2) == 0 (or even better, !strcmp(s1, s2)) means s1 is equal to s2.

jeroenhd 1880 days ago

If you look at strcmp like "the difference when you compare the strings for sorting", it makes complete sense. If the first mismatch of strcmp(a, b) is smaller than 0, you put the a before b, otherwise you put b before a.

The fact that an equality check can be made by running a comparison function is useful, but that's not all the method does.

For other methods in much C code, a common mindset is that a method returning a value will return the error code, with error code details in errno. The error code for success is 0, which is fitting of course.

Had C implemented booleans, this problem would never have been a problem, because if(int) wouldn't have been a legal expression, but sadly booleans are implemented as integers in the language instead. I strongly dislike languages that do allow implicit casting from integers and such to a boolean, if(var!=0) is much more readable because of the explicit boolean expression.

johnchristopher 1880 days ago

But then it's empty string vs null value all over again.

Besides it's about an improvement to boolean, not about adding one more optional value (which likely lead to optional values and definitely out of the boolean field).

underwater 1880 days ago

I'm struggling to think of an example following the OP's example - discrete events that may have not occured yet - where you'd need to differentiate between a certain `false` and an uncertain `null`.

rrrrrrrrrrrryan 1880 days ago

A question like "Do you have any allergies?" probably requires ternary logic: yes/no/unset

Discrete events are usually more binary by nature: a thing either happened or it didn't.

That said, if it's possible for an event to un-happen, you're back in ternary-land: there's now a distinction between un-set and false which may be important to capture.

There's a reason why relational databases use ternary logic when most of the rest of the computing world uses binary logic.

You might argue that you could just create a brand new event, but now you've almost assuredly changed the grain of your table and goofed up the primary key. Your nice normalized table is now a dumb, non-performant endless event log: good luck with indexing that table and tuning those SELECT queries.

davidverhasselt 1880 days ago

The OP does not talk about events, OP talks about state and when it was set. There's no where in the article that OP restricts themselves to discrete events that may or may not have occured.

underwater 1880 days ago

I'm not understanding the distinction. The only thing you get out of this schema is knowing what happened (implicitly defined by the field), whether it's happened yet, and when it happened. That feels like an event to me.

Fields where there isn't an discrete event don't work. E.g. is_dog_owner can become adopted_dog_at, but is_dog_lover can't become loved_dogs_at.

I'd actually even argue that this is not storing state directly. You derive state from knowing an event has occured in the past: deleted_at (event) => is_deleted (state).

aabbcc1241 1878 days ago

Why not store "loved_dogs_at"? A potential use case could be to send some coupons to "new dog lover" within the last 2 weeks.

another-dave 1880 days ago

Well the article uses the example of swapping `is_published` for `published_at` — with a boolean you could have:

* NULL — never published (e.g. draft) * true — live now * false — previously live but explicitly unpublished

Which you miss if just a date. Similarly he talks about `is_signed_in` — NULL/true/false let's you model the case where a user has never signed in (e.g. an admin created your account but you've never used it) but NULL/timestamp missed this

goofballlogic 1880 days ago

In my case the null state was "not yet processed" i.e. a decision pending

joosters 1880 days ago

Perl got this right decades ago with its 'undefined' status for unset variables, so you can tell the difference between false and undef

maxlybbert 1880 days ago

That make me think more of Javascript, which distinguishes between undefined values (usually means “not set”) and null values.

You’re correct that a Perl scalar can always be set to undef, which is the Perl name for null. But that’s not really unique to Perl. For instance, while a Java boolean can’t be null, a Java Boolean can be.

jwr 1880 days ago

That depends on the language. In Clojure and ClojureScript, for example, distinguishing between nil and false is not a problem at all.

davidverhasselt 1880 days ago

I'd assume most languages don't have a problem with distinguishing between nil and false. The article explicitly maps nil to false, thus I don't see the relevance of your comment.

867-5309 1880 days ago

the initial state is going to be false before any user input. if the user leaves a field blank, then that still counts as input, just run a check such as 'if blank then null else timestamp' on the insert

gorkish 1880 days ago

0 for never set, timestamp for set time, negate the timestamp for unset time

Viliam1234 1880 days ago

dates between October 4th and 15th 1582 for special values

Aeolun 1880 days ago

You can use negative values for your alternative states.