Hacker News new | ask | show | jobs
by jmyeet 530 days ago
NULL is the absence of a value. If you try and treat it as a value, you're going to have a bad time. So an attempted UNIQUE(email_address, deleted_at) constraint is fundamentally flawed. If you treated NULL as a value that could be unique, you're going to break foreign keys.

But let's continue the logic of deleted_at being NULL indicating an active account, which seems to the intent here. You end up doing things like:

    SELECT /* ... */
    FROM accounts
    WHERE email_address = '...'
    AND deleted_at IS NOT NULL
Depending on your database, that may or may not index well. More problematic, you may end up with privacy leaks if someone forgets the last conditional.

If anything, you want to reverse this so someone has to go out of their way to explicitly select deleted accounts. There are multiple strategies for this eg using an active_accounts view or table.

Lastly, there are lots of potential reasons for an account to be disabled or otherwise not visible/accessible. Takedowns, court orders, site safety, hacked accounts and so on.

Overloading deleted_at to have a semantic meaning for an active account is just fundamentally bad design.

3 comments

> Overloading deleted_at to have a semantic meaning for an active account is just fundamentally bad design.

Then don't do that. It's kind of a leap to say soft deletes are categorically bad because someone might confuse "deleted" with "inactive". My users table does the super-advanced thing of having both columns. The ORM also doesn't forget to add the not-null criterion. There's also zero databases in active use where it poses a problem to indexing.

Soft deletes suck in their own way, but none of the alternatives are perfect either.

You put the "is not null" on the index itself and then simply don't use it for the much rarer queries that are on deleted accounts. Or just use a view for active accounts.

Overloading timestamps to carry a boolean on null is awesome as long as you decide that's what you're doing and use one of the several standard techniques to dodge the easily avoided potential downside.

This isn't a valid security concern, more than any other incorrect sql query would be anyway. A dev can always write a bad Q, you need another way to address that it's not more likely here because of the null.

That's interesting - I believe this is exactly how Sequelize implements soft-deletion.