Hacker News new | ask | show | jobs
by thatjoeoverthr 1387 days ago
Your database is is the state of your system. Guard it!

I just ran into severe data corruption at a large client because a programmer four years ago wrote an empty catch block. The system would open a transaction, hit a fault, roll back, then continue writing to the database as if it’s still in the context of the transaction.

I spent some time trying to pin down exactly what it did, and found that many writes went through because of a missing foreign key constraint.

In short: if a particular table of minor importance had a foreign key constraint, there would have been no damage whatsoever, because it would have faulted immediately after the rollback.

You can’t rig up a constraint against every dumb write. But you can rig them up against some of the dumb writes. And sometimes that’s enough.

4 comments

You're exactly right. This is the crux of it. In many domains (not every domain, but every one I've ever worked in) you can delete/rewrite/change the app code without much fuss but the data is vital.

For that reason having constraints enforced by the system that stores the data, external to the app code which developers will inevitably mess up, is so useful and important.

So many issues in my experience have been similar to the one you describe. If the right constraints had been present so much work and so many headaches could have been avoided.

I feel like people who claim they're not needed or not important or "the app code will do it" need to be wrangled into maintenance work of old systems for a year or two until they repent. Rather than boshing out an ill-considered prototype and then moving on before the bugs are discovered.

Key statement: integrity of the state is much more important than some inconveniences here and there
I'd go so far as to say that integrity of state is a requirement to build robust software. If your state is ill-defined, it's pretty much impossible to write software that behaves correctly.

Sure, you can guard against some forms of bad data and fix it as it comes in (or abort with errors if your system can handle that), but in order to do anything interesting, you need to make assumptions about the data. the only operation you can perform without assumptions is the identity function, and most software requires much more than that.

Data that's malformed at captured time is often indistinguishable from outright data loss.

Since it just gets worse the longer you accept corrupted data, this is a good justification for "crash early" programming, like DB enforcement of data integrity.

a few weeks ago I had a discussion with a developer with 20+ years experience about why you shouldn't squelch DB errors and try to continue running.

Nothing is a hard and fast rule, but in this case the only way for an error to occur is if the query schema differed from the code select statement (column names changed). At that point wtf are you doing trying to keep running without errors, something is fundamentally mismatched between your application and the data.