Hacker News new | ask | show | jobs
by Reefersleep 762 days ago
How about a separate, schema-wise identical "deleted_x" table that you "move" deleted entities to? Can't get much more explicit than that, and still enables whatever joins you'd like on historical deleted data.
5 comments

It's an approach I've seen before that can work nicely - often when you want to retain records for auditing/compliance purposes that refer to a deleted entity.

But I'd usually consider soft delete alongside this approach, as it always really depends on what you're doing and what your needs are - if you constantly query the dependent records joined to the entity you may or may not delete, then a deleted entity table means you now need to left join two tables when before you could inner join one table. So soft delete might be simpler.

But if that's a rare use case, then soft delete might be more complex depending on how many separate codepaths are querying the primary entity.

My next blog post should be called "It depends - avoiding the overly broad generalisations anti-pattern".

If you're using foreign keys, that becomes quite tricky. Not impossible, but you're dealing with way more than a single record being moved and may need to duplicate some references between live and deleted data.
> Can't get much more explicit than that

If you want to preserve history (not just the special case of deletion) you'd also need to move 'updated' entities as well.

The article isn't just pointing out that a 'deleted' column is a hassle, it's also pointing out it's insufficient for preserving history.

These options are ways to preserve entities not histories.

For example you might not care to record the 20 different names/birthdays a user changed but you might care to remember that the user existed.

That was my thought as well, I believe it's close to the "let the data warehouse sort it out" solution: make the current state separate from the "history" - in their example this supposes a data warehouse, but it could just be separate tables or databases.
There is the downside of having to maintain both schemas now.

Unless you automate it devs will have to remember to migrate both when making a change which adds some overhead, not a lot, but it's just something to consider here imo as some migrations (schema and/or data) can become nasty and complex