| This might stem from the domain I work in (banking), but I have the opposite take. Soft delete pros to me: * It's obvious from the schema: If there's a `deleted_at` column, I know how to query the table correctly (vs thinking rows aren't DELETEd, or knowing where to look in another table) * One way to do things: Analytics queries, admin pages, it all can look at the same set of data, vs having separate handling for historical data. * DELETEs are likely fairly rare by volume for many use cases * I haven't found soft-deleted rows to be a big performance issue. Intuitively this should be true, since queries should be O log(N) * Undoing is really easy, because all the relationships stay in place, vs data already being moved elsewhere (In practice, I haven't found much need for this kind of undo). In most cases, I've really enjoyed going even further and making rows fully immutable, using a new row to handle updates. This makes it really easy to reference historical data. If I was doing the logging approach described in the article, I'd use database triggers that keep a copy of every INSERT/UPDATE/DELETEd row in a duplicate table. This way it all stays in the same database—easy to query and replicate elsewhere. |
All your other points make sense, given this assumption.
I've seen tables where 50%-70% were soft-deleted, and it did affect the performance noticeably.
> Undoing is really easy
Depends on whether undoing even happens, and whether the act of deletion and undeletion require audit records anyway.
In short, there are cases when soft-deletion works well, and is a good approach. In other cases it does not, and is not. Analysis is needed before adopting it.