Hacker News new | ask | show | jobs
by KronisLV 146 days ago
> I've seen tables where 50%-70% were soft-deleted, and it did affect the performance noticeably.

I think we largely need support for "soft deletes" to be baked into SQL or its dialects directly and treated as something transparent (selecting soft deleted rows = special case, regular selects skip those rows; support for changing regular DELETE statements into doing soft deletes under the hood).

https://news.ycombinator.com/item?id=43781109

https://news.ycombinator.com/item?id=41272903

And then make dynamically sharding data by deleted/not deleted really easy to configure.

You soft deleted a few rows? They get moved to another DB instance, an archive/bin of sorts. Normal queries wouldn't even consider it, only when you explicitly try to select soft deleted rows would it be reached out to.

2 comments

Well, Microsoft SQL Server has built-in Temporal Tables [1], which even take this one step further: they track all data changes, such that you can easily query them as if you were viewing them in the past. You can not only query deleted rows, but also the old versions of rows that have been updated.

(In my opinion, replicating this via a `validity tstzrange` column is also often a sane approach in PostgreSQL, although OP's blog post doesn't mention it.)

[1]: https://learn.microsoft.com/en-us/sql/relational-databases/t...

MariaDB has system-versioned tables, too, albeit a bit worse than MS SQL as you cannot configure how to store the history, so they're basically hidden away in the same table or some partition: https://mariadb.com/docs/server/reference/sql-structure/temp...

This has, at least with current MariaDB versions, the annoying property that you really cannot ever again modify the history without rewriting the whole table, which becomes a major pain in the ass if you ever need schema changes and history items block those.

Maria still has to find some proper balance here between change safety and developer experience.

> I think we largely need support for "soft deletes" to be baked into SQL

I think web and GUI programmers must stop expeting the database to contain the data already selected and formatted for their nice page.

> I think web and GUI programmers must stop expeting the database to contain the data already selected and formatted for their nice page.

So a widespread, common and valid practice shouldn't be made better supported and instead should rely on awkward hacks like "deleted_at" where sooner or later people or ORMs will forget about those semantics and will select the wrong thing? I don't think I agree. I also don't think that it has much to do with how or where you represent the data. Temporal tables already do something similar, just with slightly different semantics.

What way of making it better supported wouldn’t require custom semantics that people would forget and then select the wrong thing.
> custom semantics

Making those custom semantics (enabled at per-schema/per-table level) take over what was already there previously: DELETE doing soft-deletes by default and SELECT only selecting the records that aren't soft deleted, for example.

Then making the unintended behavior (for 90% of normal operational cases) require special commands, be it a new keyword like DELETE HARD or SELECT ALL, or query hints (special comments like /*+DELETE_HARD*/).

Maybe some day I'll find a database that's simple and hackable enough to build it for my own amusement.