|
|
|
|
|
by grogers
2230 days ago
|
|
This is exactly what MySQL (and I think Oracle) does. The final delete of a row remains tombstoned in the main table until a background process determines it is safe to remove based on transaction visibility (MySQL calls this purge). It seems the important difference is whether the table needs to be scanned to find dead rows. In Postgres it seems the answer is yes. In MySQL it keeps this info as a separate scratchpad to quickly find the rows to remove once they are allowed to be. Both approaches have advantages and disadvantages. For example, in MySQL, if you have a very long running transaction such that it can't purge rows for a long time, you can actually overflow the rollback segment onto disk and purging slows down dramatically. You might even need to pause writes to get it to recover, although I haven't ever seen it get that bad. With a table scan approach it degrades more gracefully in that scenario. |
|
As you noted, this can get painful since purge effectively gets blocked -- if you have a high-volume OLTP workload with heavy emphasis on quick small UPDATEs / DELETEs, and then introduce a single very long-running read transaction (such as a logical dump/backup), purge can't make progress. But this is just due to the current implementation. In theory a smarter system could differentiate between row versions actually needed by old transactions vs ones that are inaccessible to all transactions.
[1] https://bugs.mysql.com/bug.php?id=74919