Hacker News new | ask | show | jobs
by mattb314 3123 days ago
To be honest, I'm a little confused by your answer. Aren't high contention and deadlocking workloads essentially the same thing to a deadlock-detection db? I can imagine some patterns where that wouldn't necessarily be true, but they seem to go together in general. It also seems like your suggestion in the case of deadlocks is to modify your transactions to acquire locks in the same order, which is exactly what (many) avoidance systems do for you in the first place. What's the advantage of putting this burden on the user?
1 comments

> Aren't high contention and deadlocking workloads essentially the same thing to a deadlock-detection db?

High contention means that there are many concurrent transactions modifying the same rows. Deadlocking means transactions modify the same rows in opposite orders. The former definitely does not imply the latter. Most workloads are simple enough to never have any deadlocks.

> which is exactly what (many) avoidance systems do for you in the first place.

This might work for very simple database models which require the user to specify all the keys that are going to be modified in the transaction upfront (NoSQL). In PostgreSQL, a transaction can contain any number of statements in any order, results are returned at the end of each statement - before the next statement is known, and modifications can have complex, non-deterministic WHERE clauses.

The database has no way of knowing which rows will be affected by an UPDATE and DELETE in advance, and therefore it does not which row locks it needs to take in advance. Even if it first read and sort all the data, it still doesn't know what the next statement is going to be. The only way in which the database could avoid deadlocks is by taking very aggressive predicate locks on the table, which is obviously bad for concurrency. Another possibility is to use wound-wait, but this introduces retry delays when there is contention even when there are no deadlocks. Moreover, because results (e.g. number of rows modified) are returned after each statement, a transaction cannot restart from the beginning (since it might modify different rows the second time), so in practice wound-wait would have to unnecessarily abort transactions if you were using it to avoid deadlocks between multi-statement transactions.

It is generally much, much, much more efficient to just let UPDATEs run without predicate locks or retry delays. They will only get into an occasional deadlock if the user has transaction blocks that make changes in a different order. In those cases you can have deadlock detection save the day, and tell the user how to avoid running into it again.

> What's the advantage of putting this burden on the user?

If a transaction has multiple statements that modify the same rows in different order then the two choices are: reduce concurrency, or, detect deadlocks as they occur. The former is a severe pain that all transactions have to pay, the latter is an occasional, preventable pain that only deadlocking transactions pay. Clearly, the former would be the higher burden.