Hacker News new | ask | show | jobs
by voidmain 2949 days ago
> it is enough to verify only that each transaction writing to any of those fields is also writing to that "sum" field and that it does uphold the invariant

This is a good try at a theory of when you can use weak isolation. But as stated, I believe it is false. If the invariant is A+B=C, I think this sequence is permitted by READ COMMITTED:

T1: read A=1

T2: read B=1

T2: write A=2

T2: write C=3

T2: commit

T1: write B=10

T1: write C=11

T1: commit

Now A=2, B=10, C=11

More to the point, invariants aren't always local to a row, and people don't usually even fully articulate them.

> There is no need to pay the huge performance price of serializing those transactions with all the rest of transactions in the system

Serializable execution can be made very very fast when there aren't actually a lot of conflicts. And then when there are you can do the hard work of trying to prove that something weaker will work. The failure mode in the other direction is silent corruption of your data.

1 comments

What i was talking about in its most simple (though not most performant/scalable as it would be more complicated, so we wouldn't go into it now) implementation would look like this - the transactions involved with an invariant should start with locking of the "target" field of the invariant, the field C in your example:

T1: lock C

T2: attempt to lock C - blocked until T1 completes.

I.e. we have these 2 transactions serialized between themselves without unnecessary extra serialization with all the rest of the transactions in the system. The "target" field just plays the role of the mutex associated with the invariant (and this is why any invariant can be converted to that form just by having a mutex, ie. a "target" field, associated with it). From the theory of multi-threaded programming with mutexes we just know that everything can be implemented correctly in such a model. I.e. any application can be made correctly with READ COMMITTED isolation if required set of "mutexes" and their acquire/release is correctly implemented.

>Serializable execution can be made very very fast when there aren't actually a lot of conflicts.

not exactly. Serializability by itself is pretty expensive global invariant and your application would be paying for it in performance and scalability even though it may not really need it.

> any application can be made correctly with READ COMMITTED isolation if required set of "mutexes" and their acquire/release is correctly implemented

This is true, but it's "a big if." And the result might be faster than a good implementation of serializable isolation (since the latter is working to maintain some invariants that you don't actually care about), but it could easily be much slower: very broad invariants will require very broad locks which cause much more "serialization" than is required for serializability.

Let's say you want the same sum invariant but across rows: all users' balances must sum to zero. The straightforward application of your reduction is to essentially have a global mutex for all balance transactions, but this is nowhere near the most scalable solution. Whereas a good serializable database will handle this situation scalably, safely, and automatically.