Hacker News new | ask | show | jobs
by akeefer 4888 days ago
While I appreciate how thorough the article is, it's a bit of a strawman. Pretty much everyone who makes use of a relational database in a professional capacity has to be aware of what transaction isolation level they're using, make their own choice about what to use, and then do things like acquire explicit update locks or do optimistic locking in order to ensure data integrity. But that doesn't mean that the ACID properties are useless merely because you have to do that; it might mean you have to think about a few things more than you'd like to, but it's still a different world than trying to mimic ACID properties in a NoSQL database, and there are still fairly hard guarantees about things like consistency that you get with other isolation levels. For example, with read committed or snapshot isolation, I still have transactionality and can be sure that if I issue a sequence of 10 updates in a single transaction and then commit it, any other single query is either seeing the results of all 10 or of none of them. That's an important guarantee in many situations, and it's a guarantee that I can use to make decisions about how I structure my application logic.

The author of the post basically seems to treat any isolation level below serializability as some sort of sham perpetrated on the development community, and that's not the case: they're still useful, and they're still something that you can use to build the sorts of application guarantees you want. The mere fact that pretty much every database vendor gives you a choice as to what isolation level to use should be a pretty obvious clue that there's no one-size-fits-all answer there, so harping on the non-existence of a serializable isolation level is somewhat missing the point.

2 comments

I think it is extremely hard to reason about data integrity properties in non-ACID systems.

Consider a banking application that uses linked accounts such that transactions fail if the combined balance of linked accounts is below zero. If I implement this in Oracle using Snapshot Isolation (the highest isolation level Oracle offers) in the obvious manner, I'll get silent constraint violations: transactions that should fail will succeed because MVCC can't stop two competing transactions debiting the same account pair (i.e., I have two linked accounts A & B with $100 in each and I launch two simultaneous transactions to transfer $150 away from both: afterwards, I'll end up with -$50 in both or some garbage data).

Now, you can fix if you recognise the problem in advance by doing a select-for-update or changing your schema to materialize the constraint (say by creating a linked-balance table that holds the combined balance for all pairs of linked accounts).

But it is really hard to even notice the problem, especially if you've got a few dozen tables with multiple applications writing to your database (RDBMS advocates insist that this is a good thing). And there are no automated methods for determining when this will be a problem: you just get silent data corruption or silent constraint violations in your extremely expensive "ACID-compliant" (but not really) Oracle database.

While I agree that it's a hard problem, I'm not sure I'd agree that it's a hard problem to notice. It's such a common problem for anyone dealing with a relational database that I (and pretty much every engineer I work with) would recognize that sort of a problem immediately, in the same way that I always have app-level threading concerns in the back of my head when I'm working in a language like Java. I just take those sorts of things as a given, and my experience working with other engineers is that anyone else who's had to deal much with SQL has had to think about it as well. You develop a sense of when you need to worry about race conditions and when you don't.

Pretty much every ORM solution has some mechanism for dealing with this sort of a problem, as well. For example, the ORM we've built internally uses optimistic locking in most cases, which at least simplifies the problem space that our engineers have to worry about (i.e. they only have to worry about conflicts that can happen due to updates to two different rows, but any conflicts due to an update to the same row are automatically detected).

I'm also firmly in the "having multiple applications writing to the same database is sheer madness" camp, due to these (and other problems).

Again, I'm not saying that these things aren't annoying to think about, just that I think lots of people are used to thinking about them at this point, and it's a pretty well-understood and often well-communicated problem space with a set of known solution patterns, many of which are well-supported by ORM frameworks.

"MVCC can't stop two competing transactions debiting the same account pair (i.e., I have two linked accounts A & B with $100 in each and I launch two simultaneous transactions to transfer $150 away from both: afterwards, I'll end up with -$50 in both or some garbage data)."

That isn't true. In a pure MVCC world you could detect such issues, specifically that two trxs were opened and they both tried to update the same row. This would generate a conflict and one transaction could just get shot. Snapshot-isolation/MVCC describes this issue.

However what InnoDB/mysql and Clustrix actually do is use MVCC for read-write conflicts and two-phase-locking for write-write conflicts which trivially fixes your proposed problem.

To clarify the example, the two simultaneous transactions are updating different rows: one updates account A's row while the other updates account B's. There is no way for MVCC to recognize this conflict: it will happily corrupt your data.
Sorry, I'm not following. If you have 2 rows storing how much money you have in 1 account then your data isn't normalized and serializability isn't going to solve your problem, if you could be (even) more explicit, I might be able to answer.
I think we're conflating Snapshot Isolation and MVCC (e.g., Snapshot-isolation/MVCC). MVCC is a general concurrency control mechanism, not an isolation level. Coupling MVCC and Snapshot Isolation is like saying "using locks provides serializability," which is not true in general--it depends on how you use the locks.

That said, Snapshot Isolation can lead to anomalies like Example 1.2: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142... If you perform additional synchronization (e.g., 2PL, as you describe), then you can get serializability.

Yes, you're totally correct. I was thinking of SI.
The idea is that each account has a balance. Say I have a checking and savings account: there's one row for each and the bank will kindly let either of them get a negative balance provided the sum of both balances is positive.

You may not care for this schema design, but I think most people would be surprised that Oracle silently corrupts data and violates constraints here.

I'm not sure that people who use relational databases would be surprised that there can be data-level race conditions. For example, if you write code like: 1) User requests transfer in the UI 2) Query the DB to see if the transfer is legal 3) If so, update the database.

you have to think about race conditions even if the database implements serializability. First of all, you have to have enough of a clue to do steps 2 and 3 inside the same database transaction, or else the isolation level doesn't help you in the least. Everyone who deals with an RDBMS has to worry about transaction boundaries all the time anyway to avoid race conditions, so it's not too much of a leap to assume that people who are used to worrying about transaction boundaries will have to have some idea of what kind of problems those transaction boundaries will prevent for them. Most databases in production systems don't do full-on serializability as an isolation level (and if they do, it's too expensive to turn on), so pretty much every engineer's real-world experience is that if they do a query against table A, then an update against table B based on that data, that they can have a race condition and they need to lock A or materialize potential conflicts.

I guess it's possible that my personal experience or the engineers that I work with are not a representative sample, but if you're working with sensitive issues like money transfers you tend to be really, really super-paranoid about it, and there's no way that something like that is overlooked by a halfway competent team.

there are still fairly hard guarantees about things like consistency that you get with other isolation levels

What do you mean by consistency? I agree that there are many ways to ensure that application integrity constraints are not violated--without using serializability. My point is that, without ACID serializability, you'll have to do some extra work to ensure them in general [1].

The author of the post basically seems to treat any isolation level below serializability as some sort of sham perpetrated on the development community, and that's not the case

Weak isolation has a long tradition spanning decades [2] and is hardly a "sham." It's well known that weak isolation doesn't guarantee the ACID properties as traditionally defined. My point is that many databases don't provide ACID as they promise.

It's still a different world than trying to mimic ACID properties in a NoSQL database

In terms of current offerings, I'm sympathetic to this viewpoint, but you might be surprised how cheaply we can get properties like the transactional atomicity you mention.

In general, I'm curious how easily anyone's able to take their application level requirements and map them down into an isolation like "read committed," especially given how awkwardly they're defined [3] and how many corner cases there are.

[1] e.g., http://www.bailis.org/blog/when-is-acid-acid-rarely/#arbitra...

[2] "Granularity of Locks and Degrees of Consistency in a Shared Data Base," Jim Gray et al., 1976 http://diaswww.epfl.ch/courses/adms07/papers/GrayLocks.pdf

[3] e.g., http://www.bailis.org/blog/when-is-acid-acid-rarely/#weak-no...

Fundamentally, what matters is the set of invariants you want to preserve, and it's usually the case that some number can be preserved for you by the database and some can't and have to be dealt with at the application level. So by "consistency" I mean "some invariants that I care about will be preserved," but that doesn't mean all such invariants are.

For example, if I'm writing a process that merely queries a table to pull back the user's account balance in a single query, read committed isolation might be enough for that particular use case to have "consistency:" I know I'm always seeing some consistent balance, even if it might not reflect transactions that are currently in flight (thus giving me a different answer if I run the query it again in 2 seconds). That's still a better consistency guarantee than if I have a read dirty isolation level (or effectively no isolation), so it's still useful.

If I'm doing an actual update to account balances, however, that level of consistency is no longer good enough, obviously: if all updates hit the same rows in the same tables then snapshot isolation level might be good enough to avoid problems. And if that's not good enough, then I can acquire explicit update locks and such, if the conflict has a risk of update locks across different tables. Even in that case, though, I'll need to worry about application-level invariants (like "a person can only withdraw up to $300 per day.").

So my point is that even without serializable isolation the database can still guarantee some invariants for me, even if it can't guarantee all of them, and that the database can never really guarantee preservation of all of the invariants that matter to me no matter how strong it's guarantees, so I always have to think about what I'm going to handle at the application level anyway.

In the case of my company (which makes applications for insurance companies), we do have to think about those sorts of things, but again, we have to think of a ton of things anyway, and the division of labor between the app tier and the database tier is always something we have to worry about. We do things like build optimistic concurrency into our ORM layer to make most common cases easier to think about, and we have pretty well-defined transaction lifecycles, but for the most complicated cases we have to think about what the potential for race conditions in the database would be, just like we have to think about them at the application level, and then we have to decide how to handle them. Again, even a "true" ACID database wouldn't prevent us from having to do that work, because many of the invariants we want to preserve in the data aren't expressable in the database anyway.