Hacker News new | ask | show | jobs
Building Snowman Using Transaction Isolation Levels (bitesizedengineering.com)
30 points by mjoksimovic 1182 days ago
2 comments

Explaining isolation levels is tough, and this text does a solid job. Didn't really get the comic, though. Seems to me the second person should be happy they don't get promised a snowman that then doesn't actually appear!

If you're looking for a deeper dive on isolation, my favorite formalism is from Crooks et al "Seeing is believing" https://www.cs.cornell.edu/lorenzo/papers/Crooks17Seeing.pdf. What's nice about this is that it's client-centric, separating the implementation from what's really important - what the client can observe.

The other classic is Berenson et al "A critique of ANSI SQL Isolation Levels" https://www.microsoft.com/en-us/research/wp-content/uploads/... Despite the title, this paper does two things: very cleanly describes the relationships between isolation levels, and introduces the concept of snapshot isolation and how that relates to the other levels.

Atul Adya's "Generalized Isolation Level Definitions" (https://pmg.csail.mit.edu/papers/icde00.pdf) and other work is foundational here too, but not very approachable.

> Think of it as physically locking the database while you are doing stuff in it. It’s super-slow because there’s no concurrency, but it’s effective if you need to ensure data is exactly the same.

Critically, this isn't true! Databases at the "serializable" level can get a lot of concurrency for transactions that don't have overlapping read and write sets. For example, the classic serializability implementation, based on strict two-phase locking (SS2PL), only takes locks on rows (and predicates) that are read and modified, allowing concurrent transactions on disjoint sets without problems.

> For example, if you are 100% sure that two transactions would never touch the same row, or that they’d operate on different columns - you can always consider easing the isolation

I always thought of database row as an immutable tuple. It is not like row-oriented databases will allow you to operate on disjoint sets of columns as if they were different tables. Lower isolation levels lower than read committed will introduce same unexpected artifacts even if column sets don't intersect