Hacker News new | ask | show | jobs
by freels 2607 days ago
This take on Calvin is inaccurate:

Under contention, a Calvin-based system will behave similarly to others which use optimistic locking schemes for Serializable isolation such as Postgres, or YB itself. There are advantages to the Calvin approach as well. For example, under Calvin, the system doesn't have to write out speculative intents to all data replicas in order to detect contention: The only writes to data storage are successful ones. The original paper only describes this briefly, but you can read about how FaunaDB has implemented it in more detail: https://fauna.com/blog/consistency-without-clocks-faunadb-tr...

It's also not a stretch to see how the protocol described in that post can be extended to support session transactions: Rather than executing a transaction per request, the transaction context is maintained for the life of the session and then dispatched to Calvin on commit. (This is in fact how we are implementing it in our SQL API feature.)

I would instead say that one of the more significant differences between Calvin and Spanner is the latter's much stricter requirements it places on its hardware (i.e. clock accuracy) in order to maintain its correctness guarantees; a weakness its variants also share.

1 comments

Only if things were that simple :) Calvin avoids the need to track clock accuracy by making every transaction go through a single consensus leader which inherently becomes a single point of bottleneck for performance and availability. Spanner and its derivatives including YugaByte DB chose not to introduce such a bottleneck in their architecture with the use of per-shard consensus — this means caring about clock skews for multi-shard transactions and not allowing such transactions to proceed on nodes that exhibit skew more than the max configured. The big question is which is more acceptable: lower performance & availability on the normal path OR handling offending nodes with high clock skew on the exceptions path?
Sorry I can't let this go unchallenged. Again, you are inventing an architectural deficiency where the is none. The log in Calvin is partitioned and does not require all transactions to go through a single physical consensus leader. There is no single node in a Calvin cluster which must handle the entire global stream of transactions. The Calvin paper itself extensively covers how this works in detail: http://cs.yale.edu/homes/thomson/publications/calvin-sigmod1...