Hacker News new | ask | show | jobs
by ngbronson 3365 days ago
The all-to-all dependency step between Calvin's sequencer layer and scheduler layer seems like it will be a problem as things scale, because it means that a single stalled sequencer [edit, orig: scheduler] blocks all writes in the system whether they conflict or not. This is the kind of dependence structure that magnifies outlier latencies and unavailability at scale.

In Spanner's design, on the other hand, a transaction can only be blocked by another one on which it actually conflicts. It will have worse average performance, but better tail latency as things scale.

Perhaps it is just a nit, but the blog post is somewhat inaccurate when it says that Spanner uses the commit timestamp to order transactions. It uses locks to order the transactions, then holds the locks for extra time to ensure that TrueTime order agrees with the lock-based order.

1 comments

You should think of the sequencer layer as a shared log abstraction along the lines of the Corfu project from Microsoft. It is distributed and replicated, with the scheduler layer reading from their local copy. Stalled scheduler nodes do not block writes in the system.
I mistyped, I meant to say that a stalled sequencer stalls all schedulers. It's true that Corfu has impressive per-sequencer throughput (by moving most of the work onto other nodes), but you have to move to multiple sequencers to get to Spanner scale.
Yes, you move to multiple sequencers, and you have them reach consensus via Paxos (to avoid problems from individual stalled sequencers).
So to get back to Spanner's availability (if you need it), you need the Calvin sequencer Paxos groups to span data centers. Since you're not exploiting the commutativity structure of transactions you need either a leader-based consensus implementation, which will have latency stalls when a leader becomes unavailable (amplified by the all-to-all communication), or you can use old-school 2 RTT Paxos, and your end latency ends up the same as Spanner.

Lest it seem like I'm not actually a fan of the log-based approach, let me point out a way in which Calvin crushes Spanner: write contention. So long as the Calvin transactions can encode the application logic, they can support extremely high write rates on contended objects in the database. Spanner's 2PL, on the other hand, has single-object object update rates visible to the naked eye.

Corfu looks interesting. I need a distributed, consistent, persistent log for a project. Corfu is quite complex (three different RAM-hungry Java components that each wants to be distributed), however. Do you know of any about other, more lightweight, but still decently scalable alternatives? I was thinking about Etcd, but apparently it's not designed for large amounts of data.