Hacker News
by quinthar 3537 days ago
To be clear, we're talking about functionality that is not implemented or fully designed. Today all transactions are committed on all nodes in the same order, which is a much simpler world. I agree, the multi-threaded replication case is a much more complex and interesting world, with much greater performance opportunities. Lots of exciting problems to solve when we get there!

> Today all transactions are committed on all nodes in the same order, which is a much simpler world.

This is difficult to reconcile with:

> - For the highest performance, you can designate a transaction as "asynchronous" and the master will commit immediately

because of the following sequence: the leader crashes, a replica becomes leader and starts accepting writes, and then the old leader recovers as a replica. Without something like an epoch number, the old leader won't be able to tell that it has commits that the current leader doesn't. A unique incrementing transaction number based on just a counter at the leader won't work, because the asynchronous commits mean it won't necessarily be unique across leader elections.
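To make the failure concrete, here is a minimal sketch of that sequence. It is not taken from the system under discussion: `TxnId`, `simulate`, `conflicting_ids`, and `only_on_old` are hypothetical names, and the assumption that the new leader simply continues the counter from the last replicated transaction is mine.

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class TxnId:
    epoch: int    # hypothetical: incremented at every leader election
    counter: int  # incrementing counter kept by the current leader


def simulate(use_epoch: bool):
    """Return (old_leader_log, new_leader_log) after a failover."""
    def make_id(epoch, counter):
        return TxnId(epoch, counter) if use_epoch else counter

    # Transactions 1 and 2 reached both nodes before the crash.
    replicated = {make_id(1, 1): "a", make_id(1, 2): "b"}

    # The old leader committed txn 3 asynchronously, then crashed
    # before the replica ever saw it.
    old_leader = dict(replicated)
    old_leader[make_id(1, 3)] = "async-only"

    # The replica is elected (epoch 2) and continues from counter 3.
    new_leader = dict(replicated)
    new_leader[make_id(2, 3)] = "post-failover"
    return old_leader, new_leader


def conflicting_ids(old, new):
    """IDs present on both nodes that name different transactions."""
    return sorted(k for k in old.keys() & new.keys() if old[k] != new[k])


def only_on_old(old, new):
    """IDs the recovered old leader has that the new leader lacks."""
    return sorted(old.keys() - new.keys())
```

With a bare counter, ID 3 names two different transactions and the old leader has no ID that looks "extra". With the epoch included, the old leader's unreplicated commit shows up as an ID the new leader has never seen.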

Ah, sorry for the confusion. Every transaction is given an incrementing ID by the leader, and every follower commits the transactions in ID order.
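As a rough illustration of "commit in ID order", a follower could buffer anything that arrives early and only commit the next contiguous ID. This is a sketch under my own assumptions (IDs start at 1 and have no gaps); the `Follower` class and its method names are hypothetical, not the real implementation.

```python
class Follower:
    """Hypothetical follower that commits strictly in ID order."""

    def __init__(self):
        self.next_id = 1     # assumption: IDs start at 1 with no gaps
        self.pending = {}    # transactions received ahead of their turn
        self.committed = []  # (txn_id, payload) in commit order

    def receive(self, txn_id, payload):
        # Buffer the transaction, then commit every contiguous ID we now hold.
        self.pending[txn_id] = payload
        while self.next_id in self.pending:
            self.committed.append((self.next_id, self.pending.pop(self.next_id)))
            self.next_id += 1
```

Note that transaction 2 cannot commit until transaction 1 has arrived, even if the two touch unrelated data.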

Furthermore, every commit has a running SHA hash of all prior commits (and every node keeps a history of the last few million commits). This way any two nodes can compare their journals to make sure they agree -- and if there is any split, then the cluster kicks that node out.
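A running hash of this kind might look roughly like the sketch below. The details are assumptions on my part: SHA-256 as the hash, the way the previous hash, ID, and payload are combined, and the helper names `chain_hash`, `build_journal`, and `first_divergence` are all illustrative.

```python
import hashlib


def chain_hash(prev_hash: str, txn_id: int, payload: str) -> str:
    """Hash this commit together with the hash of everything before it."""
    h = hashlib.sha256()  # assumption: the thread only says "SHA"
    h.update(prev_hash.encode())
    h.update(str(txn_id).encode())
    h.update(b"\x00")  # separator between the ID and the payload
    h.update(payload.encode())
    return h.hexdigest()


def build_journal(payloads):
    """Return [(txn_id, running_hash), ...] for payloads committed in order."""
    journal, prev = [], ""
    for txn_id, payload in enumerate(payloads, start=1):
        prev = chain_hash(prev, txn_id, payload)
        journal.append((txn_id, prev))
    return journal


def first_divergence(journal_a, journal_b):
    """Return the first ID where two journals disagree, or None.

    Only the common prefix is compared, so a node that is merely behind
    is not reported as divergent.
    """
    for (id_a, hash_a), (id_b, hash_b) in zip(journal_a, journal_b):
        if id_a != id_b or hash_a != hash_b:
            return id_a
    return None
```

Because each hash covers all prior commits, two nodes only need to compare the hash at a single ID to know whether their entire histories up to that point agree.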

Basically, there is no scenario in which a node that commits a different transaction (or a transaction in a different order) is allowed to remain in the cluster.

I think this, or something like it, can probably work if you're okay with losing all data that wasn't acked by a majority (though I suspect actually recovering a divergent replica would be very difficult). But this doesn't work with the batch commit idea at all, does it? It seems to enforce strict serial ordering of writes, even nonconflicting ones.