Hacker News new | ask | show | jobs
by sethammons 2786 days ago
This was my first thought, you beat me to it. It seems like they contradict each other. Is it Raft with quorum or is it a single replica node that is caught up with master? You can't have both (unless you only have three nodes).
1 comments

> unless you only have three nodes

Interestingly enough this seems to be how managed sql solutions do it. You just have one primary and one failover replica in another zone. So a quorum write is just a synchronous write to both (2 out of 2). You don't have the split-brain problem because the original primary can't make progress if it can't contact the replica.

https://cloud.google.com/sql/docs/mysql/high-availability

https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Conce...

Github's problem is that they were trying to be too smart and allow any of their replicas to be candidate masters without increasing their quorum size. In theory this is higher availability than the managed sql solutions (they can be available even if their entire coast gets nuked) but they do it at the cost of consistency in the more common failure scenarios.