by leif
4360 days ago
Yes, there are still problems with the election protocol, e.g. [1]. The right kind of network partition can cause multiple primaries to stay up indefinitely, accepting writes on both sides of the partition that will eventually be rolled back. There is another problem with the election protocol that allows writes acknowledged by a majority of machines to be rolled back after an election. Both of these problems can be fixed by using something like Raft[2] or Paxos for elections, rather than the ad hoc mechanisms used today.

In TokuMX[3], we're currently working on replacing the election algorithm with something similar to Raft, which will eliminate these sources of data loss. We've heard that MongoDB is also working on fixing replication, but we don't know what their exact plans are (they have a bigger challenge, since they need to stay compatible with their existing replication algorithms, which use timestamps as transaction identifiers) or whether these fixes will end up in 2.8 or in a later version.

[1]: https://jira.mongodb.org/browse/SERVER-9848

[2]: https://ramcloud.stanford.edu/wiki/download/attachments/1137...

[3]: http://docs.tokutek.com/tokumx
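To make the Raft point a bit more concrete, here is a minimal Python sketch of the vote-granting rule as the Raft paper describes it. This is not TokuMX or MongoDB code; the `Voter` class, its field names, and the `wins_election` helper are all made up for illustration, and a real implementation also needs persistence, timeouts, and log replication.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Voter:
    """Hypothetical per-node election state, loosely following Raft."""
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_term: int = 0
    last_log_index: int = 0

    def handle_vote_request(self, candidate_id: str, term: int,
                            cand_last_log_term: int,
                            cand_last_log_index: int) -> bool:
        # Reject candidates from an older term.
        if term < self.current_term:
            return False
        # A newer term resets our vote for that term.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # At most one vote per term.
        if self.voted_for not in (None, candidate_id):
            return False
        # Only vote for a candidate whose log is at least as up to date
        # as ours: compare last term first, then last index.
        candidate_log = (cand_last_log_term, cand_last_log_index)
        our_log = (self.last_log_term, self.last_log_index)
        if candidate_log < our_log:
            return False
        self.voted_for = candidate_id
        return True


def wins_election(votes_granted: int, cluster_size: int) -> bool:
    # A candidate needs a strict majority of the whole cluster.
    return votes_granted > cluster_size // 2
```

One vote per term plus the strict-majority requirement means at most one node can win a given term. The log comparison is what protects majority-acknowledged writes: any majority of voters overlaps the majority that acknowledged the write, so a candidate missing that write is refused by at least one voter it needs.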