by GauntletWizard 3533 days ago
Use Raft. Rather than speculating, learn, do, and guarantee you're doing the correct thing by electing once, distributing, achieving quorum and continuing. This is spit-shine on a turd - Paxos is a great protocol, but not a speedy one. It's an important building block, not something to be running constantly.
4 comments

Raft is more-or-less equivalent to Paxos. Its protocol is more detailed but no more efficient than the core, original Paxos protocol.
This implementation is an optimization of leader-based protocols like Raft. It removes the leader from the critical path using SDN.
More appropriately, don't implement these incredibly difficult protocols yourself unless it's an exercise. Use one of the well-maintained, widely used implementations.
If you took edX's Reliable Distributed Algorithms 1, you'd have implemented it in a few lines in Scala.
Well, in a few lines built atop Kompics, which is a simulation framework. Multi-Paxos in particular is considered so hard to get right that, to my understanding, no organization actually implements it. They instead have their own flavor based on Paxos. See ZooKeeper.
That's not really true at all. Basho (makers of Riak) have a Multi-Paxos implementation that is used in Riak: riak_ensemble (https://github.com/basho/riak_ensemble)
Yes, that's the beauty of it - it deflects all the noise from you so you can focus on the essence of Paxos itself, instead of thinking about how to write all the underlying abstractions yourself and likely drowning in them before you can understand what Paxos is about.
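
To give a rough idea of how small that essence is once the plumbing is taken away, here is a minimal single-decree Paxos sketch in Python. It is not the course's Scala/Kompics code; the names (`Acceptor`, `propose`) are made up for illustration, and it assumes in-memory acceptors called synchronously, with no networking, failures, or persistence.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Acceptor:
    promised: int = -1                    # highest ballot this acceptor has promised
    accepted_ballot: int = -1             # ballot of the value it accepted, -1 if none
    accepted_value: Optional[str] = None  # the accepted value, if any

    def prepare(self, ballot: int):
        """Phase 1: promise to ignore lower ballots, report any accepted value."""
        if ballot > self.promised:
            self.promised = ballot
            return True, self.accepted_ballot, self.accepted_value
        return False, self.accepted_ballot, self.accepted_value

    def accept(self, ballot: int, value: str) -> bool:
        """Phase 2: accept unless a higher ballot has been promised."""
        if ballot >= self.promised:
            self.promised = ballot
            self.accepted_ballot = ballot
            self.accepted_value = value
            return True
        return False


def propose(acceptors: List[Acceptor], ballot: int, value: str) -> Optional[str]:
    """Run one proposal round; return the chosen value, or None if the round fails."""
    majority = len(acceptors) // 2 + 1

    # Phase 1: collect promises from the acceptors.
    replies = [a.prepare(ballot) for a in acceptors]
    promises = [(b, v) for ok, b, v in replies if ok]
    if len(promises) < majority:
        return None

    # If any acceptor already accepted a value, adopt the one with the highest ballot.
    prev_ballot, prev_value = max(promises, key=lambda p: p[0])
    if prev_ballot >= 0:
        value = prev_value

    # Phase 2: ask the acceptors to accept the (possibly adopted) value.
    acks = sum(a.accept(ballot, value) for a in acceptors)
    return value if acks >= majority else None
```

With three fresh acceptors, `propose(acceptors, 1, "a")` chooses "a", and a later `propose(acceptors, 2, "b")` still returns "a", because the second proposer has to adopt the value that was already accepted.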

Sure, for production you have to deal with a different set of issues: how many nodes you can handle at once before you need to send way too many messages, what happens if your socket gets full or becomes unresponsive, what if you get into a distributed deadlock in some rare case (which always happens in production), how to recover from out-of-order messages, what if ACKs are missing but the operation went through, etc.
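
Two of those (out-of-order messages, and a lost ACK for an operation that did go through) are commonly handled with per-client sequence numbers. The following Python sketch shows one way to do it; the `Replica` class and its return strings are hypothetical, and it assumes each client numbers its commands 0, 1, 2, ... and retries until it gets an ACK.

```python
class Replica:
    """Applies each client's commands at most once, in sequence order."""

    def __init__(self):
        self.log = []        # commands in the order they were applied
        self.next_seq = {}   # client -> next sequence number to apply
        self.pending = {}    # client -> {seq: command} that arrived early

    def receive(self, client, seq, command):
        expected = self.next_seq.get(client, 0)
        if seq < expected:
            # Already applied: the earlier ACK was probably lost, so re-ACK only.
            return "duplicate"

        # Buffer the command, then apply everything that is now contiguous.
        buffered = self.pending.setdefault(client, {})
        buffered[seq] = command
        while expected in buffered:
            self.log.append(buffered.pop(expected))
            expected += 1
        self.next_seq[client] = expected

        return "applied" if seq < expected else "buffered"
```

A retry of an already-applied command comes back as "duplicate" without being applied twice, and a command that arrives ahead of its predecessor is held until the gap is filled.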

Raft, Viewstamped Replication, and full active-active setups with many writable nodes all have different tradeoffs in terms of overhead. Basically it comes down to what is being replicated.