by tomjohnson3 4658 days ago
one other thing to keep in mind is that the underlying algorithms (zab and raft) provide different guarantees.

for example, zookeeper/zab allows reading directly from followers, with a guarantee that you get at least a past value that won't be rolled back. this was one reason zookeeper didn't use paxos:

https://cwiki.apache.org/confluence/display/ZOOKEEPER/Zab+vs...

in my understanding, raft doesn't allow reading directly from followers, because the follower logs may get repaired/rolled-back when a new leader is elected. (though i'm sure an implementation can tweak the protocol to provide this support.)

that said, raft has a lot of interesting applications, and, in my opinion, is definitely more understandable than the many versions of paxos. (implementing zab yourself, at this point, would be a futile exercise.)

i found the videos from the raft user study to be very well done (and easier to understand than even their paper):

raft: http://www.youtube.com/watch?v=JEpsBg0AO6o
paxos: http://www.youtube.com/watch?v=YbZ3zDzDnrw

...however, i think they did paxos a disservice by not just focusing on multi-paxos (which is probably the most common implementation). but it's certainly fair to say that info about paxos is spread out far and wide, with perhaps too many knobs to turn and implementation-related details to fill in yourself.

as a side note: i've just started implementing raft in a set of libraries (multiple languages) that will be open source - along with other protocols.

3 comments

One thing is not correct. Raft does allow reading from followers. You will never be able to read an uncommitted value in Raft.
I'm the original author of go-raft (the implementation used in etcd) so I'll try to address some of the points in this thread.

Raft only updates the state of the system once log entries are committed to a quorum, so the local state will never be rolled back. Log entries can be thrown out, but only ones that haven't been committed to the local state, so it doesn't matter.
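
To make the distinction between the log and the local state concrete, here is a minimal sketch. It is a toy in-memory model, not go-raft's API: `Node`, `advance_commit`, and `truncate_from` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy node (hypothetical, not go-raft): a log, a commit index, a key/value state."""
    log: list = field(default_factory=list)    # (term, key, value) entries
    commit_index: int = 0                      # number of entries known to be committed
    last_applied: int = 0                      # number of entries applied to the state
    state: dict = field(default_factory=dict)

    def append(self, term, key, value):
        self.log.append((term, key, value))

    def advance_commit(self, new_commit_index):
        # The commit index only moves forward and never past the end of the log.
        self.commit_index = max(self.commit_index, min(new_commit_index, len(self.log)))
        # Only committed entries are ever applied to the local state.
        while self.last_applied < self.commit_index:
            _, key, value = self.log[self.last_applied]
            self.state[key] = value
            self.last_applied += 1

    def truncate_from(self, index):
        # A new leader may replace an uncommitted suffix, never committed entries.
        if index < self.commit_index:
            raise ValueError("refusing to truncate committed entries")
        del self.log[index:]


follower = Node()
follower.append(1, "x", "a")
follower.append(1, "x", "b")   # present in this log, but not yet on a quorum
follower.advance_commit(1)     # only the first entry is reported as committed
follower.truncate_from(1)      # log repair throws away the uncommitted suffix
print(follower.state)          # the state never saw the discarded entry
```

The second entry was dropped from the log, but since it was never applied, nothing in `state` had to be undone.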

You can read from the leader if you need to ensure linearizability but that will kill your read scaling. Another approach is to read locally and check that the local raft node isn't in a "candidate" state (which would mean that it hasn't received a heartbeat from the master within the last 150ms). That approach works for a lot of cases.
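
A rough sketch of that local-read check, assuming a 150ms timeout and a node that turns candidate once the timeout passes without a heartbeat. `LocalReader`, `Role`, and `NotSafeToRead` are made-up names for illustration, not part of go-raft, and the clock is injectable only so the example is deterministic.

```python
import time
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"


class NotSafeToRead(Exception):
    """Raised when the node has lost contact with the leader."""


class LocalReader:
    """Hypothetical wrapper that serves reads from local committed state."""

    def __init__(self, state, election_timeout=0.150, clock=time.monotonic):
        self.state = state
        self.election_timeout = election_timeout  # assumed 150 ms, as in the comment
        self.clock = clock
        self.role = Role.FOLLOWER
        self.last_heartbeat = clock()

    def on_heartbeat(self):
        # A heartbeat from the leader resets the timer and keeps us a follower.
        self.last_heartbeat = self.clock()
        self.role = Role.FOLLOWER

    def tick(self):
        # No heartbeat within the timeout: become a candidate.
        if self.clock() - self.last_heartbeat >= self.election_timeout:
            self.role = Role.CANDIDATE

    def read(self, key):
        self.tick()
        if self.role is Role.CANDIDATE:
            raise NotSafeToRead("no recent heartbeat from the leader")
        return self.state.get(key)


now = [0.0]                                   # fake clock, in seconds
reader = LocalReader({"x": "a"}, clock=lambda: now[0])
reader.on_heartbeat()

now[0] = 0.100
fresh = reader.read("x")                      # heartbeat is 100 ms old: served locally

now[0] = 0.200
try:
    reader.read("x")                          # heartbeat is 200 ms old: refused
    refused = False
except NotSafeToRead:
    refused = True
```

This only bounds how long a node can go without hearing from the leader. It does not make the read linearizable, since a follower that is still receiving heartbeats can lag behind the leader's latest committed entry.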

As far as implementing multi-paxos goes, the authors behind Google Chubby have talked about the large divide between theoretical multi-paxos and actually implementing it. Also, there aren't any standalone multi-paxos Go libraries available. I wrote go-raft because there wasn't an alternative distributed consensus library in Go at the time.

Let me know if you need an extra pair of eyes on your Raft implementation or if you have any questions (ben@skylandlabs.com).

> raft doesn't allow reading directly from followers,

In raft, all client connections to followers are redirected to the current master.

> i think they did paxos a disservice by not just focusing on multi-paxos

Raft is equivalent to (multi-)Paxos

Having followers redirect to the leader is how the paper describes the algorithm.

But I don't think there is anything stopping you from having followers service committed log entry reads, provided you're willing to live with being out-of-date.
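
As a toy illustration of what "out-of-date" means here (the `Replica` class is hypothetical, not from the paper or any implementation): both replicas hold the same log, but the follower has not yet learned that the second entry is committed, so its committed-only read returns the older value.

```python
class Replica:
    """Hypothetical replica that serves reads from committed entries only."""

    def __init__(self):
        self.log = []            # list of (key, value) entries
        self.commit_index = 0    # number of entries known to be committed

    def read(self, key):
        # Rebuild the value from the committed prefix of the log.
        value = None
        for entry_key, entry_value in self.log[:self.commit_index]:
            if entry_key == key:
                value = entry_value
        return value


leader, follower = Replica(), Replica()
for replica in (leader, follower):
    replica.log = [("x", "a"), ("x", "b")]

leader.commit_index = 2      # the leader knows both entries are committed
follower.commit_index = 1    # the follower has only heard about the first

leader_value = leader.read("x")      # latest committed value
follower_value = follower.read("x")  # committed, but one entry behind
```

The follower's answer is stale but was a committed value at some point, which is the trade-off described above.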

i don't think this is the case. here is a link to a video by one of the authors about log repair during leader election:

http://www.youtube.com/watch?feature=player_detailpage&v=YbZ...

log entries on a follower may get rolled back - and thrown out - since they were not accepted on a majority of followers.

Does this mean that there is no read scaling?

The master server must be able to handle the entire read load?