by dwenzek 3951 days ago
I find this post a bit disappointing.

The subject is exciting and I wish to learn more about how to design an efficient and safe replication scheme on top of two coordination protocols, each with its own core set of guarantees and constraints.

The post gives a good overview of the trade-off between safety and performance made with in-sync replicas (ISR) compared to quorum acknowledgement.

But it remains very vague on how to deal with the problem stated in the "ZooKeeper and consensus" section: being consistent does not necessarily mean that the values read [by two workers] are the same, only that they are computed from consistent, growing prefixes of the sequence of updates. I completely fail to understand the proposed way to break the tie.

I would expect the diagram to better show lagging replicas and even stale views of the ISR. I would also expect more evidence on how the system ensures that a message delivered to a consumer is never retracted.


Thanks for your comments, and I'm sorry that you feel that the post does not match your expectations. If you write to me directly, I'll be more than happy to clarify any questions you may have. I'm not sure, for example, what is confusing you about the "ZooKeeper and consensus" section. I'm also not sure what kind of evidence you're after on published messages being lost.
Thanks for offering to help!

What I find vague in "ZooKeeper and consensus" is the answer to "Why does this proposal work compared to the original one?":

>> Because each of the workers has “proposed” a single value

Do the processes read or propose the value?

>> and no changes to those values are supposed to occur.

Not supposed to? How do you ensure that?

>> Over time, as the configurator changes the value, they can agree on the different values by running independent instances of this simple agreement recipe.

Sorry, but I fail to see which recipe you are referring to.

>> Do the processes read or propose the value?

They do both: each process first proposes by writing a sequential znode and then reads the children (all proposals are written under some parent znode). This certainly assumes some experience with ZK, and I wonder if that's the problem. It was not the goal to go into a discussion of the ZooKeeper API, but I'm happy to clarify if this is what is preventing you from getting the point.

>> Not supposed to? How do you ensure that?

It is not supposed to in the sense that if this is implemented right, then the proposed values for each client won't change. The recipe guarantees it because it assumes that each client writes just once.

>> Sorry, but I fail to see which recipe you are referring to.

I'm referring to these three steps: creating a sequential znode under a known parent, reading the children, and picking the value in the znode with the smallest sequence number.
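
To make the three steps concrete, here is a minimal sketch in Python. It does not use the real ZooKeeper API: `FakeZk`, `create_sequential`, `get_children`, and `agree` are hypothetical names for an in-memory stand-in that only mimics a parent znode handing out increasing sequence numbers. It assumes, as stated above, that each worker writes its proposal exactly once and that nothing is deleted or rewritten.

```python
import itertools
import threading


class FakeZk:
    """In-memory stand-in for one parent znode (hypothetical, not the ZooKeeper API)."""

    def __init__(self):
        self._counter = itertools.count()
        self._children = {}  # sequence number -> proposed value
        self._lock = threading.Lock()

    def create_sequential(self, value):
        # Step 1: record a proposal under the parent and assign it a unique,
        # monotonically increasing sequence number.
        with self._lock:
            seq = next(self._counter)
            self._children[seq] = value
            return seq

    def get_children(self):
        # Step 2: return a snapshot of every proposal written so far.
        with self._lock:
            return dict(self._children)


def agree(zk, my_value):
    """Propose once, read all proposals, pick the one with the smallest sequence number."""
    zk.create_sequential(my_value)   # propose (assumed to happen exactly once per worker)
    children = zk.get_children()     # read the children of the parent
    return children[min(children)]   # step 3: smallest sequence number wins


zk = FakeZk()
# Three workers that read different configuration values all run the recipe.
decisions = [agree(zk, v) for v in ("config-v1", "config-v2", "config-v1")]
```

In this sketch every worker reads only after its own write, so the proposal with the smallest sequence number already exists in whatever view the worker gets, and since no proposal is ever changed, all workers pick the same value. A new configuration value would be handled by running the same steps under a fresh parent.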

Thanks for these explanations! Things are much clearer now.

Indeed, I have no experience programming against ZooKeeper, even though I use it a lot behind tools like Kafka or Storm.