by grout 5319 days ago
*sigh* And here your PR guy wanted to fly me to New York to talk to you. Obviously the effort would have been wasted.

Replication events are too important to be designed to fall on the floor. I want at any given time to know approximately how far behind replication is; I want primary write events to block until replication is at least durably scheduled; and if I pause or slow operation and wait for replication to catch up, I want to know that it _is_ caught up without holes. Cassandra fails utterly to meet these minimal requirements.

A master/slave queue is not the only way to meet these needs, but unless a replacement can fulfill the requirements, I can't responsibly switch.
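
To make the three requirements concrete, here is a minimal sketch of a sequence-numbered replication queue. Everything in it is hypothetical (`Primary`, `Replica`, `lag`, `caught_up`, and the in-memory list standing in for a durable queue); it illustrates the requirements rather than any particular database.

```python
class Primary:
    """Toy primary: a write returns only after its event is queued for replication."""

    def __init__(self):
        self.data = {}
        self.log = []    # assumption: stands in for a durable, ordered replication queue
        self.acked = 0   # highest sequence number the replica has applied without gaps

    def write(self, key, value):
        # Requirement 2: schedule replication before acknowledging the write.
        seq = len(self.log) + 1
        self.log.append((seq, key, value))
        self.data[key] = value
        return seq

    def lag(self):
        # Requirement 1: how many events the replica is behind.
        return len(self.log) - self.acked

    def caught_up(self):
        # Requirement 3: zero lag on a gap-free sequence means no holes.
        return self.lag() == 0


class Replica:
    """Toy replica that applies events strictly in sequence order."""

    def __init__(self):
        self.data = {}
        self.applied = 0

    def pull(self, primary, limit=None):
        pending = primary.log[self.applied:]
        if limit is not None:
            pending = pending[:limit]
        for seq, key, value in pending:
            if seq != self.applied + 1:
                raise RuntimeError("gap in replication stream at %d" % seq)
            self.data[key] = value
            self.applied = seq
        primary.acked = self.applied
```

After three writes and a `pull(primary, limit=2)`, `lag()` reports one outstanding event; after a further `pull(primary)`, `caught_up()` is true and the replica's data matches the primary's.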

2 comments

When I'm learning about a new architecture, I like to take the position of, "let's assume the authors aren't idiots. If they're not, why would they have designed things this way?"

With that in mind, let me pose this question to you.

Is using a Dynamo architecture (http://www.allthingsdistributed.com/2007/10/amazons_dynamo.h...) for S3 "irresponsible" of Amazon?

I submit that by this point, Amazon (among others) has convincingly demonstrated that this approach can indeed achieve a high degree of reliability.

If you agree, then I suggest that you read through the Dynamo and Eventually Consistent papers again with the "let's assume these people aren't idiots" approach, and see if you can spot what this architecture offers to achieve a similar goal to your "wait for replication to catch up" design.

You're not idiots, and neither are Amazon. But using a Dynamo style design safely requires overprovisioning and performance loss. W=1 speed is your bait; reality is the switch.

Your primary argument seems to be that in the event a "replication event" fails, the data is lost forever - this is simply not true.

In the event a replica is unavailable for a write, the co-ordinator stores that write itself and delivers it to the replica once it becomes available again (Hinted Handoff, see the oft-linked paper on eventual consistency).

This makes read-repair far less of an issue. It usually matters only as a mechanism to keep requests consistent during the window between the node becoming available and the hint being delivered.
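
As a rough illustration of the hinted handoff flow described above, here is a toy sketch. The names (`Node`, `Coordinator`, `deliver_hints`) are made up for this example and are not Cassandra's API. It also ignores timestamps and versioning, which a real system needs so that a late hint cannot overwrite newer data.

```python
class Node:
    """Toy replica node that can be marked up or down."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}


class Coordinator:
    """Toy co-ordinator that keeps hints for replicas it could not reach."""

    def __init__(self, replicas):
        self.replicas = {node.name: node for node in replicas}
        self.hints = []  # (target node name, key, value) waiting for delivery

    def write(self, key, value):
        acks = 0
        for node in self.replicas.values():
            if node.up:
                node.data[key] = value
                acks += 1
            else:
                # Replica unavailable: store the write locally as a hint.
                self.hints.append((node.name, key, value))
        return acks

    def deliver_hints(self):
        remaining = []
        for name, key, value in self.hints:
            node = self.replicas[name]
            if node.up:
                node.data[key] = value
            else:
                remaining.append((name, key, value))
        self.hints = remaining
```

If node `c` is down during a write, the write gets two acks and one hint. Once `c` is back up but before `deliver_hints()` runs, `c` still lacks the key; that gap is the window in which read-repair would do the work.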

It's called eventual consistency for a reason. Writes don't just go missing. If you're uncomfortable with the "eventual" aspect of the replication model then you're better off with a database that sacrifices availability for improved consistency guarantees.

Node failure is not the failure case in question. As long as the node is up, hinted handoff can't play any part.