| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by strlen 5913 days ago

It's an interesting article, as fundamentally sound systems have been built which are "CA" within a single datacenter. BigTable is the premier example.

There are however several issues with this. First, eventual consistency means eventual consistency of data on the replicas; it doesn't mean that a client can't be presented with a consistent view. Dynamo-pattern systems use quorums to provide this. Even when a quorums aren't used (R + W < N), at least with Voldemort, the weak (no read-your-writes) form of eventual consistency is still only a failure condition: when the "coordinator" (first in the preference list) node for the specific key fails after a write has happened, but before the next read is made.

Second, a CA system means that the core switch (and there may be multiple within a datacenter, especially in the case of startups leasing colocation space or using a managed/"cloud" hosting provider) is now a single point of failure. The way to build around this is by building an "AP" layer on top of the "CA" layer, spanning multiple core switches. This is similar to Yahoo's PNUTS and Google's Spanner. Both of these had been multi-year projects, by companies with expertise in distributed computing, with very specific and limited requirements (i.e., they were building this for themselves, not selling them as general purpose solution). Which brings me to the next point:

"CA" consistency is much more difficult (that is, error prone) to implement than "AP" consistency. Two phase commit is one way to do so, but doesn't provide fault tolerance (it won't withstand the failure of the coordinator/master node). Paxos is one way to do so, but a high performance Paxos implementation requires leases and is still very tricky to implement. Again, it took years for Google to build, trouble shoot and perfect the Chubby/GFS/BigTable stack; the first version did not have a fault tolerant master and the query model is still much simpler than SQL.

That's why I am skeptical when people claim to be able to deliver to market (within months, not years) a commercial solution that provides strong consistency, fault tolerance, horizontal scalability (without a hard upper limit), supports multiple datacenters and still allows execution of SQL queries (even if without certain types of JOINs) with OLTP-suitable performance. That's not to say it's logically impossible, it's just a very bold claim to make.