|
I'm not an expert, and it sounds like you are, so I appreciate your feedback here: what do you even mean by a consistent state? Even in theory, a person initiating a new additional record in Auckland, New Zealand at the same time somebody initiates a change in Gibraltar or London (which are antipodal to the former[1]), 66 milliseconds away, cannot have a confirmation in less than 120 milliseconds, right? So do you just wait for that before declaring 'consistency'? Do you literally add 120 milliseconds to each and every request? (And this is assuming you have a damned good solution to the two generals problem.) I mean, suppose the database tracks something as simple as the number of web page hits. It's a counter. You now distribute it, and have a stochastic process of counter hits between 0 and 5 per second in your largest cities, distributed throughout the world. How can that database ever be consistent? If there are ten new records per second in New Zealand and ten records per second in the UK, and they potentially depend on each other in some way, are you going to just make everyone wait until everything has been committed and confirmed to be consistent? Or is "a foolish consistency the hobgoblin of little minds", and you really can accept out-of-date data and deal with merging conflicts later? I just don't understand why we would expect consistency to rank up there, when we deal with a worldwide real-time system where the difference between getting served by a local database in 40 milliseconds and one far away in 250 milliseconds is both staggering and incredibly noticeable. Why be consistent? What is consistency? [1] http://www.findlatitudeandlongitude.com/antipode-map/ |
Yep, you have to add all the delay until you get confirmation from the members of the cluster that the write was written. And you have to have linearizability, so that anyone reading after that (and one can argue what 'that' is and where 'that' is) will now see the new state. If one of the members has failed, you could potentially be stuck waiting forever. You also have to pin down how cluster membership and connectivity are defined, and what the possible states and transitions are during a membership change, coupled with network partitions, coupled with hardware failures.
In other words you are fighting against the laws of physics. It is expensive and hard to do.
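To put rough numbers on "wait for every member", here is a minimal sketch in Python. The replica names and round-trip times are made up for illustration, not measurements, and `commit_latency_ms` is a hypothetical helper. It only models the waiting, not a real replication protocol.

```python
import math


def commit_latency_ms(round_trip_ms):
    """Latency of a write that waits for every replica to acknowledge.

    round_trip_ms maps a replica name to its round-trip time in
    milliseconds, or math.inf for a replica that never answers.
    """
    # The write is confirmed only when the slowest acknowledgment arrives.
    return max(round_trip_ms.values())


# Hypothetical round-trip times as seen from a client in Auckland.
healthy = {"auckland": 2, "new_york": 150, "london": 270}
one_down = {"auckland": 2, "new_york": 150, "london": math.inf}

print(commit_latency_ms(healthy))   # 270
print(commit_latency_ms(one_down))  # inf
```

The second call is the "stuck waiting forever" case: with one member unreachable, a write that insists on hearing from everyone never completes.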
In the case of the counters, one should ask whether it is worth it, or whether a CRDT-based counter (that will eventually converge) is good enough.
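For the page-hit example, a grow-only counter (often called a G-Counter) is one of the simplest CRDTs. Below is a minimal sketch in Python; the class name `GCounter` and the replica names are my own for illustration, and a real system would also need to ship the state between sites somehow.

```python
class GCounter:
    """Grow-only counter CRDT: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}  # replica id -> increments observed from that replica

    def increment(self, n=1):
        # A replica only ever bumps its own slot, so no coordination is needed.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + n

    def value(self):
        # The counter's value is the sum over all replicas' slots.
        return sum(self.counts.values())

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so merges can arrive in any order, or more than once.
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)


auckland = GCounter("auckland")
london = GCounter("london")

for _ in range(3):
    auckland.increment()  # three hits served locally in Auckland
for _ in range(5):
    london.increment()    # five hits served locally in London

# Before syncing, each site answers immediately with its own partial view.
print(auckland.value(), london.value())  # 3 5

# Later, the sites exchange state in the background and converge.
auckland.merge(london)
london.merge(auckland)
print(auckland.value(), london.value())  # 8 8
```

Nobody waits on a cross-planet round trip to record a hit. The price is that a read may be briefly out of date until the next merge.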
Even banks are eventually consistent. They choose to be available first. So you can withdraw $100 in New Zealand and then $100 in New York within a short period of time, even if you only have $100 in your account. The inconsistency is handled later, when you get a letter saying your account is overdrawn.