by pdeva1 2471 days ago
correct me if i am wrong here:

1. burger case, i do: count = count-1. if both txs see count=1, we get count=0 at the end.

2. i didnt say banking per se. can involve a simple billing system of a startup. or any critical data where you need to ensure you are reading accurate, uptodate values. maybe a leaderboard.

>You'll likely see a current value most of the time

'most' is not a guarantee :) either the system is designed with seeing uptodate values or not. and if 'some' of the time the value is stale, you have to program with low consistency in mind.

the marketing on yugabyte's page makes it seem it can replace db's like cassandra and give you a consistent view of your data. but if one is seeing stale values, you are back to coding like data is non-consistent

2 comments

Cassandra is designed for eventual consistency across multiple independent regions.
>correct me if i am wrong here... i do: count = count-1 if both txs see count=1, we get count=0 at the end.

I, er, don't mean to be rude, but... both grogers and I have already explained that this idea is somewhat less than correct. The anomaly you're describing is called lost update, and is explicitly prohibited by both snapshot isolation and serializable isolation, the two isolation levels supported by YugaByte DB. Linearizability is not necessary to prevent lost updates. This is not only a theoretical fact, but supported by experimental evidence: we have extensively tested YugaByte DB for lost updates, and have not (yet) observed any.
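To make the lost-update point concrete, here is a minimal sketch of the burger case under snapshot isolation with a first-committer-wins rule. This is a toy in-memory model, not YugaByte DB's implementation; `ToyStore`, `ToyTxn`, `WriteConflict`, and `run_burger_example` are hypothetical names made up for illustration.

```python
class WriteConflict(Exception):
    """Raised when a transaction wrote a key that changed after its snapshot."""


class ToyStore:
    """Toy in-memory store (hypothetical, for illustration only)."""

    def __init__(self, initial):
        self.commit_ts = 0
        # key -> (value, timestamp of the commit that last wrote it)
        self.data = {k: (v, 0) for k, v in initial.items()}

    def begin(self):
        return ToyTxn(self)


class ToyTxn:
    def __init__(self, store):
        self.store = store
        self.start_ts = store.commit_ts
        # Freeze the values visible when the transaction starts.
        self.snapshot = {k: v for k, (v, _) in store.data.items()}
        self.writes = {}

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First committer wins: abort if any key we wrote was committed
        # by another transaction after our snapshot was taken.
        for key in self.writes:
            _, written_at = self.store.data[key]
            if written_at > self.start_ts:
                raise WriteConflict(key)
        self.store.commit_ts += 1
        for key, value in self.writes.items():
            self.store.data[key] = (value, self.store.commit_ts)


def run_burger_example():
    store = ToyStore({"count": 1})
    t1, t2 = store.begin(), store.begin()
    # Both transactions see count=1 and try to decrement it.
    t1.write("count", t1.read("count") - 1)
    t2.write("count", t2.read("count") - 1)
    t1.commit()
    try:
        t2.commit()
        second = "committed"
    except WriteConflict:
        second = "aborted"
    return store.data["count"][0], second
```

In this model the second transaction aborts instead of silently overwriting the first, so only one burger is sold. A retry would start from a fresh snapshot and see count=0.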

>i didnt say banking per se. can involve a simple billing system of a startup. or any critical data where you need to ensure you are reading accurate, uptodate values. maybe a leaderboard.

While linearizability may be a nice property for these systems to have, it is rarely necessary. Billing systems and leaderboards, like bank ledgers and shopping carts, are often designed as append-only ledgers with eventual consistency, employing sealing windows, compensating transactions, and time-shifting to handle late discovery of events. Others are designed as reports periodically derived from some underlying datastore via, say, an ETL process; data may be not only milliseconds, but even days or weeks out of date. That's not to say this is a universal pattern, but I can think of a few dozen systems off the top of my head.
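As a rough illustration of the append-only pattern, the sketch below corrects a late-discovered mistake with a compensating entry rather than an in-place update. It is a simplified, hypothetical model (`Entry`, `Ledger`, and `example_balance` are invented names), and it leaves out sealing windows and time-shifting entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    entry_id: int
    amount_cents: int
    note: str


class Ledger:
    """Append-only ledger: entries are never edited or deleted."""

    def __init__(self):
        self._entries = []

    def append(self, amount_cents, note):
        entry = Entry(len(self._entries) + 1, amount_cents, note)
        self._entries.append(entry)
        return entry

    def compensate(self, entry, note):
        # Record an offsetting entry instead of rewriting history.
        return self.append(-entry.amount_cents,
                           f"reverses #{entry.entry_id}: {note}")

    def balance(self):
        return sum(e.amount_cents for e in self._entries)


def example_balance():
    ledger = Ledger()
    ledger.append(5000, "invoice")
    wrong = ledger.append(1200, "usage charge (later found to be wrong)")
    ledger.compensate(wrong, "late-arriving correction")
    ledger.append(900, "corrected usage charge")
    return ledger.balance()
```

A reader who sees the ledger before the correction lands gets a stale total, but the history stays intact and the total converges once the compensating entry arrives.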

I know this because I've worked on several systems like this, including in fintech, and consulted for companies and government organizations building others. I can also offer some anecdotal experience here. For example, I have no fewer than three emails in my inbox from AWS's billing system informing me of missing data or other mistakes in the previous month's bill, accompanied by updated reports providing newer data. Last summer, my bank deferred visibility of some transaction data for over six weeks.

This might feel counter-intuitive, but since money is fungible and addition is commutative, it's one of the easiest things to work with in a stale or even eventually-consistent manner. Where you really want linearizability (or at least sequential consistency) is in domains where operations don't commute. Linearizability is particularly important where those non-commutative operations involve side-channels, but that's a longer discussion for another time.
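A small sketch of why commutativity matters here: if every operation is an addition, any delivery order yields the same final balance, whereas mixing in operations that don't commute makes the result depend on order. The operations and the helper names `apply_all` and `distinct_outcomes` are made up for illustration.

```python
from itertools import permutations


def apply_all(start, ops):
    """Apply each operation in sequence to the starting value."""
    value = start
    for op in ops:
        value = op(value)
    return value


def distinct_outcomes(start, ops):
    """Set of final values over every possible ordering of ops."""
    return {apply_all(start, order) for order in permutations(ops)}


# Credits and debits are all additions, so they commute.
deposits = [lambda b: b + 20, lambda b: b - 5, lambda b: b + 100]

# Add, double, and reset-to-zero do not commute with each other.
mixed = [lambda b: b + 20, lambda b: b * 2, lambda b: 0]

print(distinct_outcomes(10, deposits))        # {125}
print(sorted(distinct_outcomes(10, mixed)))   # [0, 20, 40]
```

With the additive operations, replicas that apply the same events in different orders still agree in the end. With the mixed set they can disagree, which is where an ordering guarantee starts to matter.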

>'most' is not a guarantee

This is true, but there is a qualitative difference between a system which exhibits, say, a 5ms stale read once per thousand transactions during clock skew, and one which exhibits a 10-day stale read in one of every two transactions, all the time. I mention these numbers to provide a rough characterization of anomaly frequency and severity.

>the marketing on yugabyte's page makes it seem it can replace db's like cassandra and give you a consistent view of your data. but if one is seeing stale values, you are back to coding like data is non-consistent

Linearizability is not the only form of consistency model, and many models have no realtime properties whatsoever. You might find https://jepsen.io/consistency informative.

thanks for clarifying. appreciate the detailed answer.