by jedberg 4043 days ago
Like I said, it depends on how you lay out your data. Let's say you have three datacenters, and you lay out your data such that there is one copy in each datacenter (this is how Netflix does it, for example).

You could then lose an entire datacenter (1/3 of the machines) and the cluster will just keep on running with no issues.

You could lose two datacenters (2/3 of the machines) and still serve reads as long as you're using READ ONE (which is what you should be doing most of the time).
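To make the arithmetic concrete, here is a minimal sketch of the layout described above: replication factor 3 with one replica per datacenter. The `REQUIRED` table and `can_serve` function are hypothetical names for illustration, not Cassandra's API, and the model assumes losing a datacenter means losing exactly one replica of every key.

```python
# Hypothetical model: RF=3, one replica of every key in each of three datacenters.
# Replicas that must respond for each consistency level at RF=3.
REQUIRED = {"ONE": 1, "QUORUM": 2, "ALL": 3}

def can_serve(level: str, live_datacenters: int) -> bool:
    """True if enough replicas are still alive to satisfy the consistency level."""
    return live_datacenters >= REQUIRED[level]

if __name__ == "__main__":
    for down in range(4):
        live = 3 - down
        status = {level: can_serve(level, live) for level in REQUIRED}
        print(f"{down} datacenter(s) down: {status}")
```

Under these assumptions, ONE survives the loss of two datacenters, QUORUM survives the loss of one, and ALL survives none.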

1 comments

If you read and write at ONE (which I think Netflix does) then this kind of works. Still, with virtual nodes, losing a single node in each DC leaves you with some portion of the keyspace inaccessible.
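A rough simulation of the virtual-node point, as a sketch rather than a model of any real cluster: each DC gets its own ring of randomly placed vnode tokens, one node per DC is killed, and a token counts as inaccessible when its owner is dead in every DC. All names and parameters here (`build_ring`, `owner`, `unavailable_fraction`, 6 nodes, 256 vnodes) are assumptions made for illustration.

```python
import bisect
import random

def build_ring(nodes, vnodes, rng):
    """Sorted (token, node_id) pairs for one datacenter; tokens are random in [0, 1)."""
    return sorted((rng.random(), node) for node in range(nodes) for _ in range(vnodes))

def owner(ring, token):
    """Node owning `token`: the first vnode at or after it, wrapping around the ring."""
    idx = bisect.bisect_left(ring, (token, -1))
    return ring[idx % len(ring)][1]

def unavailable_fraction(nodes=6, vnodes=256, dcs=3, samples=20000, seed=1):
    """Estimate the share of the keyspace whose replica is dead in every DC."""
    rng = random.Random(seed)
    rings = [build_ring(nodes, vnodes, rng) for _ in range(dcs)]
    dead = [rng.randrange(nodes) for _ in range(dcs)]  # one dead node per DC
    lost = 0
    for _ in range(samples):
        token = rng.random()
        if all(owner(ring, token) == d for ring, d in zip(rings, dead)):
            lost += 1
    return lost / samples

if __name__ == "__main__":
    print(f"inaccessible share of keyspace: {unavailable_fraction():.4%}")
```

If ownership in each DC is roughly independent, the inaccessible share should be on the order of (1/nodes)^dcs: small, but not zero, because with many vnodes per node almost any combination of one dead node per DC overlaps somewhere on the ring.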

You're also susceptible to total loss of data, since at any given time there will be data that hasn't yet been replicated to another DC, and you have to be OK with inconsistent reads.

That works for some applications where the data isn't mission critical and (immediate) consistency doesn't matter, but it doesn't for many others. I'm not sure what exactly Netflix puts in Cassandra, but if, e.g., it's used to record what people are watching, then losing a few records or looking at a not fully consistent view of the data isn't a big deal...