by hardwaresofton 1069 days ago

Thanks for this insight -- this is one of the first times I've heard someone contrast FoundationDB and Cassandra, which is nice.

> Example of fairly standard Cassandra bug (don't know if present on latest release, certainly was a year or two ago): When you add a new node to the cluster, it 'bootstraps', where it copies ~1/n the data from other nodes. When you are done bootstrapping, it's copied a bunch of data from other nodes, but the other nodes still contain that data. You then run 'cleanups' on the other nodes to remove the (now stale and unusable) data so as to get your disk space back.

Interesting, seems like there are a lot of little bits of knowledge like this needed to run the service properly... I guess managed Cassandra offerings have more value to add.

> If you accidentally run a cleanup on the new node as it is being bootstrapped, it will succeed, you will delete all the data that's been copied over so far, and Cassandra will _not_ terminate the bootstrap. Everything will be green, but your new node will suddenly be using 0 disk space. When the bootstrap finishes, possibly days later, your cluster will be immediately corrupted due to violated replication guarantees - but only on data that hasn't been read or written over that period, because if it was written it'll be re-replicated, and if it was read Cassandra will silently repair at this time. Repairs resolve the issue, but if you've made this mistake due to scripting, if you get unlucky it's possible to just delete all replicas of some data between repairs.

This seems... really bad -- I don't think I have the skill to run a Cassandra cluster (and I don't have enough use cases to run it as a hobby and find these edge cases)...
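For the "mistake due to scripting" case, I'd guess a guard like the one below would help: have the script refuse to run a cleanup unless the node reports that it has finished joining. This is only a rough sketch. It assumes `nodetool netstats` prints a `Mode:` line (e.g. `Mode: NORMAL` or `Mode: JOINING`), and the helper names (`node_mode`, `safe_to_cleanup`, `guarded_cleanup`) are made up for illustration, so check the output format on your version before trusting it.

```python
import subprocess
from typing import Optional


def node_mode(netstats_output: str) -> Optional[str]:
    """Return the value of the 'Mode:' line from `nodetool netstats` output.

    Assumption: the output contains a line such as 'Mode: NORMAL' or
    'Mode: JOINING'. Returns None if no such line is found.
    """
    for line in netstats_output.splitlines():
        line = line.strip()
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip().upper()
    return None


def safe_to_cleanup(netstats_output: str) -> bool:
    """Allow cleanup only when the node explicitly reports NORMAL.

    Anything else (JOINING, an unknown mode, or unparseable output)
    is treated as unsafe, so the guard fails closed.
    """
    return node_mode(netstats_output) == "NORMAL"


def guarded_cleanup(host: str) -> None:
    """Hypothetical wrapper: check the node's mode, then run cleanup."""
    result = subprocess.run(
        ["nodetool", "-h", host, "netstats"],
        capture_output=True, text=True, check=True,
    )
    if not safe_to_cleanup(result.stdout):
        raise RuntimeError(f"refusing cleanup on {host}: node is not in NORMAL mode")
    subprocess.run(["nodetool", "-h", host, "cleanup"], check=True)
```

The parsing is kept separate from the `nodetool` calls so the check can be tested against captured output without a live cluster. It only covers the one failure mode described above, not the general problem of knowing which operations are safe when.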

This sounds like the space for a consultancy to make a tidy killing though.