by t90fan
1060 days ago
Running regular incremental repairs is the norm, as nodes will from time to time have trouble talking to each other for real-world network reasons, or will go down for things like OS patching. We had a daily cron job for it. I come from the software side, not the DBA side of things, but my main advice from running Cassandra at scale in production (it was part of an Apigee stack) is basically: don't! It was not very reliable, would consume huge amounts of memory (especially during repairs), bandwidth (a repair is very chatty as it has to sync lots of data) and disk space (tombstoning means deleted records take up space until compaction runs), and was generally not much fun to manage. It was also difficult to hire people who knew much about it. I would not build a solution myself using it going forward. We also had to do "full" repairs periodically (weekly) to work around Cassandra bugs, silent data corruption, etc.
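For anyone wondering what a daily incremental plus weekly full schedule might look like, here is a minimal sketch, not the actual script we ran. It assumes Cassandra 2.2 or later, where `nodetool repair` is incremental by default and `-full` forces a full repair. The function name `repair_command` and the choice of Sunday for the full repair are hypothetical.

```python
import datetime
from typing import List


def repair_command(today: datetime.date, full_repair_weekday: int = 6) -> List[str]:
    """Build the nodetool command for the given day.

    Assumption: Cassandra 2.2+, where a plain `nodetool repair` is incremental
    and `-full` requests a full repair. full_repair_weekday follows
    date.weekday() numbering (Monday = 0, Sunday = 6); Sunday is an arbitrary
    choice for illustration.
    """
    cmd = ["nodetool", "repair"]
    if today.weekday() == full_repair_weekday:
        cmd.append("-full")  # weekly full repair instead of the daily incremental one
    return cmd


if __name__ == "__main__":
    # A daily cron entry could call this script and run whatever it prints.
    print(" ".join(repair_command(datetime.date.today())))
```

In practice you would also want to stagger runs across nodes so they do not all repair at once, given how much memory and bandwidth a repair uses.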
https://www.youtube.com/watch?v=0QsLU9na2uE
But yes, you win some (mainly resilience, availability and disaster avoidance; tunable consistency may possibly help you too) and you lose some.
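On the tunable consistency point, the usual rule of thumb is that a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. A small sketch of that arithmetic follows; the function names are hypothetical and this is not a Cassandra API.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that make up a quorum (a strict majority)."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int,
                           write_replicas: int,
                           read_replicas: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    # Quorum writes plus quorum reads overlap on at least one replica.
    print(reads_see_latest_write(rf, q, q))
    # Writing to one replica and reading from one replica gives no such guarantee.
    print(reads_see_latest_write(rf, 1, 1))
```

Lower consistency levels buy you latency and availability at the cost of possibly stale reads, which is part of why regular repairs matter.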