|
RocksDB is a fork of LevelDB, which was [in]famous for its ease of corrupting data. Did Facebook ever do anything to ensure data wouldn't corrupt, or is that still a common thing operationally? (You find it more at larger scales) Here's an example of how data corruption can suck, with (example) Riak and LevelDB. The leveldb data would corrupt often, which would leave you in a predicament. Say you had 10 nodes with a 3 node replication factor, and the whole cluster is humming away at a decent clip. Now one node's leveldb corrupts, and you have to rebuild it. If you have a huge fuckoff dataset, this can take a while. Now another node goes down. Now only 1 node has the data you need, and 2 nodes are down - so now 8 nodes are doing the work of 10, and if you have any more failures, your data might be gone. Now add replication, which will suck performance and bandwidth away from the regular work. And because it would corrupt so easily & often, there needed to be hash trees to quickly identify what data was corrupt, and then you needed to fix it and rebuild your hash trees. This would also suck away performance. Finally, you can't just add new nodes while rebuilding, because the extra load makes the cluster fall over. And the more nodes, the higher the likelihood of failures. |