Hacker News
by saurik 5566 days ago
The increased durability based on snapshots is actually quite simple, and they explain it in various places: if one of the drives in Amazon's RAID fails, they need to bring up a new disk to replace it in the array. When they bring up a new disk they can typically do this instantaneously, because they really just dynamically page fault the drive from your latest snapshot. However, all dirty data since the last snapshot has to be copied from the other drive(s). This is a window of time during which an unrecoverable read error on the surviving drive(s) would lose data. The less dirty data you have, the smaller this window of time.
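
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The function names, the copy rate, and the per-bit error rate are all my own illustrative assumptions, not figures Amazon publishes, and the model naively treats bit errors as independent.

```python
import math


def rebuild_window_seconds(dirty_bytes: float, copy_bytes_per_sec: float) -> float:
    """Time to re-copy everything written since the last snapshot."""
    return dirty_bytes / copy_bytes_per_sec


def p_unrecoverable_read(dirty_bytes: float, ure_per_bit: float) -> float:
    """Chance of hitting at least one unrecoverable read error while
    reading dirty_bytes off the surviving drive(s)."""
    bits = dirty_bytes * 8
    # 1 - (1 - p)^bits, written with log1p/expm1 so tiny p does not round away
    return -math.expm1(bits * math.log1p(-ure_per_bit))


# Assumed values, for illustration only.
COPY_RATE = 100e6   # bytes/sec, hypothetical rebuild throughput
URE_RATE = 1e-14    # errors per bit read, a hypothetical drive-spec style figure

for dirty_gb in (1, 10, 100):
    dirty = dirty_gb * 1e9
    window = rebuild_window_seconds(dirty, COPY_RATE)
    risk = p_unrecoverable_read(dirty, URE_RATE)
    print(f"{dirty_gb:>4} GB dirty: window {window:.0f} s, risk {risk:.2e}")
```

Both the window and the risk scale with the amount of dirty data, so snapshotting more often shrinks both.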