| HN Mirror


by notmyname 3541 days ago

As a quick answer, the name comes from being able to recover data when some of it is "erased".

The only way to durably store data so that it survives a hardware failure (e.g. a drive dying) is to store more than one copy. Full replicas are the simplest way to do this, but you've got a relatively high overhead (e.g. store 1GB of data with 3x replicas, and you use 3GB of storage). Erasure codes are a way to effectively store fractional replicas, so you only use about 1.5x or 1.7x the size of the original data.
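To make the "fractional replica" idea concrete, here's a rough sketch. It is not how any particular storage system does it: it uses a single XOR parity fragment, which can only repair one lost fragment, whereas real deployments typically use something like Reed-Solomon codes with several parity fragments. The function names and the k/m values (e.g. 10 data + 5 parity to get 1.5x) are made up for illustration.

```python
from functools import reduce
from typing import List, Optional


def storage_overhead(k: int, m: int) -> float:
    """Bytes stored per byte of original data with k data + m parity fragments."""
    return (k + m) / k


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal fragments (zero-padded) plus one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    parity = reduce(xor_bytes, frags)
    return frags + [parity]


def recover(frags: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one erased fragment (marked None) by XORing the survivors."""
    missing = [i for i, f in enumerate(frags) if f is None]
    if len(missing) > 1:
        raise ValueError("a single parity fragment can only repair one erasure")
    repaired = list(frags)
    if missing:
        survivors = [f for f in frags if f is not None]
        repaired[missing[0]] = reduce(xor_bytes, survivors)
    return repaired


# Hypothetical layouts: 10 data + 5 parity and 10 data + 7 parity fragments.
print(storage_overhead(10, 5))  # 1.5
print(storage_overhead(10, 7))  # 1.7

data = b"erasure codes!"
pieces = encode(data, k=4)   # 4 data fragments + 1 parity fragment
pieces[2] = None             # simulate a dead drive "erasing" one fragment
restored = recover(pieces)
print(b"".join(restored[:4])[:len(data)] == data)  # True
```

With 4 data fragments and 1 parity fragment you store 1.25x the original data and can lose any one fragment. Tolerating more simultaneous failures needs more parity fragments and a stronger code than plain XOR.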

Erasure codes are great when you've got a lot of data and you need high durability but don't want to pay for the storage space required for full replicas.

Why don't we always use erasure codes for everything? EC isn't great when you've got small bits of data, and since there's a bit of math involved in reading and writing the EC data, EC has higher latency than simple replicas.

https://www.swiftstack.com/blog/2015/04/20/the-foundations-o... is a great intro to how erasure codes work.