by gopalv 1836 days ago
> uses cases for Reed-Solomon at the application level

Hadoop has an RS implementation inside the filesystem (called "erasure coding"). Instead of storing 3 copies of the same data, it can store roughly 1.5 copies' worth, striped as 6 data + 3 parity units (1.5x) or 10 data + 4 parity units (1.4x).
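To make the storage arithmetic concrete, here is a rough sketch of the overhead calculation. The function names are made up for illustration and are not part of any Hadoop API; the only inputs are the (data + parity) splits mentioned above.

```python
def storage_overhead(data_units: int, parity_units: int) -> float:
    """Raw bytes stored per byte of user data for a (data + parity) layout."""
    return (data_units + parity_units) / data_units


def tolerated_losses(data_units: int, parity_units: int) -> int:
    """An MDS code such as Reed-Solomon can rebuild from any `data_units`
    surviving units, so it tolerates losing up to `parity_units` of them."""
    return parity_units


if __name__ == "__main__":
    # 3x replication can be viewed as 1 data unit plus 2 extra copies.
    for name, k, m in [("3x replication", 1, 2), ("RS(6+3)", 6, 3), ("RS(10+4)", 10, 4)]:
        print(f"{name}: {storage_overhead(k, m):.2f}x storage, "
              f"survives {tolerated_losses(k, m)} lost units")
```

So (6+3) costs half the raw storage of 3 replicas while surviving one more lost unit, and (10+4) is slightly cheaper again.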

Previously, I've run into this tech in satellite internet gateways, but distributed filesystems are where I've gone through the math & probabilities of failure properly.
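For a flavour of that failure math, here is a simplified sketch. It assumes every unit fails independently with the same probability `p` and that nothing gets repaired in between, which real clusters do not satisfy (failures are correlated and re-replication runs in the background), so treat it as a toy model. The function name and the value of `p` are purely illustrative.

```python
from math import comb


def loss_probability(data_units: int, parity_units: int, p: float) -> float:
    """Probability that more than `parity_units` of the stripe's units fail,
    assuming independent failures with per-unit probability `p` (toy model)."""
    n = data_units + parity_units
    return sum(
        comb(n, failed) * p**failed * (1 - p) ** (n - failed)
        for failed in range(parity_units + 1, n + 1)
    )


if __name__ == "__main__":
    p = 0.01  # hypothetical per-unit failure probability, not a measured value
    # 3x replication modelled as 1 data unit + 2 extra copies: lost only if all 3 fail.
    print("3x replication:", loss_probability(1, 2, p))
    print("RS(6+3):       ", loss_probability(6, 3, p))
    print("RS(10+4):      ", loss_probability(10, 4, p))
```

The wider stripe has more units that can fail, but it also needs more simultaneous failures before data is lost, and this is the trade-off the binomial sum captures.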

I work on perf, and two things become hot spots when you are trying to keep all cores busy: the extra network hops (with 3 replicas you can read 100% of the data locally; once it's striped, that no longer works) and the math for the error correction.