Citation needed! :)

Joking snark aside, I'm actually doubtful this is true. Specifically, I don't recall the impetus for Hadoop (or Google's original MapReduce, as described in the '04 paper) being an all-in-memory workload.

Despite in-memory processing being repeatedly brought up in this sub-thread, I maintain that it's a niche use case and that disk-based data processing workloads are far more common.

ETA: Does anyone know of a canonical or early/initial document outlining the purpose, or at least design goals, of Hadoop?