	by brianwhitman 5400 days ago
	the EN song data is dense in the sense that there are far more "columns" than rows in almost any bulk analysis: the average song unpacks to ~2000 segments, each with ~30 coefficients plus global features. however, in paul's case here he's really just using MR as a quick way to run a parallel computation on many machines. There's no reduce step; it's just taking a single average from each individual song, not correlating anything or using any inter-song statistics.
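
to make the "no reduce step" point concrete, here's a rough sketch of what a map-only job looks like. this is not paul's actual code: the song layout, the `value` field, and the function names `song_average` and `map_only_job` are all made up for illustration, and a thread pool stands in for the cluster.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import fmean


def song_average(song):
    """Map step: collapse one song's segments into a single average.

    Each call looks at exactly one song, so no inter-song statistics
    are involved. The "value" field is a hypothetical per-segment number.
    """
    values = [segment["value"] for segment in song["segments"]]
    return song["id"], fmean(values)


def map_only_job(songs, workers=4):
    """Run the map step over every song in parallel.

    There is no reduce step: the output is just one (id, average)
    pair per input song, in the same order as the input.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(song_average, songs))


# Toy input (hypothetical data): two songs with a handful of segments.
songs = [
    {"id": "a", "segments": [{"value": 1.0}, {"value": 3.0}]},
    {"id": "b", "segments": [{"value": 10.0}]},
]
results = map_only_job(songs)
```

since every song is handled on its own, the work splits across machines with no shuffle or aggregation phase. anything that compares songs against each other would need a real reduce.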