by fizx 6488 days ago
I routinely run composite jobs of thousands of map-reduces. Imagine an iterative machine learning algorithm where each epoch is a map-reduce job. Imagine a meta-job that runs dozens of these. It's not too bad with Hadoop's job control system. Where Cascading really would shine (haven't tried it) is when these jobs require data joins.

Edit: This could be a case of map-reduce fail(tm). But I don't think so. I imagine Google's PageRank computation takes more than ten iterations to converge, and each iteration is a map-reduce job.
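
To make the "each iteration is a map-reduce job" point concrete, here is a rough single-process sketch in Python. It is not Hadoop code and certainly not how Google does it. The function names, the 0.85 damping factor, the tolerance, and the assumption that every page has at least one outlink are all mine, picked for illustration.

```python
from collections import defaultdict

DAMPING = 0.85  # conventional damping factor; an assumption, not from the comment


def map_phase(graph, ranks):
    """Map step: each page emits (target, share) pairs, splitting its rank among outlinks."""
    for page, outlinks in graph.items():
        for target in outlinks:
            yield target, ranks[page] / len(outlinks)


def reduce_phase(graph, pairs):
    """Reduce step: sum the shares received by each page, then apply damping."""
    totals = defaultdict(float)
    for target, share in pairs:
        totals[target] += share
    n = len(graph)
    return {page: (1 - DAMPING) / n + DAMPING * totals[page] for page in graph}


def pagerank(graph, tolerance=1e-6, max_iterations=100):
    """Driver ("meta-job"): run one map-reduce pass per iteration until ranks settle.

    Assumes every page has at least one outlink (no dangling pages).
    Returns the final ranks and the number of passes that were run.
    """
    ranks = {page: 1.0 / len(graph) for page in graph}
    for iteration in range(1, max_iterations + 1):
        new_ranks = reduce_phase(graph, map_phase(graph, ranks))
        delta = sum(abs(new_ranks[p] - ranks[p]) for p in graph)
        ranks = new_ranks
        if delta < tolerance:
            break
    return ranks, iteration
```

Calling `pagerank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})` returns the ranks plus the number of passes it took. On a real cluster each pass through the loop would be a separately submitted job, which is why the job control overhead matters.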