by pjscott 5467 days ago
That's part of the big win of job abstraction systems like MapReduce: wrapping jobs in higher-level abstractions makes automated profiling and fancy scheduling feasible.

It doesn't even have to be anything that fancy. It could be along the lines of "run the job a hundred times on each type of machine and measure throughput, latency, and energy usage; then try to allocate each job to the best kind of processor for it." That shouldn't break anything.
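
A rough sketch of that selection step in Python, assuming the measurements already exist. The machine names, the numbers, the (items per second, latency in seconds, average watts) tuple layout, and the "lowest joules per item within a latency bound" rule are all made up for illustration; a real scheduler would pick its own criteria.

```python
from statistics import mean

def summarize(runs):
    """Average repeated (items_per_second, latency_seconds, watts) measurements."""
    throughput, latency, watts = zip(*runs)
    return {
        "throughput": mean(throughput),
        "latency": mean(latency),
        "watts": mean(watts),
    }

def best_machine(measurements, max_latency):
    """Return the machine type with the lowest energy per item
    among those whose mean latency meets the bound, or None."""
    joules_per_item = {}
    for machine, runs in measurements.items():
        s = summarize(runs)
        if s["latency"] <= max_latency:
            # watts / (items per second) = joules per item
            joules_per_item[machine] = s["watts"] / s["throughput"]
    if not joules_per_item:
        return None
    return min(joules_per_item, key=joules_per_item.get)

# Hypothetical profiling results for one job on two machine types.
measurements = {
    "big_core": [(100.0, 0.010, 50.0), (110.0, 0.012, 54.0)],
    "small_core": [(40.0, 0.030, 8.0), (44.0, 0.034, 10.0)],
}

print(best_machine(measurements, max_latency=0.05))
print(best_machine(measurements, max_latency=0.02))
```

With a loose latency bound the low-power machine wins on energy per item; tighten the bound and only the faster machine qualifies.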


How do you tell what kind of job you have without running it a hundred times?

Or do you mean run small portions of a large job before deciding where to commit the rest?

I meant each type of job; sorry for the ambiguity. So, for example, if you have a job type that is "look over a chunk of text and return term frequencies", you could test this on a few hundred chunks of text, and then use this profiling information to guide the scheduling any other time you run a job of this type.
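
Something like this, as a minimal sketch: profile the job type once on sample chunks, store the result under the job type's name, and reuse it later. `PROFILES`, `profile_job_type`, and `estimate_seconds` are hypothetical names, and this only times the job on the local machine; the real thing would repeat the measurement on each machine type.

```python
import time
from collections import Counter

def term_frequencies(chunk):
    """The example job type: count how often each term appears in a chunk of text."""
    return Counter(chunk.lower().split())

# Hypothetical profile store: job type name -> mean seconds per chunk.
PROFILES = {}

def profile_job_type(name, job, sample_chunks):
    """Run the job over sample chunks once and record the mean time per chunk."""
    start = time.perf_counter()
    for chunk in sample_chunks:
        job(chunk)
    elapsed = time.perf_counter() - start
    PROFILES[name] = elapsed / len(sample_chunks)
    return PROFILES[name]

def estimate_seconds(name, n_chunks):
    """Predict the cost of a new job of a known type from its stored profile."""
    return PROFILES[name] * n_chunks

# Profile on a few hundred sample chunks, then reuse the profile for later jobs.
samples = ["the quick brown fox jumps over the lazy dog"] * 300
profile_job_type("term_frequencies", term_frequencies, samples)
print(estimate_seconds("term_frequencies", n_chunks=10_000))
```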
Google has this covered in spades: http://research.google.com/pubs/pub36575.html