by wickberg 2932 days ago
99% is a bit higher than most systems expect to run at - any mix of job sizes in the queue will tend to leave small gaps when nodes need to stay idle until jobs "fit" perfectly again. 95% is a pretty common target for these larger systems.

There isn't necessarily a lot of research into packing these better - the basic algorithms have been unchanged for quite some time, and a lot more effort goes into deciding how to prioritize the different groups that are sharing access to the same system.

Most machines will allow for backfilling if your jobs are short enough and small enough. But that will still leave some of the machine underutilized.
I don't have any customer data on hand at the moment, but I'd roughly describe it as basic scheduling in strict priority order getting the system to 80+% usage, and then backfill boosting that up to 90+%. Careful backfill tuning and care in defining the priority structure get you to 95+%.
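
To make the strict-priority vs. backfill difference concrete, here's a rough toy simulation. It is only a sketch under my own assumptions: the `Job` class, the `simulate` function, and the four-job queue are made up for illustration, jobs run for exactly their requested time, and only the head-of-queue job gets a reservation. Real schedulers do considerably more than this.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int    # nodes requested
    runtime: int  # requested wall time, in arbitrary whole time units


def simulate(jobs, total_nodes, backfill):
    """Run jobs (listed in priority order); return (makespan, utilization)."""
    if any(j.nodes > total_nodes for j in jobs):
        raise ValueError("a job requests more nodes than the system has")
    queue = list(jobs)
    running = []  # (end_time, nodes) for each running job
    now = 0
    busy_node_time = 0

    def start(job):
        nonlocal busy_node_time
        running.append((now + job.runtime, job.nodes))
        busy_node_time += job.nodes * job.runtime

    while queue or running:
        # Release nodes held by jobs that have finished.
        running = [(end, n) for end, n in running if end > now]
        free = total_nodes - sum(n for _, n in running)

        # Strict priority: start from the head of the queue while jobs fit.
        while queue and queue[0].nodes <= free:
            job = queue.pop(0)
            start(job)
            free -= job.nodes

        if backfill and queue:
            # "Shadow time": earliest point the blocked head job can start,
            # based on when running jobs are due to release their nodes.
            head = queue[0]
            avail, shadow = free, now
            for end, n in sorted(running):
                avail += n
                shadow = end
                if avail >= head.nodes:
                    break
            spare = avail - head.nodes  # nodes the head job will not need

            waiting = [head]
            for job in queue[1:]:
                done_by_shadow = now + job.runtime <= shadow
                # Backfill only if the head job's start is not delayed.
                if job.nodes <= free and (done_by_shadow or job.nodes <= spare):
                    start(job)
                    free -= job.nodes
                    if not done_by_shadow:
                        spare -= job.nodes
                else:
                    waiting.append(job)
            queue = waiting

        # Jump ahead to the next job completion.
        if running:
            now = min(end for end, _ in running)

    utilization = busy_node_time / (total_nodes * now) if now else 0.0
    return now, utilization


toy_queue = [Job("A", 6, 10), Job("B", 8, 10), Job("C", 4, 10), Job("D", 2, 5)]
for mode in (False, True):
    makespan, util = simulate(toy_queue, total_nodes=10, backfill=mode)
    print(f"backfill={mode}: makespan={makespan}, utilization={util:.0%}")
```

On this made-up queue, strict priority leaves 4 nodes idle while B waits behind A, and backfill slots C into that gap without delaying B. The exact percentages are an artifact of the toy queue, not site data.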

97% is the highest specific value I can recall for any of the larger sites with a heavily mixed workload, absent a nearly infinite supply of short+small jobs on hand to fill those gaps. And a lot of users aren't trying to chase that - for these large scale "capability" systems the goal is to scale out as large as you can anyway; the smaller stuff is usually relegated to "capacity" systems elsewhere with a less expensive architecture.

One thing that at least some schedulers can manage is the idea of a min+max runtime for a job, combined with a min+max node/cpu count. If you have users willing to 'scavenge' otherwise wasted time by running under such a regime, that can put you closer to full usage.
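
As a rough sketch of how a min+max job could be shaped to a gap: the names here (`FlexibleJob`, `fit_into_gap`) are hypothetical and not any particular scheduler's API, and the policy of "take as much of the gap as the job allows" is just one plausible choice.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FlexibleJob:
    min_nodes: int
    max_nodes: int
    min_runtime: int  # shortest useful run, e.g. in minutes
    max_runtime: int  # longest run the job can make use of


def fit_into_gap(job: FlexibleJob, free_nodes: int,
                 gap_length: int) -> Optional[Tuple[int, int]]:
    """Return (nodes, runtime) sized to the gap, or None if the job can't fit."""
    # Take as much of the gap as the job's upper bounds allow.
    nodes = min(job.max_nodes, free_nodes)
    runtime = min(job.max_runtime, gap_length)
    # Give up if the gap is smaller than the job's lower bounds.
    if nodes < job.min_nodes or runtime < job.min_runtime:
        return None
    return nodes, runtime


scavenger = FlexibleJob(min_nodes=2, max_nodes=16, min_runtime=30, max_runtime=240)
print(fit_into_gap(scavenger, free_nodes=6, gap_length=90))   # fits, shrunk to the gap
print(fit_into_gap(scavenger, free_nodes=1, gap_length=90))   # too few nodes
```

The point is that a job with ranges instead of fixed requests can be resized to whatever hole the scheduler happens to have, which is what lets it soak up time that would otherwise go unused.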