by quadruplebond 2933 days ago
Most machines will allow backfilling if your jobs are short enough and small enough. But that will still leave some of the machine underutilized.

I don't have any customer data on hand at the moment, but I'd roughly describe it as basic scheduling in strict priority order getting the system to 80+% usage, and then backfill boosting that up to 90+%. Careful backfill tuning and a carefully defined priority structure get you to 95+%.
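
To make the backfill step concrete, here is a minimal sketch of the core check, assuming a single reservation for the top-priority job. It is a simplified illustration, not any particular scheduler's algorithm: the `Job` fields, `pick_backfill`, and `reservation_start` are all names made up for this example, and real schedulers also consider jobs that can run past the reservation on nodes the reserved job does not need.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    name: str
    nodes: int      # nodes requested
    walltime: int   # requested runtime limit, in seconds


def pick_backfill(queue: List[Job], free_nodes: int, now: int,
                  reservation_start: int) -> List[Job]:
    """Pick lower-priority jobs that can start now without delaying
    the top-priority job, which is assumed to hold a reservation
    beginning at reservation_start.

    queue is in priority order. A job is backfilled only if it fits
    in the currently idle nodes AND its walltime limit ends before
    the reservation begins (a deliberately conservative rule).
    """
    started = []
    for job in queue:
        fits = job.nodes <= free_nodes
        done_in_time = now + job.walltime <= reservation_start
        if fits and done_in_time:
            started.append(job)
            free_nodes -= job.nodes
    return started


# 4 idle nodes for the next 1800 s, until the big job's reservation.
queue = [
    Job("big", nodes=8, walltime=3600),    # too wide for the gap
    Job("small", nodes=2, walltime=600),   # fits
    Job("long", nodes=2, walltime=7200),   # fits, but runs too long
    Job("tiny", nodes=1, walltime=300),    # fits
]
backfilled = pick_backfill(queue, free_nodes=4, now=0, reservation_start=1800)
print([job.name for job in backfilled])
```

This is why short and small jobs matter so much: only jobs that are both narrow enough and short enough can slot into the gap, and whatever is left over (one node here) sits idle until the reservation starts.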

97% is the highest specific value I can recall for any of the larger sites with a heavily mixed workload, absent having a nearly infinite supply of short+small jobs at hand to use to fill those gaps. And a lot of users aren't trying to chase that - for these large-scale "capability" systems the goal is to scale out as large as you can anyway; the smaller stuff is usually relegated to "capacity" systems elsewhere with a less expensive architecture.

One thing that at least some schedulers can manage is the idea of a min+max runtime for a job, combined with a min+max node/cpu count. If you have users willing to 'scavenge' otherwise wasted time by running under such a regime, that can put you closer to full usage.
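
A rough sketch of how a min+max request could be matched against an idle gap, under my own assumptions: `FlexibleJob` and `fit_into_gap` are hypothetical names for illustration, not a real scheduler's API, and the job is assumed to be able to use any node count and runtime within its stated bounds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FlexibleJob:
    min_nodes: int
    max_nodes: int
    min_runtime: int  # seconds; shortest run that is still useful
    max_runtime: int  # seconds; longest run the job can use


def fit_into_gap(job: FlexibleJob, free_nodes: int,
                 free_seconds: int) -> Optional[Tuple[int, int]]:
    """Return the (nodes, runtime) to grant the job inside an idle gap,
    or None if the gap is smaller than the job's minimums."""
    if free_nodes < job.min_nodes or free_seconds < job.min_runtime:
        return None
    # Grant as much of the gap as the job is able to use.
    return (min(free_nodes, job.max_nodes),
            min(free_seconds, job.max_runtime))


scavenger = FlexibleJob(min_nodes=2, max_nodes=16,
                        min_runtime=600, max_runtime=7200)
print(fit_into_gap(scavenger, free_nodes=10, free_seconds=1800))
print(fit_into_gap(scavenger, free_nodes=1, free_seconds=1800))
```

The point is that a fixed-size request either fits a gap or it doesn't, while a job with ranges can be shaped to whatever gap happens to be open.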