by dudeinjapan 973 days ago
What is missing from this picture is idleness. For example, suppose I have job A with a 10-second SLO and job B with a 5-minute SLO. If I only get a few Bs sporadically, I may want to define queue X=A only and queue Y=A,B, so that the compute left idle between Bs processes more As. In the wild, this is a delicate balancing act.
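
To make the X=A, Y=A,B idea concrete, here is a minimal sketch. It assumes each worker pool polls its job types in a fixed priority order; the names `POOLS`, `next_job`, and `pending` are made up for illustration and do not come from any particular queueing library.

```python
from collections import deque

# Hypothetical pool definitions: each pool lists the job types it serves,
# in the order it polls them. Pool Y falls back to B only when no A is waiting.
POOLS = {"X": ["A"], "Y": ["A", "B"]}

def next_job(pool, pending):
    """Return the next job for a worker in `pool`, or None if nothing it serves is pending."""
    for job_type in POOLS[pool]:
        if pending[job_type]:
            return pending[job_type].popleft()
    return None

# Two As and one B are waiting; a Y worker drains the As first, then the B.
pending = {"A": deque(["a1", "a2"]), "B": deque(["b1"])}
picked = [next_job("Y", pending) for _ in range(4)]
print(picked)
```

An X worker never touches B, so it stays idle (and available for the next A) even when Bs are waiting.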

Even if you get lots of Bs, if B takes 15s to run, and you ever get n_worker requests for B at the same time, immediately followed by a single A, you lose, even with plenty of capacity to spare.

You need either dedicated workers for low-latency tasks or some sort of preemption to meet SLOs with such heterogeneous tasks.
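
A rough simulation of that scenario, as a sketch only. It assumes 4 workers, a FIFO queue with no preemption, an A that takes 1s to run and arrives 0.1s after the burst; those numbers and the helper `wait_times` are illustrative assumptions, while the 15s B runtime and the 10s / 5min SLOs are the ones from the thread.

```python
import heapq

def wait_times(jobs, n_workers):
    """FIFO, non-preemptive simulation.

    jobs: list of (name, arrival_time, duration). Returns {name: time spent waiting in queue}.
    """
    free_at = [0.0] * n_workers  # min-heap of the times at which each worker becomes free
    heapq.heapify(free_at)
    waits = {}
    for name, arrival, duration in sorted(jobs, key=lambda j: j[1]):
        start = max(arrival, heapq.heappop(free_at))  # earliest-free worker takes the job
        waits[name] = start - arrival
        heapq.heappush(free_at, start + duration)
    return waits

N_WORKERS = 4  # assumed pool size
burst = [(f"B{i}", 0.0, 15.0) for i in range(N_WORKERS)]  # n_worker Bs at once
a_job = ("A", 0.1, 1.0)  # a single A right behind them

# Shared pool: every worker is busy with a B when A arrives.
shared = wait_times(burst + [a_job], N_WORKERS)

# Dedicated worker: one worker reserved for A, the rest serve B.
dedicated_a = wait_times([a_job], 1)
dedicated_b = wait_times(burst, N_WORKERS - 1)

print("A wait, shared pool:", shared["A"])
print("A wait, dedicated worker:", dedicated_a["A"])
print("worst B wait with one fewer worker:", max(dedicated_b.values()))
```

In the shared pool A waits almost the full 15s and misses its 10s SLO. With a dedicated worker A starts immediately, and the slowest B waits 15s and finishes at 30s, still far inside its 5-minute SLO.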

Right, you need to think about the queue wait time you are willing to tolerate, in addition to the time it takes a job to run. If you aren't willing to wait in the queue, then you will need to have idle capacity.