Anything that horizontally scales. Remember that scheduling pressure doesn't only add nodes; you can also preempt lower-priority workloads.

So maybe you have an application server that uses 1 CPU and 1G of RAM per instance, and can handle 10,000 requests per second. If you are getting 30,000 requests per second, then you'll want at least 2 more replicas (3 in total) to handle the load with an acceptable response time. You also run fuzz tests in the background at a very low priority. So scheduling 2 more application server replicas will cause those jobs to be preempted, and give your application a total of 3 CPUs and 3G of RAM.
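
To make the arithmetic concrete, here is a rough sketch in Python. It is a toy greedy packer, not how any real scheduler is implemented. The 4-CPU/4G node, the priority numbers, and the names `Workload`, `replicas_needed`, and `schedule` are all made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Workload:
    name: str
    priority: int  # higher number wins (assumed convention)
    cpus: int
    ram_gb: int


def replicas_needed(load_rps: int, capacity_rps: int) -> int:
    """Total replicas required to serve load_rps at capacity_rps each."""
    return math.ceil(load_rps / capacity_rps)


def schedule(node_cpus: int, node_ram_gb: int, workloads: list[Workload]):
    """Greedily place workloads in priority order; whatever no longer fits is preempted."""
    running, preempted = [], []
    free_cpus, free_ram = node_cpus, node_ram_gb
    for w in sorted(workloads, key=lambda w: w.priority, reverse=True):
        if w.cpus <= free_cpus and w.ram_gb <= free_ram:
            running.append(w)
            free_cpus -= w.cpus
            free_ram -= w.ram_gb
        else:
            preempted.append(w)
    return running, preempted


# 30,000 req/s at 10,000 req/s per replica: 3 replicas in total, i.e. 2 more than the 1 running.
total = replicas_needed(30_000, 10_000)

# Hypothetical 4-CPU / 4G node that was running 1 app server and 3 fuzz jobs.
app = [Workload(f"app-{i}", priority=100, cpus=1, ram_gb=1) for i in range(total)]
fuzz = [Workload(f"fuzz-{i}", priority=1, cpus=1, ram_gb=1) for i in range(3)]

running, preempted = schedule(4, 4, app + fuzz)
print("running:  ", [w.name for w in running])
print("preempted:", [w.name for w in preempted])
```

With those assumed numbers, the three app replicas and one fuzz job keep running and the other two fuzz jobs get preempted. The node stays fully used either way.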

Basically, with this sort of autoscaling, you are always using 100% of your computers for something, but when there is some business to do or money to be made, you can give the revenue-critical stuff priority.

As always, there are technicalities as to why you wouldn't want to do this. Maybe you think your fuzz testing is going to find a container escape and destroy the VM that it's running on, so you don't want production traffic anywhere near it. Or maybe the 10,000 requests per second that your application server can handle with 1 CPU actually uses all of the network capacity on the node, so you have to scale across other machines in order to handle any more requests. It all depends, but the flexibility is there to get yourself high utilization of your physical hardware.
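
The network caveat can be sketched the same way: a node's usable throughput is the smaller of what its CPUs can serve and what its network can carry. This is a back-of-the-envelope model only. The 4-CPU node and both network limits are assumed numbers, and `per_node_capacity` and `nodes_needed` are hypothetical helpers.

```python
import math


def per_node_capacity(node_cpus: int, cpus_per_replica: int,
                      rps_per_replica: int, network_limit_rps: int) -> int:
    """Requests per second one node can serve, capped by its network."""
    cpu_bound_rps = (node_cpus // cpus_per_replica) * rps_per_replica
    return min(cpu_bound_rps, network_limit_rps)


def nodes_needed(load_rps: int, node_cpus: int, cpus_per_replica: int,
                 rps_per_replica: int, network_limit_rps: int) -> int:
    """Nodes required to serve load_rps under the CPU and network limits."""
    capacity = per_node_capacity(node_cpus, cpus_per_replica,
                                 rps_per_replica, network_limit_rps)
    return math.ceil(load_rps / capacity)


# Plenty of network headroom (assumed 100,000 req/s): preemption on one node is enough.
cpu_limited = nodes_needed(30_000, 4, 1, 10_000, 100_000)

# Network saturates at 10,000 req/s: one replica fills the pipe, so you need 3 nodes.
network_limited = nodes_needed(30_000, 4, 1, 10_000, 10_000)

print(cpu_limited, network_limited)
```

In the second case preempting the fuzz jobs on the same node buys you nothing for the app server, because the spare CPUs can't push any more traffic through that node's network.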