| I actually disagree with the first recommendation as written - specifically, not to set a CPU resource request to a small amount. It's not always as harmful as it might sound to the novice. It's important to understand that CPU resource requests are used for scheduling and not for limiting. As the author suggests, this can be an issue when there is CPU contention, but on the other hand, it might not be. That's because memory limits are even more important than CPU requests when scheduling: most applications use far more memory as a proportion of overall host resources than CPU. Let's take an example. Suppose we have a 64GB worker node with 8 CPUs in it. Now suppose we have a number of pods to schedule on it, each with a memory limit of 2GB and a CPU request of 1 millicore (0.001CPU). On this node, we will be able to accommodate 32 such pods. Now suppose one of the pods gets busy. This pod can have all the idle CPU it wants! That's because it's a request and not a limit. Now suppose all of the pods become fully CPU contended. The way the Linux scheduler works is that it will use the CPU request as a relative weight with respect to the other processes in the parent cgroup. It doesn't matter that they're small as an absolute value; what matters is their relative proportion. So if they're all 1 millicore, they will all get equal time. In this example, we have 32 pods and 8 CPUs, so under full contention, each will get 0.25 CPU shares. So when I talk to customers about resource planning, I actually usually recommend that they start with low CPU reservation, and optimize for memory consumption until their workloads dictate otherwise. It does happen that particularly greedy pods are out there, but that's not the typical case - and for those that are, they will often allocate all of a worker's CPUs in which case you might as well dedicate nodes to them and forget about how to micromanage the situation. |