by rsanders
2117 days ago
Removing CPU limits seems like a bad idea now that there's a kernel fix. But putting that aside... I don't understand why pods without CPU limits would cause unresponsive kubelets.
For a long time now Kubernetes has allocated a slice for system services. While pods without CPU limits are allowed to burst, they are still limited to the amount of CPU allocated to Kubernetes pods. Run "systemd-cgls" on a node and you'll see two top-level slices: kubepods and system. The kubelet process lives within the system slice.
If you run "kubectl describe node <node>" you can see the resources set aside for system processes on the node. Processes in the system slice should always have (cpu_capacity - cpu_allocatable) available to share, no matter what happens in the kubepods slice.
Capacity:
cpu: 8
ephemeral-storage: 83873772Ki
memory: 62907108Ki
Allocatable:
cpu: 7910m
ephemeral-storage: 76224326324
memory: 61890276Ki
pods: 58
Granted, it's not a large proportion of CPU.
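To put a number on that, here is a rough sketch of the (cpu_capacity - cpu_allocatable) arithmetic using the Capacity and Allocatable figures above. The helper names parse_cpu and reserved_cpu are made up for illustration, and the parser only handles plain cores ("8") and millicores ("7910m"), not the full Kubernetes quantity syntax.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Parse a CPU quantity such as "8" or "7910m" into cores.

    Hypothetical helper: only whole/decimal cores and the "m"
    (millicore) suffix are handled.
    """
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return Decimal(quantity[:-1]) / 1000
    return Decimal(quantity)


def reserved_cpu(capacity: str, allocatable: str) -> Decimal:
    """CPU held back from pods: capacity minus allocatable, in cores."""
    return parse_cpu(capacity) - parse_cpu(allocatable)


if __name__ == "__main__":
    # Values copied from the node description quoted above.
    reserved = reserved_cpu("8", "7910m")
    share = reserved / parse_cpu("8")
    print(f"reserved for system processes: {reserved} cores")
    print(f"fraction of the node: {float(share):.4f}")
```

For this node that works out to 0.09 cores (90m), a bit over 1% of the 8 CPUs.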
|
What I did on the distribution I work on is tune the cgroup shares so control plane services are allocated CPU time ahead of pods (whether guaranteed, burstable, or best effort). We don't run anything as static containers, so this covers all the kube services, etcd, system services, etc.
Before this change in our distribution, IIRC, pods and the control plane had equal weighting, which made it possible for the kubelet or other control plane services to be starved if the system was very busy.
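A minimal sketch of why the weighting matters, assuming the usual proportional-share behaviour of cgroup CPU shares: when every sibling cgroup is busy, each one gets CPU in proportion to its shares. The cgroup names, the 50-pod count, and the share values below are hypothetical, not the actual settings from our distribution.

```python
def contended_cpu(shares: dict[str, int], total_cpus: float) -> dict[str, float]:
    """CPU each sibling cgroup gets when all of them are runnable at once.

    Models only the fully contended case; an idle cgroup's unused
    time would be redistributed to the others.
    """
    total = sum(shares.values())
    return {name: total_cpus * s / total for name, s in shares.items()}


def layout(control_plane_shares: int, pods: int = 50) -> dict[str, int]:
    """Hypothetical layout: one control-plane cgroup next to `pods` pod cgroups."""
    cgroups = {"control-plane": control_plane_shares}
    cgroups.update({f"pod-{i}": 1024 for i in range(pods)})
    return cgroups


if __name__ == "__main__":
    cpus = 8.0
    # Equal weighting: the control plane is just one of 51 equal claimants.
    equal = contended_cpu(layout(1024), cpus)
    # Boosted weighting: control plane gets 10x the shares of any one pod.
    boosted = contended_cpu(layout(10240), cpus)
    print(f"equal weighting:   {equal['control-plane']:.3f} CPUs")
    print(f"boosted weighting: {boosted['control-plane']:.3f} CPUs")
```

With these made-up numbers the control plane is squeezed to roughly 0.16 CPUs under equal weighting and gets about 1.33 CPUs once its shares are raised, which is the effect the tuning is after.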
We've also observed lots of other problems that can lead to the kubelet bouncing between Ready and NotReady, and those wouldn't be triggered by the limits.