Hacker News
by Thaxll 2122 days ago
It's pretty simple: limits only work when everyone is using them. If you have one pod that does not enforce limits, it can disrupt the entire node.
2 comments

A container with a request but without a limit should be scheduled as Burstable, and it should only receive allocations in excess of its request when all other containers have had their demand <= request satisfied.

A container without either a request or a limit is twice-damned, and will be scheduled as BestEffort. The entire cgroup slice for all BestEffort pods is given a cpu.shares of 2 (roughly 2 milliCPUs), and if the kernel scheduler is functioning well, no pod in there is going to disrupt anything but other BestEffort pods, no matter how much processor demand it has. Throw in a 64-thread busyloop and no Burstable or Guaranteed pods should notice much.
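
To make the classes above concrete, here is a minimal sketch of how a CPU request/limit pair maps to a QoS class and to a cpu.shares value. It is a simplification, not the kubelet's code: it looks at CPU only (the real classification also considers memory and every container in the pod), and the function names are mine. The 1024-shares-per-CPU conversion and the floor of 2 are the cgroup v1 values as I understand them.

```python
from typing import Optional

MIN_SHARES = 2            # floor applied to cpu.shares (assumed kubelet minimum)
SHARES_PER_CPU = 1024     # cgroup v1 shares corresponding to one full CPU
MILLI_CPU_PER_CPU = 1000


def qos_class(request_m: Optional[int], limit_m: Optional[int]) -> str:
    """Classify a single-container pod by its CPU settings only (memory ignored)."""
    if request_m is None and limit_m is None:
        return "BestEffort"
    # A limit with no request implies request == limit.
    if limit_m is not None and (request_m is None or request_m == limit_m):
        return "Guaranteed"
    return "Burstable"


def milli_cpu_to_shares(milli_cpu: int) -> int:
    """Convert a CPU request in milliCPUs to a cgroup v1 cpu.shares value."""
    if milli_cpu <= 0:
        return MIN_SHARES
    return max(MIN_SHARES, milli_cpu * SHARES_PER_CPU // MILLI_CPU_PER_CPU)


print(qos_class(100, None), milli_cpu_to_shares(100))   # request only
print(qos_class(None, None), milli_cpu_to_shares(0))    # nothing set
```

Under contention the kernel divides CPU time in proportion to these share values, which is why a slice sitting at the floor of 2 loses to practically everything else.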

Of course that's the ideal. There is an observable difference between a process that relinquishes its scheduler slice and one that must be pre-empted. But I wouldn't call that a major disruption. Each pod will still be given its full requested share of CPU.

If that's not the case, I'd love to know!

Are you sure that BestEffort QoS pods do not disrupt the entire node? I remember that in the past a single pod could freeze the entire VM.
I wrote a little fork+spinloop program with 100 subprocesses and deployed it with a low (100m) CPU request and no limit. It's certainly driving CPU usage to nearly 100% on all 8 cores of the machine, but the other processes sharing the node are doing fine.
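
For anyone who wants to repeat the experiment, a fork+spinloop burner can be sketched roughly like this. It is not the program used for the test above, just an illustration of the idea; `spin` and `burn` are hypothetical names, and it relies on `os.fork`, so it is Unix-only.

```python
import os
import time


def spin(seconds: float) -> None:
    """Busy-loop until the deadline without ever sleeping or yielding voluntarily."""
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        pass


def burn(children: int, seconds: float) -> int:
    """Fork `children` spinning subprocesses and return how many exited cleanly."""
    pids = []
    for _ in range(children):
        pid = os.fork()
        if pid == 0:
            # Child process: burn CPU, then exit without running parent cleanup.
            spin(seconds)
            os._exit(0)
        pids.append(pid)

    clean = 0
    for pid in pids:
        _, status = os.waitpid(pid, 0)
        if os.WIFEXITED(status) and os.WEXITSTATUS(status) == 0:
            clean += 1
    return clean
```

Calling `burn(100, 60.0)` inside a container would keep 100 processes runnable for a minute, which should be enough to saturate the cores and watch what happens to neighbouring pods.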

Prometheus scrapes of the kubelet have slowed down a bit, but are still under 400ms.

Prometheus scrape latency for the node kubelet has increased, but it's still sub-500ms.

Note that this cluster (which is on EKS) does have system reserved resources.

    [root@ip-10-1-100-143 /]# cat /sys/fs/cgroup/cpu/system.slice/cpu.shares
    1024
    [root@ip-10-1-100-143 /]# cat /sys/fs/cgroup/cpu/kubepods/cpu.shares
    8099
    [root@ip-10-1-100-143 /]# cat /sys/fs/cgroup/cpu/user.slice/cpu.shares
    1024
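
As a rough sanity check on those numbers, the sketch below works out what fraction of CPU time each of the three cgroups would get if all of them were CPU-bound at the same time. It assumes they are siblings at the same level of a cgroup v1 hierarchy and that nothing else competes at that level; a group that is idle gives up its share to the others. The function names are mine.

```python
def contention_fractions(shares: dict) -> dict:
    """Fraction of CPU time each sibling cgroup gets when all are fully busy."""
    total = sum(shares.values())
    return {name: value / total for name, value in shares.items()}


def shares_to_milli_cpu(shares: int) -> float:
    """Inverse of the 1024-shares-per-CPU convention."""
    return shares * 1000 / 1024


# Values copied from the cpu.shares output above.
observed = {"system.slice": 1024, "kubepods": 8099, "user.slice": 1024}

for name, fraction in contention_fractions(observed).items():
    print(f"{name}: {fraction:.1%}")
print(f"kubepods is roughly {shares_to_milli_cpu(observed['kubepods']):.0f} milliCPUs")
```

The kubepods value works out to a bit under 8 full CPUs, which fits with some CPU being held back for the system on this 8-core node.
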
Could you please elaborate on why that is so?