Hacker News new | ask | show | jobs
by karolinepauls 193 days ago
> Python: ~60% waste (Mostly sized for startup spikes, then idles empty).

I understand we're talking about CPU in case of Python and memory for Java and Go. While anxious overprovisioning of memory is understandable, doing the same for CPU probably means lack of understanding of the difference between CPU limits and CPU requests.

Since I've been out of DevOps for a few years, is there ever a reason not to give each container the ability to spike up to 100% of 1 core? Scheduling of mass container startup should be a solved problem by now.

3 comments

I don't think there is. You should set both and limit doesn't need to match request for CPU.

Your limit should roughly be "what should this application use if it goes full bore" and your request should be "what does this use at steady state".

At least at my company, the cluster barely uses any CPU even during the busiest hours. We are fairly over provisioned there because a lot of devs are keeping limit and request the same.

CPU bursting is safe you just get throttled. Memory bursting is dangerous you get OOMKilled.

That's why Python numbers look so bad here devs set the request high enough to cover that initial model loading spike so they don't crash during a rollout, even if they idle at 10% usage afterwards.

What makes you think they're talking about CPU? It reads to me like it's memory.
Two things - the word "idles" and the nature of CPython's allocator which generally doesn't return memory to the OS but reuses it internally. So you cannot really "spike" memory usage, only grow it.