by Sparkyte 807 days ago
Kind of an old article. It's a pretty straightforward thing to do. If you spend enough time accurately load testing your environments, you can dial in the container resources and shave off thousands of dollars. Lots of places are too scared of under-allocating. Limit and request exist for a reason: request is what is always guaranteed, and limit is the ceiling for surges. It is okay to exceed your request as long as you add a scaling policy to balance out the surge. And be cautious with request and limit on memory; not all applications benefit from this.
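To make the request/limit split concrete, here is a rough Python sketch of turning load-test samples into a pair of values. The function name `suggest_cpu`, the choice of the median for the request, the 20% headroom on the peak, and the sample numbers are all assumptions for illustration, not anything from the article or from Kubernetes itself.

```python
def suggest_cpu(samples_millicores, headroom_pct=20):
    """Return a (request, limit) pair in millicores from load-test CPU samples.

    Hypothetical heuristic: request = median usage (the steady state to
    guarantee), limit = observed peak plus headroom (room for surge).
    """
    if not samples_millicores:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_millicores)
    # Nearest-rank median: ceil(n / 2) as a 1-based rank, i.e. index (n - 1) // 2.
    request = ordered[(len(ordered) - 1) // 2]
    # Integer ceiling of peak * (1 + headroom_pct / 100), avoiding float rounding.
    limit = (ordered[-1] * (100 + headroom_pct) + 99) // 100
    return request, limit


# Made-up CPU samples (millicores) from a load test.
samples = [180, 200, 210, 220, 250, 400, 650]
request, limit = suggest_cpu(samples)
print(f"requests.cpu={request}m limits.cpu={limit}m")
```

The gap between the two numbers is the surge window a scaling policy would need to cover. The same heuristic is probably too aggressive for memory, since a container that exceeds its memory limit gets killed rather than throttled.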
2 comments

They're automatically predicting the limit _and_ figuring out binpacking into hyperthreaded CPUs and NUMA cores. K8s just pushes your supplied values down to the kernel, which is exactly what they're saying is inefficient.
It is indeed inefficient, so is this more like a Process Lasso approach to resource management?
If the number of servers needed for service A is proportional to the number of servers needed for services B through Z, then your whole cluster scales up and down together, and you end up hitting the max cluster size regularly instead of almost never. For private servers that’s a big problem. But even if you’re a large enough customer of a cloud provider, it can still be a problem.

You save money still, but you don’t solve your capacity problems by doing so.
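A toy Python sketch of the scale-together point, with made-up numbers: two hypothetical services, A and B, with per-hour server counts. When B's demand is proportional to A's, the cluster peak equals the sum of the individual peaks. When the same per-service values land at different hours, the cluster peak is lower. The helper `cluster_peak` and all the figures are assumptions for illustration.

```python
def cluster_peak(demand_by_service):
    """Return the largest total server count across time slots.

    demand_by_service maps a service name to a list of per-slot server
    counts; all lists are assumed to have the same length.
    """
    totals = [sum(slot) for slot in zip(*demand_by_service.values())]
    return max(totals)


# Hypothetical hourly server counts for service A.
base = [2, 3, 8, 3]

# B is proportional to A, so both peak in the same hour.
correlated = {"A": base, "B": [2 * x for x in base]}

# B needs the same counts as before, but its peak falls in a different hour.
staggered = {"A": base, "B": [16, 6, 4, 6]}

print("correlated peak:", cluster_peak(correlated))
print("staggered peak:", cluster_peak(staggered))
```

Right-sizing each service shrinks every number in these lists, which is where the savings come from, but it does nothing about the peaks lining up.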