by vlovich123 1001 days ago
That kind of thinking is a bit flawed, unfortunately. You might hit your peak for 20 minutes a day, but you’ve provisioned your system for that temporary worst case for the entire day, and other services are paying that penalty. If it’s the only thing you’re running, maybe. But in practice there are other things you want to run on the machine to improve the utilization rate (since services generally don’t all hit their peak simultaneously).
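
To make the provisioning argument concrete, here is a minimal sketch with made-up numbers (the two services and their hourly demand figures are purely hypothetical). It compares reserving each service's own peak all day against sizing for the worst combined demand, which only works if each service gives memory back once its peak has passed.

```python
# Hypothetical hourly memory demand (GiB) for two services on one machine.
# The numbers are invented for illustration only.
service_a = [2, 2, 8, 2, 2, 2]   # peaks in hour 2
service_b = [3, 3, 3, 3, 9, 3]   # peaks in hour 4

# Reserve every service's own worst case for the whole day.
sum_of_peaks = max(service_a) + max(service_b)

# Size for the worst combined demand actually observed in any hour.
peak_of_sum = max(a + b for a, b in zip(service_a, service_b))

print(sum_of_peaks, peak_of_sum)  # 17 11
```

With these assumed numbers, holding on to peak memory costs 6 GiB more than the workloads ever need at the same time.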

That’s why good modern allocators like mimalloc and tcmalloc return memory when they notice it’s going unused, so that other services running on the machine can use it. And this is in C++ land, where things are even more perf-sensitive.
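
As a rough illustration of the idea, here is a toy model of an allocator that caches freed pages and hands them back once they have sat idle for a while. It is not how mimalloc or tcmalloc are actually implemented; `ToyAllocator`, `idle_limit`, and the tick-based clock are all assumptions made up for this sketch.

```python
class ToyAllocator:
    """Toy model: caches freed pages and releases those idle for too long."""

    def __init__(self, idle_limit):
        self.idle_limit = idle_limit  # ticks a free page may sit unused
        self.in_use = 0               # pages currently held by the application
        self.free_since = []          # tick at which each cached page was freed

    def alloc(self, pages):
        # Reuse cached pages first (most recently freed first);
        # anything beyond that is assumed to come fresh from the OS.
        reused = min(pages, len(self.free_since))
        del self.free_since[len(self.free_since) - reused:]
        self.in_use += pages

    def free(self, pages, now):
        # Freed pages stay cached in the allocator, still counted as resident.
        self.in_use -= pages
        self.free_since.extend([now] * pages)

    def purge(self, now):
        """Release pages idle for at least idle_limit ticks; return the count."""
        keep = [t for t in self.free_since if now - t < self.idle_limit]
        released = len(self.free_since) - len(keep)
        self.free_since = keep
        return released

    def resident(self):
        # What the OS sees: live pages plus pages cached in the allocator.
        return self.in_use + len(self.free_since)


heap = ToyAllocator(idle_limit=10)
heap.alloc(100)                   # traffic peak
heap.free(90, now=0)              # peak is over; 90 pages sit in the allocator
before = heap.resident()          # still 100 pages as far as the OS can tell
early = heap.purge(now=5)         # too soon, nothing is released
released = heap.purge(now=10)     # idle long enough, 90 pages go back
print(before, early, released, heap.resident())  # 100 0 90 10
```

Without the `purge` step the process would keep looking like a 100-page process long after it only needs 10.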


Theoretically virtual memory and swap solve this problem really well. The OS is free to write the unused pages to disc to let other programs use the real memory.
Swap is horribly expensive, and most hyperscalers run their servers without swap and set per-process memory limits, automatically killing workloads that go above their threshold.
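
For a feel of what a per-process limit looks like, here is a small Unix-only sketch using Python's `resource` module. Treat it as an approximation: large deployments usually enforce limits with container or cgroup machinery and kill the offender, whereas `RLIMIT_AS` caps address space and makes the allocation fail. The `CHILD` script, `run_with_limit`, and exit code 3 are names and choices invented for this example.

```python
import subprocess
import sys

# Child program: cap its own address space, then try to allocate.
CHILD = """
import resource, sys
limit, want = int(sys.argv[1]), int(sys.argv[2])
_, hard = resource.getrlimit(resource.RLIMIT_AS)
resource.setrlimit(resource.RLIMIT_AS, (limit, hard))  # lower the soft limit only
try:
    buf = bytearray(want)
except MemoryError:
    sys.exit(3)  # stand-in for "workload exceeded its threshold"
"""


def run_with_limit(limit_bytes, want_bytes):
    """Run the child under an address-space cap and return its exit code."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD, str(limit_bytes), str(want_bytes)]
    )
    return proc.returncode


GIB = 1 << 30
print(run_with_limit(1 * GIB, 16 << 20))  # small allocation fits under the cap
print(run_with_limit(1 * GIB, 2 * GIB))   # oversized allocation is refused
```

The point is only that the limit is enforced per process, so one greedy workload cannot push its neighbours into swap.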
Swap is only expensive if you are using the swapped-out memory. If you are in a case where a program is just holding on to pages it isn't using, swap is basically free. For most users, turning off swap just loses performance, since with swap the OS can always use all of your RAM to cache disk access.
Swap is expensive compared to releasing unused memory back to the OS. The reason is that you spend memory and disk bandwidth writing “unused” data to disk. And that data could very well be unused RAM just sitting around in a memory allocator, which is effectively useless memory that you’re swapping because the allocator didn’t release it.

Zswap is always a performance win. Swap to disk can degrade performance (good implementations generally don’t, unless your working set is larger than your memory and you’re thrashing), and it is certainly expensive in dollar terms, since it wears out your SSD faster.

You seem to think I’m arguing in absolute terms, when all I’m saying is that swapping is a more expensive way to reclaim that unused RAM than having the memory allocator do it. It can be a useful low-effort technique, but it’s very coarse and more of a stopgap to recover inefficiencies in the system at a global level. Global inefficiency mitigation is generally not as effective as more localized approaches.

Consider also that before the OS starts swapping, it is going to start purging disk caches, since those are free for it to reload (file-backed executable code, page cache, etc.). These are second-order system effects that are hard to reason about abstractly when you have a greedy view of “my application is the only one that matters”. This means that before you even hit swap, the large dark matter of dirty memory sitting in your allocator is making your disk access slower. And the kernel’s swap doesn’t distinguish working-set memory from memory the allocator is merely holding, so you’re hoping that inherent temporal and spatial locality patterns interplay well enough that you’re not handing out an allocation from a swapped-out block too frequently.

What if the other thing you're trying to run runs at the same time that your Rails app is using peak memory? You have no choice but to have enough memory for peak load.

But if you really do need to cheap out you can generally configure your app server to kill idle worker processes, or bounce them on a schedule to return memory to the system, and hope.

So that’s generally not very likely. You’re going to have some time of day effects that are shared but true “peak” tends to be service dependent rather than something all your services experience simultaneously from what I’ve seen (YMMV).

Killing “idle” processes is also extremely expensive because you have to restart the process, reload all state, and doing graceful handoff is tricky.

It’s good to have graceful handoff for zero downtime upgrades, but I still say having your allocator return RAM is the cheapest and easiest option and something good modern allocators do for you automatically.

There is no one-size-fits-all memory management technique. There are always tradeoffs. The scenario you are describing is not common for Ruby apps. Ruby uses a memory management style that is suitable for most Ruby workloads.

All the production-quality app servers handle killing and starting new worker processes gracefully and efficiently by forking a running process. Certainly there is some overhead, but that's why you don't underprovision memory, so you don't need to resort to that.
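
A minimal, Unix-only sketch of the fork-based restart described above, written in Python rather than Ruby and not taken from any particular app server. `APP_STATE`, `spawn_worker`, and `bounce_worker` are hypothetical names; the dictionary stands in for whatever expensive boot work a real master process would do once.

```python
import os

# Stand-in for expensive boot work (code, config, caches) done once in the master.
APP_STATE = {"routes": 120, "config": "loaded"}


def spawn_worker():
    """Fork a worker that inherits APP_STATE; return (pid, read_fd)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child: the new worker
        try:
            os.close(read_fd)
            # The worker sees the master's state without reloading anything.
            os.write(write_fd, str(APP_STATE["routes"]).encode())
        finally:
            os._exit(0)  # never fall back into the master's code path
    os.close(write_fd)
    return pid, read_fd


def bounce_worker():
    """Start a worker, let it finish, reap it, and report what it inherited."""
    pid, read_fd = spawn_worker()
    data = os.read(read_fd, 64)
    os.close(read_fd)
    os.waitpid(pid, 0)  # once the worker exits, its memory goes back to the OS
    return int(data)


print(bounce_worker())  # first worker
print(bounce_worker())  # replacement worker, forked from the same warm master
```

The replacement starts from the already-loaded master, which is why the restart is cheaper than a cold boot, though any per-worker state built up at runtime is still lost.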