Hacker News
by minimaul 2497 days ago
Yeah, this is exactly my experience too.

I try to move over every six months to a year or so, and it's the same gripes every time at this point.

Driver support's reasonable now, and the desktop environments are generally solid enough, but things like mixed DPI work really badly on Linux, and my browser nearly always tears when scrolling on my secondary display, etc.

But the single biggest killer for me is how badly Linux copes with very low amounts of free memory. Put 32 GB in a machine and it still periodically runs completely out under my dev workload, and when that happens, the whole system becomes unusable and I have to hard reboot it. I'm not sure what macOS and Windows do differently, but it just doesn't happen on either of those two OSes.

I really want to have the freedom to pick and choose my hardware more, but at the moment I keep falling back to macOS.

It's a UNIX environment so it has the tooling I want and a solid GUI that works well.

8 comments

There are discussions happening on LKML right now about how to solve this. I don’t have a link handy, but I saw them either here or on LWN recently.

I used to have that problem too, but it went away when I stopped using JetBrains products :). Not for any reason other than the contract I was working on ended.

Then I shall live in hope that it will be fixed :)

And I didn't even have a JetBrains product in the loop; it was a mix of VirtualBox and VS Code, along with browser, mail client, etc.

To elaborate a bit based on my understanding of the issue: VirtualBox seems like a great program to tickle the issue.

During a memory pressure scenario, the kernel starts looking around for things that it can get out of RAM to free up space. If swap is enabled and not saturated, paging out some data to disk is a likely option. Reducing disk cache size works too. But... when the usual candidates run out, things have to get more clever. Things like shared libraries can get paged out! If one of those pages is requested, it can be reloaded from disk. Or, in the VirtualBox case, the mmap'd disk image can be removed mostly from RAM and have those pages loaded from disk as needed. Performance sucks terribly, but it keeps trucking on.

The wrinkle in all of this is SSDs. The out-of-memory (OOM) killer heuristically watches the system and kills off processes that cause memory pressure problems. These heuristics, however, are expecting these page-in and page-out operations to be slow (as they were on HDDs). On newer SSDs, the disks are too fast to trip the OOM killer into action! This is why, when this problem manifests, your disk activity light goes on solid, even if you don't have swap enabled. The kernel is sitting there trying every trick in the book, and the OOM killer doesn't see what's happening. Every individual page fault is handled quickly, there's just waaaaaay too many of them.
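One way to see the "way too many page faults" symptom described above is to watch the major-fault counter the kernel exposes. This is a minimal sketch, assuming a Linux system where `/proc/vmstat` contains a `pgmajfault` line; the function names are hypothetical, and it only observes the fault rate, it does not reproduce the OOM killer's actual heuristics.

```python
def parse_vmstat(text):
    """Turn /proc/vmstat content ("name value" per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def major_fault_rate(before, after, seconds):
    """Major page faults per second between two parsed snapshots."""
    return (after["pgmajfault"] - before["pgmajfault"]) / seconds


def sample(path="/proc/vmstat"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())
```

Calling `sample()` twice a second apart and passing both results to `major_fault_rate` gives a rough number to watch; a rate that stays very high while the machine feels frozen would be consistent with the thrashing described in this comment.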

Yep, this is an accurate description according to my understanding of the issue too.

The lesson I've recently learned is that, for now, swap is necessary on Linux machines with SSDs. I've enabled zswap and added a 4 GB swap file to my machine with 16 GB of RAM, and the problem hasn't reoccurred for me since then. Supposedly, the memory pressure measure in the kernel gets a more accurate reading when swap is enabled, but I don't know for sure that that's true. At the very least, you can page out the memory you're using the least instead of file-backed pages, which is what happens in memory pressure situations on SSDs (as opposed to OOM killing).

Wow, that's a really cool explanation. Although, since I've had an SSD in my desktop for 8 years now, it's a bit sad to hear.
I saw it as more of a Java problem on Linux than a JetBrains one with Android Studio.

When Android Studio runs a large code base with the emulator, memory issues that halt the system were frequent on Linux. Stack Overflow has several such reports.

No such issues on macOS, even with multiple JetBrains IDEs in parallel (same memory config).

I wonder how Android Studio is doing on ChromeOS, considering many of those are low end machines. I'm sure they had to optimize it, but I assume the issue would persist till the Linux kernel itself is fixed.

Another alternative for people who still love macOS but can't tolerate the dumpster fire that is the butterfly keyboard is to get a new Mac Mini, then get a monitor of your choice and a righteous clickety-clack buckling spring keyboard to go with it.
Seconding this, the Mini has really reduced my periodic urges to upgrade my 2013 MBP.
I find that in low-RAM situations zram is very handy: it allows for a graceful reduction in performance under memory stress rather than the cliff edge that you get with a swap partition.
zram has been replaced with some other technology (z-something, can't remember the name) that also compresses swap and removes duplicate pages.
zswap?
And this is why many people like me stick to Mac. Yes there's a solution for it on Linux, but no I don't want to look for it, maintain it, and at some point when a new 'best solution' is available keep up to date with all that...
On Linux, the distributions do that for you. You don't have to, but if you want, you can.

On Mac, you can't, even if you want. So you won't see discussions like these, because Mac does not have that kind of visibility inside. If something is broken (and Mac has its share of broken things), you get to keep all the pieces.

I had the same issue. The solution I settled on (and have been very happy with) has been a Mac Mini as a polished front end/web browsing machine, and then a Threadripper workstation running Ubuntu that I ssh into and do all dev work on. The pleasure of OS X without being so limited by Apple’s hardware options (and extreme markup).
How big is your swapfile?
I'm really curious, what is your dev environment/setup like?
Have you tried earlyoom?
+1 to this as a workaround until the kernel finally addresses the issue. Earlyoom is a user space OOM-killer that kicks in before the system starts the mad paging dance.

https://github.com/rfjakob/earlyoom

Packages are available in Debian Stable (Buster), so they should be available in most child distros by now as well.
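The basic idea of a user-space OOM killer like the one described above can be sketched as a threshold check on `/proc/meminfo`. This is only an illustration, not earlyoom's code: the function names are hypothetical, and the 10% thresholds are assumed values chosen for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo lines such as 'MemTotal:  16316412 kB' into kB ints."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def should_kill(meminfo, mem_min_percent=10.0, swap_min_percent=10.0):
    """True when both available memory and free swap are below their thresholds.

    The thresholds are assumptions for this sketch. A machine without swap
    is treated as having 0% free swap.
    """
    mem_percent = 100.0 * meminfo["MemAvailable"] / meminfo["MemTotal"]
    swap_total = meminfo.get("SwapTotal", 0)
    swap_percent = 100.0 * meminfo.get("SwapFree", 0) / swap_total if swap_total else 0.0
    return mem_percent <= mem_min_percent and swap_percent <= swap_min_percent
```

A daemon built on this would poll the file several times a second and act as soon as `should_kill` returns true, which is what lets it step in before the paging storm starts.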

It seems likely that something can be done to make the behavior less perverse but I'm not convinced that the behavior CAN be better than something like earlyoom.

What makes earlyoom useful is the fact that you can tell the machine what is low value and likely to be problematic. I'm not sure that information can be determined automatically. I'm further not sure what a better strategy looks like than starting to kill low-value, problematic processes once we reach a threshold.
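To make "telling the machine what is low value" concrete, here is a rough sketch of victim selection with user-supplied hints. It assumes each process carries a badness score (on Linux, something like the value in `/proc/<pid>/oom_score`) and that regex hints shift that score up or down. The function, the tuple layout, and the bonus of 300 are all hypothetical choices for this example rather than a description of earlyoom's internals.

```python
import re


def pick_victim(processes, prefer=None, avoid=None, bonus=300):
    """Pick the process to kill from a list of (pid, name, score) tuples.

    Names matching `prefer` get `bonus` added to their score, names matching
    `avoid` get it subtracted; the highest adjusted score loses.
    """
    prefer_re = re.compile(prefer) if prefer else None
    avoid_re = re.compile(avoid) if avoid else None

    def badness(proc):
        _pid, name, score = proc
        if prefer_re and prefer_re.search(name):
            score += bonus
        if avoid_re and avoid_re.search(name):
            score -= bonus
        return score

    return max(processes, key=badness) if processes else None
```

With no hints the biggest offender by raw score is chosen; the hints are where the human knowledge about what is expendable gets injected.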

Do you have a swap partition?

I've been in a couple of interesting discussions about Linux memory management lately that enlightened me somewhat, and I won't claim to be an expert now, but I've been around the low-memory block enough to understand now that there's no simple right answer to the question of "Do you have swap?"

"The Linux kernel has overcommit baked into the fiber of its being." I've begun to understand that this idea is so deeply ingrained in the kernel that on a multi-tenant machine or desktop workstation, you simply can't extract it back out and "just provide enough RAM," unless you know the performance characteristics and you really mean it when you say "that should be enough RAM." If you don't have any swap and the kernel starts to run out of memory, it's going to start evicting whatever pages it can back to disk.

(Wait, pages back to disk? I told you I didn't have swap.) Yes, the Linux kernel can page things back to disk even if you don't have swap. Remember that all of the binaries you're running originally came from that disk, and the kernel knows it doesn't strictly need to keep their pages in memory unless they have been modified; it can read them back in when you touch those pages again.
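To get a feel for how much of RAM falls into each bucket, the split between anonymous and file-backed pages can be read from `/proc/meminfo`. This is a minimal sketch assuming the usual `Active(anon)`, `Inactive(anon)`, `Active(file)` and `Inactive(file)` fields are present; the function name is hypothetical, and note that not every file-backed page is free to drop (dirty ones have to be written out first).

```python
def page_breakdown(meminfo_text):
    """Sum anonymous vs file-backed LRU pages (in kB) from /proc/meminfo text."""
    wanted = {"Active(anon)", "Inactive(anon)", "Active(file)", "Inactive(file)"}
    kb = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            kb[name] = int(rest.split()[0])
    return {
        # Anonymous pages (heap, stack) can only be evicted to swap.
        "anon_kb": kb.get("Active(anon)", 0) + kb.get("Inactive(anon)", 0),
        # File-backed pages (binaries, libraries, page cache) have a copy on disk.
        "file_kb": kb.get("Active(file)", 0) + kb.get("Inactive(file)", 0),
    }
```

Without swap, only the `file_kb` side can shrink under pressure, which is why running binaries and shared libraries end up being the pages that get evicted.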

Having some swap gives the kernel something else to evict, so have a healthy amount of swap and Linux will find the occasion to use it for the least frequently used pages that are not already on disk. This will improve your "nearly out of memory" performance.

The second worst thing that you can do is put your swap on a fast SSD or NVMe drive, and it's not why you think. The kernel is making decisions based on a heuristic which is complicated and well-documented, but inscrutable. If the solid-state disk is 50x faster than the spinning disk that swap was originally designed to use, then swapping will cost less overall and the heuristic will lean on it even more often as a strategy to keep the OOM killer away. You may find your cache recycle rates going through the roof because things can be paged out to disk and re-loaded faster than should be possible. I don't fully understand this part, but I suspect the answer is "try to use swap less, and be aware of when you are using it."

The kernel really does not want to kill off your processes, and it has more opportunities than ever to keep too many balls in the air when you have asked it to do so. So, find a way to stay ahead of the kernel and know better. If you have a dock widget that tells you when you are going above 50% swap usage, you can close some tabs before it gets to be an unrecoverable situation. It's a mystery to me why modern computers don't come with disk activity lights; 20 years ago, when literally every computer came equipped with one, we didn't need dock widgets to solve this problem.
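The logic behind such a dock widget is small. Here is a sketch, assuming the `SwapTotal` and `SwapFree` fields of `/proc/meminfo`; the function names are hypothetical, and the 50% default simply mirrors the figure mentioned above.

```python
def swap_used_fraction(meminfo_text):
    """Fraction of swap in use (0.0 to 1.0); 0.0 when no swap is configured."""
    total = free = None
    for line in meminfo_text.splitlines():
        if line.startswith("SwapTotal:"):
            total = int(line.split()[1])
        elif line.startswith("SwapFree:"):
            free = int(line.split()[1])
    if not total or free is None:
        return 0.0
    return (total - free) / total


def swap_warning(meminfo_text, threshold=0.5):
    """True once swap usage climbs above the threshold."""
    return swap_used_fraction(meminfo_text) > threshold
```

A widget would read `/proc/meminfo` every few seconds, pass its contents to `swap_warning`, and change color when the result flips to true.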

The best advice is to have enough RAM for whatever you're doing, and at 32 GB "I think you've had enough." At any rate, the one suggestion that I could give is: if you anticipate running out of memory (ever, and it looks like you still do), then you should be sure to have a healthy amount of swap. To me that's probably at least 5 or 6 GB, but YMMV.

But 32 GB for a desktop workstation really ought to be enough IMHO, so try to find a way that you don't run out? If you're eating all that memory up with VMs, try a lighter-weight solution for your ephemeral workloads like footloose, which behaves like a VM in the ways you generally tend to want for your dev workloads (for example, it can run systemd like your deploy target most likely does, if you're using VMs to match the deploy target). Footloose doesn't impose the VM's whole footprint upfront because it is actually a container, so when you run out of memory it will be because your application workloads used too much, not because your virtual machine manager has grabbed much more than it needed.

All that being said, my daily driver is a Mac and I don't think about this stuff either, until it affects a server.