by PragmaticPulp 1963 days ago
SSDs help, but nothing beats core count × clock speed when compiling.

Source code files are relatively small and modern OSes are very good at caching. I ran out of SSD on my build server a while ago and had to use a mechanical HDD. To my surprise, it didn’t impact build times as much as I thought it would.

I did a test a while back where I had one workstation compiling Linux from an SSD and another from an HDD. It turned out all the files were cached in memory (a measly 8 GB). But for general usage and user experience I would recommend an SSD without question.
Hmm. Maybe the tradeoff has changed since I last tested this (to be fair, a few years ago). But I'm also not focused on build servers especially, it's always been possible to make those reasonably fast. Unless you have a very specific sort of workflow anyway, your devs are doing way more local builds than on the server and that sped up a ton moving to SSD, in my experience anyway. YMMV of course.
Last time I benchmarked C++ compilation on SSDs vs HDDs (compiling the PCL project on Linux, which took around 45 minutes), the SSD didn't help noticeably.

I believe that this makes sense:

In a typical C++ project like that, which uses template libraries like CGAL, compilation of a single file takes up to 30 seconds of CPU-only time. Even though each file (thanks to the lack of a sensible module system) churns through 500 MB of raw text includes, that's not a lot over 30 seconds, and the includes are usually the same files each time, which means they are in the OS buffer cache anyway and are read from RAM.
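A back-of-the-envelope check of those numbers, as a minimal Python sketch. The 500 MB and 30 seconds come from the comment above; the job count of 8 is purely an assumption for illustration, and the calculation ignores the buffer cache entirely, so it is a worst case rather than a measurement.

```python
def required_read_rate_mb_s(include_mb: float, cpu_seconds: float, jobs: int = 1) -> float:
    """Average read rate needed if every include were read from disk, uncached."""
    return include_mb / cpu_seconds * jobs


# Figures from the comment: 500 MB of includes, 30 s of CPU time per file.
print(f"{required_read_rate_mb_s(500, 30):.1f} MB/s for one compiler process")
# → 16.7 MB/s for one compiler process

# Assumed: 8 parallel jobs (hypothetical, not from the thread).
print(f"{required_read_rate_mb_s(500, 30, jobs=8):.1f} MB/s for 8 parallel jobs")
# → 133.3 MB/s for 8 parallel jobs
```

In practice the shared headers are read once and then served from RAM, so the real disk traffic should be far below this worst case.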

However, if the project uses C++ like C, compilation is a lot faster; e.g. for Linux kernel C builds, files can scroll by faster than a spinning disk seek time would allow.

Back in 2012 at a previous job we tested compilation performance on spinning disks versus solid state. On Linux it made almost no difference whatsoever; on Windows, however, it was a game changer. The builds were an order of magnitude faster, so it was well worth making the switch.
Well, you can always move your code to a ramdisk. I suspect any C(++) codebase isn't more than a few GB anyway?
Most compilers won't fsync, will they? The output is likely not being written straight to disk.

It's caches all the way down.

And to max out disk bandwidth before you max out your CPU cores you need a really terrible disk.

I don't think many 8-way Xeon Platinum boxes have eMMC storage.

Well, using ramdisks will let you compare and make sure the disk isn't the bottleneck, at least.
But all compilers will close(). Compare compile times on tmpfs and you will see an improvement.
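If you want to try that comparison, a minimal timing sketch in Python might look like this. `time_build` is a hypothetical helper, and the build command and both paths in the usage comment are assumptions; substitute your own. It only measures wall-clock time for one run, so start from a clean tree each time and keep in mind that the page cache state affects the result.

```python
import subprocess
import time


def time_build(command: list[str], source_dir: str) -> float:
    """Run a build command inside source_dir and return wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(
        command,
        cwd=source_dir,
        check=True,                 # raise if the build fails
        stdout=subprocess.DEVNULL,  # keep terminal output out of the timing
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start


# Hypothetical usage: the same checkout copied to a disk and to a tmpfs mount.
# on_disk = time_build(["make", "-j8"], "/home/me/project")
# on_tmpfs = time_build(["make", "-j8"], "/dev/shm/project")
# print(f"disk: {on_disk:.1f}s  tmpfs: {on_tmpfs:.1f}s")
```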
You don't even need to do this. Just cat all the files to /dev/null to prime the cache.
With Gentoo I use tmpfs and it works really well.
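The same cache-priming idea as a minimal Python sketch, for anyone who wants it scripted. `prime_cache` is a hypothetical helper name; it just reads every regular file under a directory and throws the data away, which is roughly what catting to /dev/null does. Whether the pages stay cached depends on how much free RAM you have.

```python
import os


def prime_cache(root: str, chunk_size: int = 1 << 20) -> tuple[int, int]:
    """Read every regular file under root and discard the data.

    Returns (files_read, bytes_read).
    """
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip FIFOs, sockets, broken symlinks
            try:
                with open(path, "rb") as f:
                    while chunk := f.read(chunk_size):
                        bytes_read += len(chunk)
            except OSError:
                continue  # unreadable file: ignore and move on
            files_read += 1
    return files_read, bytes_read


# Hypothetical usage:
# files, size = prime_cache("/path/to/source/tree")
# print(f"read {files} files, {size / 1e6:.1f} MB")
```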
Do you have details on how to enable this?
Just mount a ramdisk over your Portage TMPDIR (default /var/tmp/portage). You will need a decent amount of RAM for larger packages like LLVM, though. Disabling debug symbols will reduce the required space a bit.