by RcouF1uZ4gsC 2723 days ago
> The test was done with the source code and includes on a regular hard drive, not an SSD.

In my opinion, this makes any conclusion dubious. If you really care about compile times in C++, step 0 is to make sure you have an adequate machine (at least a quad-core CPU, lots of RAM, and an SSD). If the choice is between spending programmer time trying to optimize compile times and spending a couple hundred dollars on an SSD, 99% of the time, spending money on an SSD will be the correct solution.

3 comments

> The test was performed by compiling the source code below 128 times, calculating the average time.

Presumably, 127/128 runs have both the test file and the single header file in memory cache, so the distinction is moot.
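One way to check this would be to report the first run separately from the rest instead of averaging all 128 together. A rough Python sketch, not the original author's benchmark: `time_runs` and `summarize` are hypothetical helper names, and the first run is only truly cold if the disk cache was dropped beforehand.

```python
import statistics
import subprocess
import time


def time_runs(cmd, runs):
    """Run cmd `runs` times and return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Split the first (possibly cold-cache) run from the mean of the later runs."""
    first, rest = durations[0], durations[1:]
    return {
        "first_run": first,
        "mean_of_rest": statistics.mean(rest) if rest else None,
        "mean_of_all": statistics.mean(durations),
    }
```

If `first_run` is much larger than `mean_of_rest`, the plain average mostly reflects warm-cache behaviour and says little about the disk.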

Also, I find the conclusion that we should all just buy top end machines and ignore performance problems that don't manifest there fairly unconvincing. I think that kind of thinking is responsible for a good chunk of the reason the web is so bloated today. :-)

There's a difference between development (read: build) performance and runtime performance.

For any kind of even vaguely profitable software, your developers should all have kick arse machines.

But they should test on a $200 laptop :)

Doom/Quake were developed on a NeXT machine much faster and more capable than the targeted IBM PCs.

You don't need to develop on a $200 notebook to care about performance.

It's that RAM that is the key: you want enough to keep all the source files and intermediate files sitting in cache, so the only disk activity is updating timestamps and flushing the .o files to disk.

I've seen this problem a few times: someone looks at their N-core machine with M GB and says, "Oh look, I'm only using 3/4 of M, so when I buy the 4xN-core machine I'm going to put M of RAM in it again." Then everything runs poorly because the disks are getting hammered now that there are another 32 jobs (or whatever), each consuming a GB. Keep adding RAM until there is still free RAM during the build. It's going to run faster from RAM than waiting for a super speedy disk to read the next .c/.o/etc. file.
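The sizing argument can be put as back-of-the-envelope arithmetic. A minimal sketch, assuming each compile job needs a roughly fixed amount of RAM and that some RAM should stay free for the file cache; `max_parallel_jobs` and all the numbers are hypothetical, not measurements.

```python
def max_parallel_jobs(cores, ram_gb, per_job_gb, cache_reserve_gb):
    """Return how many compile jobs fit in RAM while leaving room for the file cache."""
    usable_gb = ram_gb - cache_reserve_gb
    if usable_gb <= 0 or per_job_gb <= 0:
        return 1  # always allow at least one job
    by_ram = int(usable_gb // per_job_gb)
    # Never run more jobs than cores, and never fewer than one.
    return max(1, min(cores, by_ram))


# 32 cores but only 16 GB: RAM, not the CPU, caps the build at 12 jobs.
print(max_parallel_jobs(cores=32, ram_gb=16, per_job_gb=1.0, cache_reserve_gb=4))  # 12
# With 64 GB the same machine can keep all 32 cores busy.
print(max_parallel_jobs(cores=32, ram_gb=64, per_job_gb=1.0, cache_reserve_gb=4))  # 32
```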

Just to address part of your concern: Traditionally disk speed makes very little difference to compile times for real world C/C++ projects. This is because real world projects have many files, and each one can be compiled in parallel. Once you spawn sufficient compilers in parallel, the CPU becomes the bottleneck, not the disk. (I.e. when a compilation asks for I/O, it then yields the CPU to other compilers which have CPU work to do)

Note that Visual Studio, for example, does a poor job of this because it only spawns one compilation per CPU thread. This results in individual threads being idle more than they ought to be.
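To illustrate why oversubscribing helps, here is the textbook multiprogramming approximation in Python. It is a deliberately crude model, assuming every job waits on I/O for the same fraction of its time and that waits are independent; the 20% figure is made up for illustration.

```python
def cpu_utilization(io_wait_fraction, jobs_per_core):
    """Approximate the fraction of time a core has runnable work.

    The core is idle only when every job assigned to it is waiting on I/O
    at the same moment, which happens with probability p ** n.
    """
    return 1.0 - io_wait_fraction ** jobs_per_core


for n in (1, 2, 3):
    print(n, f"{cpu_utilization(0.2, n):.3f}")
# 1 0.800
# 2 0.960
# 3 0.992
```

Under these assumptions, one job per hardware thread leaves each thread idle about 20% of the time, while two or three jobs per thread hide most of the I/O wait.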

I guess it depends on how you define "very little", and what system includes you have.

I've just tested one of my ~300 KLOC C++ projects, broken into 479 .cpp files and 583 .h files.

Using Linux (GCC) after dropping the disk cache, on a 5400 RPM HD, the full build on 14 threads took: 78 seconds.

On a fast SSD (same machine, after dropping caches again) it took 61 seconds.

Linking was ~7 seconds faster on the SSD, so arguably the compilation itself didn't speed up by the same ratio, but the overall build is most definitely faster.

Source was on the same drive as the build target directory.
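Spelling out the arithmetic from those numbers in Python (the link saving is the approximate "~7 seconds" quoted above, so the split between compile and link is equally approximate):

```python
hdd_total_s = 78.0   # full build on the 5400 RPM HD
ssd_total_s = 61.0   # full build on the SSD
link_saving_s = 7.0  # approximate link-time saving on the SSD

total_saving_s = hdd_total_s - ssd_total_s           # 17.0
compile_saving_s = total_saving_s - link_saving_s    # 10.0
reduction_pct = 100 * total_saving_s / hdd_total_s   # about 21.8

print(f"total saved: {total_saving_s:.0f} s ({reduction_pct:.1f}% less wall time)")
print(f"of which compile: {compile_saving_s:.0f} s, link: {link_saving_s:.0f} s")
```

So the SSD build takes roughly 22% less time overall, with about 10 of the 17 seconds saved coming from compilation and the rest from linking.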

At a previous company I worked at, we got SSDs to speed up compilation (and it did).

This very much depends upon the project. Have you seen the size of C++ object files with -g3 and lots of template usage? It can swallow tens of gigabytes of disc space only to have the linker elide most of it and give you a library or executable a few megabytes in size. Compared with the size of the inputs, the output is causing a vastly disproportionate amount of disc I/O, and this can end up being limiting, both during compilation and in particular during linking.

Absolutely not true. The problem is not the compilation of one single file, but that every one of these single files pulls in large amounts of headers, distributed over various libraries (e.g. Qt/Boost/STL), all of which won't fit into the disk cache.

If it doesn't make a difference, all that means is that your project is small, or doesn't have too many dependencies. Good for you. But that's not the reality for all projects.

My projects take 10 minutes to build on a modern system, which is plenty complicated enough. Don't appreciate the "good for you" flippancy.