Hacker News
by b1gtuna 2409 days ago
I wonder how fast I can compile my code base with this. On my hexa-core i7-8850h, it often takes more than 4 hours to build everything at full throttle. And I do this quite often, so the pain is definitely present. Assuming network and disk I/O aren't the bottleneck, having more than 5 times the cores should theoretically cut the build time at least threefold, conservatively?
8 comments

Phoronix does compilation benchmarks (for the Linux kernel and LLVM), and the existing Ryzen chips perform quite well on them. The i5-8400 is probably the closest thing on the chart to your 8850h.

But there are diminishing returns to adding more cores past a certain point which will depend on your codebase and compiler. If your builds are at 100% CPU utilization most of the time then you will probably see pretty large gains, but sometimes a significant chunk of the time ends up being bottlenecked by single threaded performance.

https://www.phoronix.com/scan.php?page=article&item=ryzen-37...
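
The diminishing-returns point can be put into numbers with Amdahl's law. The sketch below is only a rough model: it assumes the parallel part of a build speeds up in direct proportion to the core-count ratio and the serial part does not change at all, which real builds will not match exactly. The function names are made up for illustration.

```cpp
#include <cassert>

// Amdahl's law: overall speedup when only the parallel share of the work
// gets faster.
// parallel_fraction: share of the original build time that scales with
//                    cores (0..1).
// core_ratio:        new core count divided by old core count (idealized:
//                    assumes perfect scaling of the parallel part).
double amdahl_speedup(double parallel_fraction, double core_ratio) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(core_ratio >= 1.0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / core_ratio);
}

// Inverse question: what parallel fraction is needed to reach a target
// speedup with a given core ratio?
double required_parallel_fraction(double target_speedup, double core_ratio) {
    assert(core_ratio > 1.0);
    assert(target_speedup >= 1.0 && target_speedup <= core_ratio);
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / core_ratio);
}
```

Under this model, `required_parallel_fraction(3.0, 5.0)` comes out to about 0.83, so a threefold gain from 5 times the cores needs roughly 83% of the build time to be parallel. A build that is half serial, `amdahl_speedup(0.5, 5.0)`, tops out around 1.67x.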

> But there are diminishing returns to adding more cores past a certain point which will depend on your codebase and compiler. If your builds are at 100% CPU utilization most of the time then you will probably see pretty large gains, but sometimes a significant chunk of the time ends up being bottlenecked by single threaded performance.

You should check out Phoronix's Rome benchmarks. Compilers seem to love L3 cache, and the new Threadripper parts have 128MB of it. https://www.phoronix.com/scan.php?page=article&item=amd-epyc...

The Epyc 7502 in that chart is going to be roughly equivalent to the 32-core Threadripper 3 announced today. Both are 32 cores with 128MB of L3, but the Threadripper part has a much higher base & turbo clock speed so it'd compile even faster. Probably.

Does linux compilation take a few minutes? The chart there says Ryzen 3 2200G takes 242 seconds to compile the whole kernel. I find that difficult to believe.
Phoronix tests compiling the upstream default config which is pretty barebones. A normal kernel build for a desktop machine will take much longer because there are more modules enabled.
Thanks for the comparison. How do you know the 8850h is desktop equivalent to the i5-8400? I can't seem to find it in the chart.
Passmark benchmarks are usually a good indicator [0]. They're also from the same generation and have the same number of cores.

[0] https://www.cpubenchmark.net/compare/Intel-i5-8400-vs-Intel-...

Ah great, thanks. It's actually very helpful. I was actually looking at the Passmark score a week ago, but couldn't tell if it's trustworthy. Nice to get an endorsement.
Please don't mind me asking, but why do you compile your entire codebase often? What's the challenge of splitting it into units and compiling each of them independently when code is changed?
I suppose I should've provided more context. My build usually takes only a few minutes during the day, as the changes are small and they are broken into multiple packages / units. Once or twice a week, some changes trigger a chain reaction and take 4+ hours to build. This is painful because it eats up a chunk of my productivity. And perhaps once a week, I build from scratch just to prove everything still can be built from scratch.
Just in case you aren't aware of it, "ccache" helps with this, and you might find it trustworthy enough for your typical weekly rebuild.
ccache has been huge for me -- can't recommend it enough. The next big win for me was externing common templates into their own translation unit: http://gameangst.com/?p=246

It's a cool little trick few people seem to know.

I read the article, but I'm still a little confused. Do you put common std templates into their own translation unit, or are you putting only your own user-defined templates into their own translation unit?
Both, although most of the heavy ones in my projects are from the application-layer/user-defined.

Having strings and the common vector/map/unordered_map/set/unordered_set template specializations in there helps a bit (i.e. basic_string<char>, uint64_t, int64_t, int and uint).

My methodology wasn't very scientific: when I found a template being specialized at a low level, I added it to my list. Another heuristic: anything that templates off of std::string (basic_string<char>), char, uint64_t, int64_t, int and uint is a pretty good candidate, as the likelihood of it being reused everywhere is high.
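
For anyone unfamiliar with the trick, here is a minimal single-file sketch of what "externing" a template looks like. It uses a made-up user-defined template (`Registry`) rather than the std containers mentioned above, and the file names in the comments are hypothetical. In a real project the two marked sections would live in separate files.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- Would live in a shared header (hypothetical registry.h) ----
template <typename T>
class Registry {
public:
    void add(const T& value);
    std::size_t size() const;
    const T& at(std::size_t index) const;
private:
    std::vector<T> items_;
};

template <typename T>
void Registry<T>::add(const T& value) { items_.push_back(value); }

template <typename T>
std::size_t Registry<T>::size() const { return items_.size(); }

template <typename T>
const T& Registry<T>::at(std::size_t index) const {
    assert(index < items_.size());
    return items_[index];
}

// Explicit instantiation declarations: every file that includes the header
// is told not to instantiate these specializations itself.
extern template class Registry<std::string>;
extern template class Registry<int>;

// ---- Would live in exactly one .cpp (hypothetical registry_instances.cpp) ----
// Explicit instantiation definitions: the code is generated once, here.
template class Registry<std::string>;
template class Registry<int>;
```

Every other translation unit then just links against the one copy instead of re-instantiating `Registry<std::string>` and `Registry<int>` on its own. How much that saves will depend on how heavy the template is and how many files use it.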

Are you building like FPGA bitstreams or something? 4 hours seems insanely long for software unless it is literally millions upon millions of lines of code
Maybe he #included boost.
Maybe it's an aggressively optimizing and correctness-checking C++ compiler. GCC, clang can take a long time depending on complexity of code and flags provided.
I've built bitstreams, but these days I build software for consumer electronics with screens running ARM Linux.
In large C++ projects it's really hard to avoid the situation where almost every file includes a couple of the same key headers. Change one char in an important header and you have to rebuild the whole project.
Probably header only C++
I’ve experienced very substantial improvements in compilation times jumping from the 2700X to the 3900X. I suspect the 3950X should be even better, as long as there are not frequency issues.
Just curious, why do you have to "build everything" quite often? Normally it's enough to just build the parts that changed. Perhaps you can improve your build process instead of investing in new hardware?
Not the OP, but nightly build to prove it all still works is a pretty common thing.

Reasons for mysterious breakage:

- Compiler Updates
- Dependencies getting lost
- Code changes (you break things into parts, doesn't mean they work together now).

Yes. I don't do nightly, but weekly. Some changes still trigger long builds that last multiple hours.
Binary stability in C++ is hard, especially with dynamic plugins. You need to do a daily, or at least weekly, build from scratch to confirm everything works. Otherwise, for example, things might seem to work because you are changing a field in a class named x through an API with a different name for it, and it won't work when you use the new version on both sides. These kinds of bugs are the worst.
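
One way to picture this class of bug is a stale plugin reading a struct whose layout changed. The sketch below is an illustration only, not necessarily the exact case described above: the `v1`/`v2` namespaces, the `Settings` struct, and both function names are invented, and the "plugin boundary" is simulated with a byte copy inside one program.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Layout the plugin was compiled against (old header).
namespace v1 {
struct Settings { std::int32_t width; std::int32_t height; };
}

// Layout the host is compiled against now (a field was added in front).
namespace v2 {
struct Settings { std::int32_t flags; std::int32_t width; std::int32_t height; };
}

// Stands in for a plugin that was never rebuilt: it interprets the bytes it
// receives using the old layout.
std::int32_t stale_plugin_read_width(const unsigned char* bytes) {
    assert(bytes != nullptr);
    v1::Settings view;
    std::memcpy(&view, bytes, sizeof view);
    return view.width;
}

// Stands in for the freshly built host handing its object to the plugin.
std::int32_t host_passes_settings_to_plugin() {
    v2::Settings current{/*flags=*/7, /*width=*/1920, /*height=*/1080};
    unsigned char bytes[sizeof current];
    std::memcpy(bytes, &current, sizeof current);
    return stale_plugin_read_width(bytes);
}
```

Here the stale reader gets back the `flags` value (7) where it expected the width (1920), and nothing fails at compile or link time. Rebuilding both sides from the same headers is what makes the mismatch go away.
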
For what it's worth, Linux kernel compilation took about 6~7 minutes with -j25 on Ryzen 3900X (12C/24T).
Huh? I can do a full kernel rebuild in ~40s on a 3900X.
Maybe because Arch has more modules enabled than the default config?
without running clean?

cause that's not even in the right ballpark for a stripped kernel config

Yes, in a clean tree that just got cloned, and the result does boot.
20 years ago, it took me only around 10 minutes on a single-core 32-bit CPU, with 1/1024th the RAM, and spinning rust for storage. How is it not even twice as fast today?
Maybe better and more time-consuming optimizations? 2 million LOC vs. 26 million LOC probably doesn't help either. Maybe there is some bottleneck somewhere in the software or hardware?
I have no sense of how long that is. What would it take on a 4 thread Macbook Pro?
I have not tried installing Linux on a more recent MacBook Pro (USB-C generation) so I couldn't tell, but I remembered it took around an hour with Arch Linux's default config (e.g. "let's compile this and go have a breakfast & make a coffee and hope it's done")
Went to lunch and timed a compile of the linux kernel (v3.19) (default options) at 14 minutes on my 13" 2015 macbook pro. (3.1 GHz Intel Core i7, 16GB RAM).
depends which four thread MacBook pro...

as a rough reference, it took about 35 minutes to build the linux kernel on my xps 13 a few years ago. that computer has a 2C/4T kaby lake processor. your macbook pro might be a little faster if it doesn't have one of the ultra low power CPUs.

I think it might depend a lot on the compiler and codebase you use. I got myself a 3900x, and for compiling Rust code it's actually not that much of a speed-up as expected. A lot of things are done in serial fashion - e.g. compiling single crates and linking. The average utilization during compiling a large project was maybe between 40 and 60%.

When compiling LLVM, however, all cores were churning along at 100% utilization, so I expect a big speed-up there.

gcc tends to compile quite a bit faster on the new Ryzens vs Intel due to the large L3 cache, so you may get quite an improvement.
I don’t know your codebase but my experience with slow compiles when using ninja or make -j or whatever is that it has always been the overuse of code-generators or templates or something like that. A bit of strategic de-templating usually works wonders.