Hacker News
by arcticbull 1999 days ago
Is distcc actually any faster than running the builds locally on a half-decent machine? If I recall correctly, the latency of getting the files to the build machines often exceeded the local compilation time. I remember being really excited about the idea, and really disappointed with the results.
5 comments

One use that I can think of (never tried it myself, though in theory it should work) is if you are trying to compile for a processor architecture that doesn't have decent speed (many embedded chips), and the package you are compiling is difficult to cross compile (where the build script makes decisions based on the local environment or is otherwise hard to get to play nice with a cross compiler).

In that case you can run the build on the target CPU, but instead of calling gcc locally it calls distcc pointing to a cross compiler installed on a fast machine. This can be useful if you are compiling a bunch of packages from a distribution.
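A minimal sketch of that setup, assuming the package's build honors `CC` passed on the `make` command line. The host name `fastbox` and the compiler name `powerpc-linux-gnu-gcc` are hypothetical placeholders; the cross compiler has to be installed under that same name on the fast machine. `DISTCC_HOSTS` is the environment variable distcc reads for its list of helper machines.

```python
import os

def distcc_make_invocation(cross_compiler, build_hosts, jobs):
    """Return (command, env) for a make run that compiles through distcc.

    cross_compiler: fully qualified compiler name (hypothetical example:
                    "powerpc-linux-gnu-gcc"), so the helper machine picks
                    its cross compiler rather than its native gcc.
    build_hosts:    machines that will do the compiling (hypothetical names).
    jobs:           number of parallel make jobs.
    """
    env = dict(os.environ)
    # distcc reads a space-separated host list from DISTCC_HOSTS.
    env["DISTCC_HOSTS"] = " ".join(build_hosts)
    # Assumption: the Makefile uses $(CC) for every compile step.
    command = ["make", f"-j{jobs}", f"CC=distcc {cross_compiler}"]
    return command, env

command, env = distcc_make_invocation("powerpc-linux-gnu-gcc", ["fastbox"], 4)
print(command)
print(env["DISTCC_HOSTS"])
```

On the slow target machine you would then pass these to something like `subprocess.run(command, env=env, check=True)` from the package's source directory. Configure scripts still run locally and see the real target environment, which is the whole point of the trick.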

I used distcc that way a long time ago. I had a PowerPC iBook (G4) and an x86 desktop (Pentium 4). The desktop helped compile software on the iBook using distcc and a cross-compiler. It worked pretty well. I also used ccache in addition to that to cache build output. That was probably around 2005, when most (all?) CPUs were still single-core.
Indeed, in my use (with the Parabola GNU/Linux-libre distro), this is the killer use-case of distcc. This is how most of Parabola's packages for MIPS were built, and how many for ARM are built.
We used to use distcc (around 2007) for the daily builds of one of our large, in-house C++ products, on the order of 10M LOC.

In principle, it was a good use-case with a highly modular structure and a clearly defined but chunky build graph.

In practice, it did work, but throwing more hardware at the problem on a single host turned out to be faster than the existing distcc setup and had much reduced operational complexity.

We could probably have tuned it further with distcc plus the new hardware, but we achieved the performance target we were looking for.

This lines up with my experiences using it last time.
Running it locally will always be faster as long as your machine is not a bottleneck (core count, RAM, ...). I think the use-case for distcc et al. is to enable less-powerful machines to run builds faster by leveraging other machines. That’s exactly what we use it for at work. Our developers have not-so-powerful laptops and with distcc/icecc they can utilize the power of our build agents in the server room.

Also interesting to read: https://github.com/StanfordSNR/gg

Or maybe if money is no object spin up a bunch of VMs in the cloud to compile.
At the moment, icecream is not suitable for this as it is super sensitive to network degradation, and I ran into a few compilation stalls that way.
I was actually just playing around with distcc for the first time last night. Compiling ungoogled chromium normally takes my desktop about 12 hours, and using distcc to share the load with my laptop (with a gigabit connection to my desktop) took a little under 7 hours. It definitely improved the speed considerably for me.
We used it for several massive C++ apps ~2012 and yes, it was much faster than our local machines, although our machines at the time weren't state of the art but weren't bad either. We had a pretty amazing on-prem lab full of Blades that we used for distcc, so I'm sure that helped deliver the lightning.

One thing people always overlook with distcc, though, is that while you compile remotely, you always link locally. Linking was the bulk of our build time, and that wasn't the fault of distcc. Object files would come back at lightning speed (compared to a purely local build), but linking was still non-trivial.

These days I'd bet a modern AMD machine would beat distcc in our situation due to latency.
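The point about linking staying local can be put in rough numbers with an Amdahl's-law style estimate. This is only a back-of-the-envelope sketch: it assumes the build splits cleanly into a compile phase that distcc speeds up by some factor and a link phase that it does not touch, and it ignores network latency entirely. The function names and the example figures are made up for illustration.

```python
def distributed_build_time(compile_s, link_s, compile_speedup):
    """Wall time when only the compile phase gets faster.

    compile_s:       local compile time in seconds
    link_s:          local link time in seconds (unchanged by distcc)
    compile_speedup: factor by which remote machines speed up compilation
    """
    return compile_s / compile_speedup + link_s

def overall_speedup(compile_s, link_s, compile_speedup):
    """Ratio of the purely local build time to the distributed one."""
    local = compile_s + link_s
    return local / distributed_build_time(compile_s, link_s, compile_speedup)

# Hypothetical build: 600 s of compiling, 400 s of linking, 10x faster compiles.
print(distributed_build_time(600, 400, 10))
print(round(overall_speedup(600, 400, 10), 2))
```

With those made-up numbers a 10x faster compile phase gives only a bit over 2x on the whole build, because the 400 s of local linking never shrinks.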