by unscaled 3482 days ago
I can see where this received wisdom is coming from: a counter-reaction to the common tendency we had well into the 90s to hand-optimize every procedure considered to be even remotely on the hot path. It didn't even have to be inline assembly: it could just be C code sprinkled with registers, Duff's devices and bit shifts.
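For anyone who hasn't seen that style, here is a minimal sketch in C of what it tended to look like, next to the plain loop it replaced. The function names `copy_plain` and `copy_duff` are made up for illustration, and I'm reading "registers" as the `register` keyword; treat it as an example of the idiom, not code from any real project.

```c
#include <stddef.h>

/* Plain version: the straightforward loop, which a modern compiler
   is free to unroll or vectorize on its own. */
void copy_plain(short *to, const short *from, size_t count)
{
    for (size_t i = 0; i < count; i++)
        to[i] = from[i];
}

/* Hand-optimized 90s style: `register` hints, a shift instead of a
   divide, and Duff's device to unroll the loop eight times. */
void copy_duff(register short *to, register const short *from, size_t count)
{
    if (count == 0)
        return;                              /* the device needs count > 0 */

    register size_t n = (count + 7) >> 3;    /* passes = ceil(count / 8) */

    switch (count & 7) {                     /* jump into the loop body to
                                                handle the remainder first */
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

Both functions copy the same `count` elements. The second one just spells out by hand the unrolling and strength reduction that the first leaves to the compiler.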

That used to work well enough for non-portable code targeting a limited range of CPUs, but nowadays the gains are too small, the ROI is negative, and these efforts may actually end up backfiring on you.

I guess we needed to spread the knowledge that "the compiler is smarter than you" even if it wasn't really accurate, just to stop people from doing crazy stuff out of pure inertia.

2 comments

> I can see where this received wisdom is coming from: a counter-reaction to the common tendency we had well into the 90s to hand-optimize every procedure considered to be even remotely on the hot path. It didn't even have to be inline assembly: it could just be C code sprinkled with registers, Duff's devices and bit shifts.

That's not it at all. The original problem was that the compilers generated code several orders of magnitude larger and slower than what we could write by hand in the demo scene and, beyond the processor and memory, made no use of the rest of the hardware or of DMA. And in the demo scene, if you're not getting the maximum performance out of the hardware, you might as well be dead -- "demo or die", as Chaos of Sanity (now Farbrausch) so famously put it.

Compilers didn't really catch up with us: the best they can do at using hardware other than just the CPU and RAM is CUDA Fortran (the PGI Fortran compilers). I know of no compiler that takes advantage of DMA or audio hardware, let alone co-processors such as the Copper and the Blitter. Even on systems like the PS3, the GCC compiler took zero advantage of the RSX chip -- it was just a generic PowerPC compiler.

Surely a compiler will sometimes beat a human by generating a perfectly or near perfectly scheduled sequence of instructions for a particular processor, but a human can write a generic piece of assembler code that will get really good performance across a range of different chips in a given processor family, and so still beat a compiler overall.

Maybe the wisdom should be "the compiler is saner than you."