Hacker News new | ask | show | jobs
by nly 3779 days ago
> optimized scalar code performance has moved, what, maybe 40% over the last two decades

I'm not convinced. Raw single-thread number crunching performance is somewhere around _two to three fold_, clock-for-clock, on Intel x86, over that of 10-15 years ago. What methodology do you use to attribute only a fraction of those gains to language optimizers? And even if you are correct, why is it meaningful? Who is going to have invested energy in optimising the shit out of mundane codegen when hardware performance will have just come and stolen your thunder a few months later?

The problem we have now is that CPUs are gaining ever more complex behaviour, peculiarities, and sensitivities. I'd say compiler engineering is far from a "solved problem", even for statically-typed languages.

1 comments

> The problem we have now is that CPUs are gaining ever more complex behaviour, peculiarities, and sensitivities.

With mainstream CPUs, exactly the opposite is happening. CPUs are getting more complex under the hood, but less sensitive to code quality. For example, a lot of the scheduling hazards in the P6 microarchitecture have been eliminated in subsequent iterations. Branch delay slots are a thing of the distant past, so are pipeline bubbles for taken branches, indirect branch prediction is extremely capable, even the penalty on unaligned accesses is minimal.

Well, sure, but SIMD more than compensates for all of that, given how hard autovectorization is. In fact, I think with things like AVX and NEON becoming ubiquitous, you can get more benefit out of writing in assembly (or intrinsics) than any time I can think of in the past 10 years.