by nly
3779 days ago
> optimized scalar code performance has moved, what, maybe 40% over the last two decades

I'm not convinced. Raw single-thread number crunching performance is somewhere around _two to three fold_, clock-for-clock, on Intel x86, over that of 10-15 years ago.

What methodology do you use to attribute only a fraction of those gains to language optimizers? And even if you are correct, why is it meaningful? Who is going to have invested energy in optimising the shit out of mundane codegen when hardware performance will have just come and stolen your thunder a few months later?

The problem we have now is that CPUs are gaining ever more complex behaviour, peculiarities, and sensitivities. I'd say compiler engineering is far from a "solved problem", even for statically-typed languages.

With mainstream CPUs, exactly the opposite is happening. CPUs are getting more complex under the hood, but less sensitive to code quality. For example, a lot of the scheduling hazards in the P6 microarchitecture have been eliminated in subsequent iterations. Branch delay slots are a thing of the distant past, as are pipeline bubbles for taken branches; indirect branch prediction is extremely capable; and even the penalty on unaligned accesses is minimal.