by pcwalton 3837 days ago
That would work if most applications spent all their time in a few hot spots. But, contrary to popular wisdom, that's usually not the case. Most applications have flat profiles (to steal a quote from DannyBee—but it matches my experience as well). They have flat profiles because people have spent a lot of time optimizing them. In this context—which is the norm—eliminating optimizations to save compile time and deferring optimization to a few "hot spots" has the effect of turning off optimization for the whole program.
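To put rough numbers on the flat-profile point, here is a minimal Amdahl-style sketch. Both profiles, the function names, the 2x factor, and the choice of three "hot spots" are made up for illustration; nothing here is measured from a real program.

```python
def overall_speedup(profile, optimized, factor):
    """Amdahl-style estimate of whole-program speedup.

    profile:   dict mapping function name -> share of total runtime
    optimized: set of function names that get sped up by `factor`
    """
    old_time = sum(profile.values())
    new_time = sum(
        share / factor if name in optimized else share
        for name, share in profile.items()
    )
    return old_time / new_time


def top(profile, k):
    """Return the k functions with the largest runtime share."""
    return set(sorted(profile, key=profile.get, reverse=True)[:k])


# Hypothetical "peaked" profile: one function takes 90% of the runtime.
peaked = {"hot": 0.90, **{f"f{i}": 0.01 for i in range(10)}}

# Hypothetical flat profile: 100 functions at 1% each.
flat = {f"f{i}": 0.01 for i in range(100)}

# Optimize only the top 3 functions, 2x each.
peaked_gain = overall_speedup(peaked, top(peaked, 3), 2.0)  # roughly 1.85x
flat_gain = overall_speedup(flat, top(flat, 3), 2.0)        # roughly 1.015x

print(f"peaked: {peaked_gain:.3f}x, flat: {flat_gain:.3f}x")
```

Under these assumptions, restricting optimization to the top three functions nearly doubles the speed of the peaked program but buys the flat one about 1.5%, which is the sense in which it amounts to turning optimization off.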

It's common to write off compiler optimizations as unimportant, because their effects are invisible in day-to-day use. They're also complex, which makes people predisposed to get rid of them in the name of "simplicity". But, for better or for worse, optimizing compilers are necessary complexity.

Optimizing compilers are not ubiquitous because compiler engineers just like to play with technology. They're ubiquitous because you need them.


I agree optimization is important. So important that it should be pushed down into the hardware. Binaries should look almost like source code. But that's just my vision for what it's worth.
Hardware already does insane amounts of optimization. A modern superscalar out-of-order processor basically does its own JIT from x86 into its own internal micro-ops, reordering instructions on the fly, etc. That's another 2-10x speed difference on modern computers.
Completely agree :) But even better to push C code straight down to the hardware and let it crunch on that! Let it allocate a few thousand registers, or spawn off an FPGA compiler to create a few new instructions. Crazy?
Hardware doesn't work like this. You might want to read Hennessy and Patterson, and the original RISC I paper.

http://www.amazon.com/Computer-Architecture-Fifth-Edition-Qu...

http://www.cecs.pdx.edu/~alaa/ece587/papers/patterson_isca_1...

RISC created a huge local minimum by speeding up C code to the exclusion of other languages. I predict that future processors will eventually hide more features from the higher software levels (such as the number of registers, instruction types and formats) in order to improve efficiency at the machine level. I think we are seeing this trend with GPUs already. Current CPUs don't do this because they have to maintain binary compatibility with a huge installed base. We can compare notes in a decade or so :-)
Current processors already do that. You don't see the true number of registers or the true instruction set/format of any modern Intel processor. x86 instructions are translated into micro-ops, so x86 is really just a compatibility layer.

I do agree that current processors optimize for C/C++ (although of course there are niche systems like Azul which optimize for other languages). It would be nice to have processor extensions that allow us to get better GC performance, or better handling of immutable values. There's a chicken-and-egg problem getting there.

GPUs don't do any OoO processing like modern CPUs do. They also don't do any register renaming. They execute things really literally, to the point where one has to manually insert delay slots for pipelined operations when writing raw asm (which the manufacturers tend to keep well hidden in order to avoid the binary compatibility trap; see https://github.com/NervanaSystems/maxas for an example of a third-party assembler for the NVIDIA Maxwell architecture).

On GPUs the binary compatibility issue is solved by having the driver compile the shader/compute kernel before it's used. As an example, NVIDIA uses PTX (see http://docs.nvidia.com/cuda/parallel-thread-execution/) as an intermediate language in CUDA, which is then compiled by the runtime into the actual ASM.

On modern CPUs, register renaming has already decoupled the physical registers from the instruction set's registers. As an example, a modern Haswell core has over 100 physical registers.
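For anyone who hasn't seen renaming spelled out, here is a toy sketch of the idea. The `Renamer` class, the register names, and the pool size of 16 are all assumptions made for illustration. Real hardware also returns physical registers to the free list when instructions retire, which this sketch leaves out.

```python
class Renamer:
    """Toy register renamer: maps architectural registers to physical ones."""

    def __init__(self, arch_regs, num_physical):
        # Pool of free physical register numbers.
        self.free = list(range(num_physical))
        # Give every architectural register an initial physical home.
        self.map = {reg: self.free.pop(0) for reg in arch_regs}

    def rename(self, dst, srcs):
        # Sources read whichever physical register currently holds the value.
        phys_srcs = [self.map[s] for s in srcs]
        # Every write gets a fresh physical register, so a later write to the
        # same architectural register cannot clobber a value still in use.
        phys_dst = self.free.pop(0)
        self.map[dst] = phys_dst
        return phys_dst, phys_srcs


# Hypothetical three-instruction program, written as (dst, [srcs]):
program = [
    ("rax", ["rbx", "rcx"]),  # rax = rbx + rcx
    ("rdx", ["rax"]),         # rdx = rax + 1   (true dependency on line 1)
    ("rax", ["rsi", "rdi"]),  # rax = rsi + rdi (reuses the name rax)
]

renamer = Renamer(["rax", "rbx", "rcx", "rdx", "rsi", "rdi"], num_physical=16)
renamed = [renamer.rename(dst, srcs) for dst, srcs in program]

for (dst, srcs), (p_dst, p_srcs) in zip(program, renamed):
    print(f"{dst} <- {srcs}   becomes   p{p_dst} <- {['p%d' % p for p in p_srcs]}")
```

The two writes to `rax` land in different physical registers, so the third instruction no longer has to wait for the second one to finish reading the old value. Only the true dependency between the first two instructions remains.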

> RISC created a huge local minimum by speeding up C code to the exclusion of other languages

Would you mind expanding on this?