Has anyone done a study on the optimal number of registers to have? The website answers the register question well, but it leads to a further question: if registers are so great, why stick with just 16/32/64/n registers? Why not have more? After all, x86-64 and ARM64 decided that having more suited them.

In the end it must come down to a compromise. The downsides of having more registers might include some of the following:

* Larger instruction encodings (a larger register space has to be encoded in the bit patterns of each instruction).
* Increased latency for interrupts? E.g. if your CPU has 1000 registers and an interrupt occurs, you're going to end up having to save all 1000 registers somewhere. There could be some HW assist, but you'll pay the price somewhere.
* Extra cost for saving registers in functions. Sure, it depends on the ABI, as some registers will be 'scratch' and not preserved between function calls, but if you've got more registers you'll end up wanting to save more of them.
* Algorithms might not need all the registers. I wonder what algorithm uses 20 live variables? 50? 100? At some point, those extra registers go unused.
* Registers still need to be 'spilled' to memory. In an extreme case, you could imagine compiling a small program where every variable maps to a unique register. Ultimate speed! But aside from that optimal case, you'll still end up having to write registers back to memory. It makes no difference having 100 registers if you store the results of every computation...

Anyway, that's all speculation. I was wondering if someone had done a study. You could construct a virtual, bespoke CPU with n registers, then make gcc compile some SPEC benchmarks for that ISA and model it to see how much each extra register helps. You could graph register count vs. simulated runtime and see where the sweet spot is.
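To put rough numbers on the encoding point, here is a minimal back-of-the-envelope sketch. It assumes a fixed-width, three-operand instruction format (two sources and one destination), which is an assumption for illustration and not something any particular ISA is required to use; the function name `operand_field_bits` is hypothetical.

```python
def operand_field_bits(num_registers: int, operands: int = 3) -> int:
    """Bits needed to name `operands` registers out of `num_registers`.

    Assumes each register operand gets its own fixed-width field
    (an illustrative assumption, not a description of a real ISA).
    """
    # Smallest field width that can index registers 0 .. num_registers - 1.
    bits_per_register = (num_registers - 1).bit_length()
    return operands * bits_per_register


if __name__ == "__main__":
    for n in (16, 32, 64, 1000):
        print(f"{n:>5} registers: {operand_field_bits(n)} bits of register fields")
```

Under those assumptions, 32 registers cost 15 bits per three-operand instruction, while 1000 registers would cost 30 bits, leaving almost nothing for the opcode in a 32-bit instruction word.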
http://arxiv.org/ftp/arxiv/papers/1205/1205.1871.pdf
Here's a good thread discussing this: https://groups.google.com/forum/#!searchin/comp.arch/number$...