| Author in the old thread (https://news.ycombinator.com/item?id=19262249) says > An x86-64 CPU has sixteen integer registers, but 100-200 physical integer registers. Every time an instruction writes to, say, RAX the renamer chooses an available physical register and does the write to it, recording the fact that RAX is now physical-register #137. This allows the breaking of dependency chains, thus allowing execution parallelism. I'm curious why they have so many more physical registers than... logical? registers. I have a couple of guesses: * Physical registers are physically cheaper to add than logical registers. * Adding logical registers breaks backwards compatibility, or at best means you get no speedup on things written (/compiled) for fewer logical registers. Adding physical registers lets you improve performance without recompiling. * Adding logical registers increases complexity for people writing assembly and/or compilers. Adding physical registers moves that complexity to people designing CPUs. Are some of these correct? Other reasons I'm missing? |
A compiler's IR is a directed graph, where nodes are instructions and tagged by an assigned register. It would be pleasant to assign each node a distinct register, but then machine instructions would be unacceptably large. So the compiler's register allocator compresses the graph, by finding nodes that do not interfere and assigning them the same register.
The CPU's register renamer then reinflates this graph, by inspecting the dataflow between instructions. If two instructions share a register tag, but the second instruction has no dependence on the first, then they may be assigned different physical registers.
`xor eax, eax` has no dependence on any instruction, so it can be specially recognized as allocating a new physical register. In this way of thinking, `xor eax, eax` doesn't zero anything, but is like malloc: it produces a fresh place to read/write, that doesn't alias anything else.