Hacker News new | ask | show | jobs
by userbinator 4024 days ago
Even if they weren't a single mnemonic, if you read the paper you'll find that reg = mem[reg + C], mem[reg + C] = reg, and reg = C appear to be the only 3 instruction types that are actually needed; these are only register-memory and register-constant data transfers. Even register-register isn't needed.
2 comments

register-register movs are needed because compilers are not the world's smartest code generators, and we don't have an infinite number of registers. The biggest problem arises when some calling convention expects certain things to be in certain registers - unless you do some pretty extreme long term planning in your register allocation code in the compiler, you're pretty likely to eventually wind up with some piece of information in a register that you need to move to another register before making or returning from a call.

Of course, modern processors don't really care about register-register moves because they have hundreds of hidden intermediate registers and register renaming which makes those types of moves "free." (e.g. Haswell has 168 GP registers, only 16 of which are exposed through the AMD64 ISA.)

Note that you don't need a dumb compiler, register pressure, or an instruction with physical register constraints to require a copy. Two-address instructions suffice: simply have a register for the first operand that contains a value that is used later. A copy is required to preserve the value stomped by the mutating instruction.

In any case I believe the point of the GP is that any necessary copy can be done by going through memory, so register-register moves can be dispensed with where minimalism is the goal.

Do you have a citation for the 168GP registers, just wondering if Intel have "offically" stated the number they use in a given microarch.
You can find an official Intel source for Haswell in here (page 15): http://pages.cs.wisc.edu/~rajwar/papers/ieee_micro_haswell.p...
I might be misreading this, but I think it's on slide 24 of this PDF http://www.hotchips.org/wp-content/uploads/hc_archives/hc25/...
Yes, and these are effectively 3 different instructions sharing the same mnenomic. If you assemble them:

        mov     0x1000(%rax,%rax), %eax
        mov     %eax, 0x1000(%rax,%rax)
        mov     $7, %eax
You get:

        0000000000000000	8b840000100000  	movl	0x1000(%rax,%rax), %eax
        0000000000000007	89840000100000  	movl	%eax, 0x1000(%rax,%rax)
        000000000000000e	b807000000      	movl	$0x7, %eax
Which is basically 0x8b->Load, 0x89->Store, 0xb8->Constant.