| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by pslam 4024 days ago
	That's a fun program. Similar techniques to this are used to create "ROP" (return-oriented programming) exploits. You get enough building blocks to get general purpose programming, and it's "just" matter of mapping to those building blocks. X86 "mov" is a bit of a cheat, though. It's a single mnemonic for what's really dozens of quite different instructions. For example, ARM has "mov" for register-register operations, but "ldr/str" for register-memory, while X86 just uses mov for everything. Even ARM mov optionally puts a barrel shifter into the path, so it's not that pure either.

3 comments

userbinator 4024 days ago

Even if they weren't a single mnemonic, if you read the paper you'll find that reg = mem[reg + C], mem[reg + C] = reg, and reg = C appear to be the only 3 instruction types that are actually needed; these are only register-memory and register-constant data transfers. Even register-register isn't needed.

link

awalton 4024 days ago

register-register movs are needed because compilers are not the world's smartest code generators, and we don't have an infinite number of registers. The biggest problem arises when some calling convention expects certain things to be in certain registers - unless you do some pretty extreme long term planning in your register allocation code in the compiler, you're pretty likely to eventually wind up with some piece of information in a register that you need to move to another register before making or returning from a call.

Of course, modern processors don't really care about register-register moves because they have hundreds of hidden intermediate registers and register renaming which makes those types of moves "free." (e.g. Haswell has 168 GP registers, only 16 of which are exposed through the AMD64 ISA.)

link

gsg 4023 days ago

Note that you don't need a dumb compiler, register pressure, or an instruction with physical register constraints to require a copy. Two-address instructions suffice: simply have a register for the first operand that contains a value that is used later. A copy is required to preserve the value stomped by the mutating instruction.

In any case I believe the point of the GP is that any necessary copy can be done by going through memory, so register-register moves can be dispensed with where minimalism is the goal.

link

rdc12 4024 days ago

Do you have a citation for the 168GP registers, just wondering if Intel have "offically" stated the number they use in a given microarch.

link

pbsd 4024 days ago

You can find an official Intel source for Haswell in here (page 15): http://pages.cs.wisc.edu/~rajwar/papers/ieee_micro_haswell.p...

link

sp332 4024 days ago

I might be misreading this, but I think it's on slide 24 of this PDF http://www.hotchips.org/wp-content/uploads/hc_archives/hc25/...

link

pslam 4023 days ago

Yes, and these are effectively 3 different instructions sharing the same mnenomic. If you assemble them:

        mov     0x1000(%rax,%rax), %eax
        mov     %eax, 0x1000(%rax,%rax)
        mov     $7, %eax

You get:

        0000000000000000	8b840000100000  	movl	0x1000(%rax,%rax), %eax
        0000000000000007	89840000100000  	movl	%eax, 0x1000(%rax,%rax)
        000000000000000e	b807000000      	movl	$0x7, %eax

Which is basically 0x8b->Load, 0x89->Store, 0xb8->Constant.

link

vardump 4024 days ago

ARM mov can be used as a JMP <addr in reg>. Program counter is in fact register R15 in ARM.

link

StillBored 4023 days ago

This changed with V8, IIRC the PC can only be updated on branch or an exception...

link

jimmaswell 4024 days ago

If you're going down that path, all of x86 is used as a 'bytecode' that turns into microcode instructions anyway in modern processors.

link

pmalynin 4024 days ago

Not exactly, most instructions are not "microcoded" for performance reasons. But there is a class of instructions that is.

link

delroth 4024 days ago

Intel CPUs translate x86 to some internal RISC-like "uops" before doing optimizations and execution. This is a widely documented fact (though Intel doesn't really talk publicly about it as far as I know): see for example §2.1 in http://www.agner.org/optimize/microarchitecture.pdf

link

mattgodbolt 4023 days ago

For a (hopefully!) more accessible introduction to the crazy world of what really happens inside the x86, I did a talk at GOTO: http://youtu.be/hgcNM-6wr34 - it's a really fascinating subject!

link

userbinator 4024 days ago

For marketing reasons, Intel made quite a big commotion over the fact that its Pentium was "RISC-like" when it was first released, but in reality what it meant was using a decoder that could emit multiple uops in parallel instead of sequentially like it was in the 486.

Even CPUs considered RISC today, like ARMs, need to decode instructions into one or more wider uops for efficient execution.

link