|
|
|
|
|
by pcwalton
2665 days ago
|
|
No, x86-64 is just as inefficient as AArch64 these days, because of all the REX prefixes. Almost every instruction needs one and it bloats the size tremendously, to the extent that most x86-64 instructions are just about 4 bytes. Measure binary sizes if you don't believe me. |
|
That's only if your code happens to be particularly "64-bit-heavy", or the compiler isn't doing a good job at selecting registers; the original designers (at AMD, not Intel) decided on the prefices (and defaulting to 32-bit for most ops) instead of defaulting all operations to 64-bit in 64-bit mode precisely because it would be better for size and performance --- using their carefully optimised compilers. Plus, what can be done with a single 4-byte instruction on x86 can require multiple 4-byte ARM instructions, and that adds up quickly.
I can't find it at the moment but one of the studies I remember comparing the binary sizes was using GCC, which is widely available and free, but probably one of the worst compilers at x86 size optimisation I've seen. I even recall a remark in that study about how it was generating mostly RISC-like instructions, so in other words they were comparing binaries generated for a RISC CPU using a RISC-oriented compiler with ones generated for a CISC CPU using a RISC-oriented compiler, failing to exploit the full capabilities of a CISC.
I've written x86 Asm for several decades (started with 16-bit --- dating myself here...), and done some occasional MIPS and ARM, and it's very difficult for me to believe that the RISCs have any intrinsic advantage in code density other than the fact that compilers for x86 aren't that great at it; you can write a Fibonacci calculator for the latter in 5 bytes and pushes and pops are single-byte instructions, while on the former even a register-register move is 4 bytes.