by zozbot234
2244 days ago
Traditional addressing modes have been largely abandoned because modern architectures are based on the load-store principle. Simplicity has little to do with it, and referring to that whole shift in design as "complex" vs. "simple" instruction sets is a bit of a misnomer. Besides, well-designed architectures are not exactly lacking in ease-of-use.
First, mod-r/m addressing on x86 is fairly traditional and can often save considerable calculation over a "simpler" addressing mode (given the opportunities for add-and-scale operations).
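To make the add-and-scale point concrete, here is a rough Python model of the address arithmetic a single base + index*scale + displacement operand performs. The function name and the example values are made up for illustration; this is a sketch of the arithmetic only, not of any encoding detail.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # addresses wrap at 64 bits in long mode


def effective_address(base, index=0, scale=1, disp=0):
    """Model the x86 address computation base + index*scale + disp.

    Hypothetical helper for illustration only.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 only encodes scale factors 1, 2, 4 and 8")
    return (base + index * scale + disp) & MASK64


# One memory operand: element 5 of an array of 8-byte items that sits
# 16 bytes past the base pointer.
one_operand = effective_address(base=0x1000, index=5, scale=8, disp=16)

# The same address built from separate ALU steps, as you would on a
# machine whose loads cannot scale an index for you.
tmp = 5 << 3          # shift: index * 8
tmp = tmp + 0x1000    # add the base
tmp = tmp + 16        # add the displacement (some ISAs fold this into the load)

print(hex(one_operand), hex(tmp))
```

Both paths land on the same address; the difference is that the first one costs no separate arithmetic instructions and no scratch register.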
Second, treating x86 machines as load/store architectures passes up the opportunity to achieve improved code density and increased execution bandwidth from "microfusion" - this is when an operation (e.g. "add") is done with a memory operand. Microfusion, for those not familiar with it, allows two "micro-ops" (aka uops) that originate from the same instruction to be "fused" - that is, issued and retired together (even though they are executed separately).
This can occasionally - in code that has already been militantly tuned to within an inch of its life - yield speedups, as Skylake and similar can only issue and retire 4 uops per cycle. However, there are 8 execution ports (of which only 4 do traditional 'computation'). Carefully designed code can take advantage of the fact that issue/retire happen in the "fused domain" while execution happens in the "unfused domain" - so you can sometimes get 4 computations and 1 load per cycle even on a 4-issue machine.
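The fused/unfused accounting is easy to get lost in, so here is a toy Python tally of it. It assumes the 4-wide issue figure quoted above and models nothing else: no port contention, no dependencies, no cache misses. The instruction table and names are invented for the example.

```python
from dataclasses import dataclass

ISSUE_WIDTH = 4  # fused-domain uops issued/retired per cycle (figure quoted above)


@dataclass(frozen=True)
class Instr:
    name: str
    fused_uops: int  # issue/retire slots consumed (fused domain)
    alu_uops: int    # uops sent to a computation port (unfused domain)
    load_uops: int   # uops sent to a load port (unfused domain)


# Hypothetical instruction table for this sketch.
ADD_REG = Instr("add r, r", fused_uops=1, alu_uops=1, load_uops=0)
ADD_MEM = Instr("add r, [m]", fused_uops=1, alu_uops=1, load_uops=1)  # fused pair
LOAD = Instr("mov r, [m]", fused_uops=1, alu_uops=0, load_uops=1)


def summarize(instrs):
    """Count uops in both domains and the cycles needed just to issue them."""
    fused = sum(i.fused_uops for i in instrs)
    return {
        "fused": fused,
        "alu": sum(i.alu_uops for i in instrs),
        "loads": sum(i.load_uops for i in instrs),
        "issue_cycles": -(-fused // ISSUE_WIDTH),  # ceiling division
    }


# Load-store style: a separate load feeding four register adds.
load_store = summarize([LOAD, ADD_REG, ADD_REG, ADD_REG, ADD_REG])

# Memory-operand style: the load rides along with one of the adds.
mem_operand = summarize([ADD_MEM, ADD_REG, ADD_REG, ADD_REG])

print(load_store)
print(mem_operand)
```

Both sequences do the same 4 computations and 1 load in the unfused domain, but the load-store version needs 5 issue slots (so it spills into a second cycle in this model) while the memory-operand version fits in 4.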
I was trained on MIPS and Alpha, so of course old habits die hard, and it's always tempting to go old school and design everything to act as if the underlying machine is a load-store architecture. However, this (a) isn't necessary on x86 and (b) often won't be faster.
The other blow against load-store is that a modern out-of-order architecture can hoist the load and separate it from the use anyway - and it doesn't have to consume a named register to do it (it will use a physical register, of course, but x86 has way more physical registers than it has names for registers). This of course is a bigger deal given the rather impoverished register count of x86, so it is, in the words of a former Intel colleague on a different topic, a "cure for a self-inflicted injury".