by snvzz
1360 days ago
>Variable length instruction coding for instance, which means a surprising amount of power is dedicated to circuitry which is just to find where the instruction boundaries are for speculative execution.

This does apply to x86 and m68k, where "variable" means anything from 1 to roughly 16 bytes, and dealing with that means brute-forcing decode at every possible starting point. As a result, Intel and AMD have both found 4-wide decode to be a practical limit.

It does not apply to RISC-V, where you get either one 32-bit instruction or two 16-bit ones. The added complexity of the C extension is negligible, to the point where if a chip has any cache or ROM in it, using C becomes a net benefit in area and power.

Therefore, ARMv8 AArch64 made a critical mistake in adopting a fixed 32-bit opcode size. We can see that mistake in practice in the L1 cache size the Apple M1 needed to compensate for poor code density.

L1 is never free. It is always *very* costly: its size dictates the area the cache takes, the clocks the cache itself can achieve (which in turn caps the speed of the CPU), and the power the cache draws.
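To make the RISC-V side concrete, here is a rough sketch of how little it takes to find instruction boundaries there. It assumes only 16-bit and 32-bit encodings are in use (the longer reserved formats are ignored), and the name `rvc_boundaries` is made up for illustration. The only rule it relies on is that a 16-bit parcel whose two lowest bits are `11` starts a 32-bit instruction, and anything else is a 16-bit compressed one.

```python
def rvc_boundaries(code: bytes) -> list:
    """Return the byte offset of every instruction start in a RISC-V stream.

    Illustrative only: assumes the stream contains nothing but 16-bit (C)
    and 32-bit instructions, and that offset 0 is an instruction start.
    """
    offsets = []
    pc = 0
    while pc + 2 <= len(code):
        # RISC-V instructions are stored as little-endian 16-bit parcels.
        parcel = int.from_bytes(code[pc:pc + 2], "little")
        offsets.append(pc)
        # Low two bits == 0b11 -> 32-bit instruction, otherwise 16-bit.
        pc += 4 if (parcel & 0b11) == 0b11 else 2
    return offsets


# c.nop (0x0001), addi x0, x0, 0 (0x00000013), c.nop (0x0001)
stream = bytes.fromhex("0100" "13000000" "0100")
boundaries = rvc_boundaries(stream)
print(boundaries)
```

The length of each instruction falls out of two bits of its first parcel, which is the contrast with x86, where the length depends on prefixes, opcode, ModRM and so on.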