by userbinator
3929 days ago
The problem is that a simpler decoder doesn't compensate for the extra instruction cache needed to reach the same hit rates and performance. That is bad for power efficiency, since L1 cache needs to run at full core speed and modern CPUs devote vastly more transistor area to the cache than to the decoder. The increased memory traffic from lower hit rates doesn't help either.

This article shows that effect quite clearly: http://www.extremetech.com/extreme/188396-the-final-isa-show...

The x86s have 32K of L1 icache, the ARMs 32K or 16K, and the MIPS Loongson has 64K. Also, the Loongson does not support MIPS16, whereas the ARMs all support Thumb. If you look at the total energy consumed, the MIPS is noticeably worse than x86 or ARM: http://www.extremetech.com/wp-content/uploads/2014/08/Averag...

In fact, the cache takes so much power that Intel engineers have found it worthwhile to turn off parts of the cache in low-power modes; this feature is called Dynamic Cache Sizing and appears in the later Atom series.
|
It's not that simple. Dynamic power depends on the toggle rate of the flip-flops and the electrical capacitance of the fan-out wires and gates, not on the number of transistors. In a cache, very few storage elements change their state in every cycle, while the decoder performs a lot of work in every cycle.
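
To make that point concrete, here is a minimal sketch of the usual first-order switching-power model, P = alpha * C * V^2 * f, where alpha is the fraction of nodes toggling per cycle. The function name and every number below are hypothetical placeholders chosen only to show the shape of the argument; they are not measurements of any real cache or decoder.

```python
def dynamic_power(alpha, capacitance_f, vdd_v, freq_hz):
    """First-order switching power in watts: alpha * C * V^2 * f.

    alpha         -- activity factor, fraction of nodes toggling per cycle (0..1)
    capacitance_f -- total switched capacitance in farads
    vdd_v         -- supply voltage in volts
    freq_hz       -- clock frequency in hertz
    """
    return alpha * capacitance_f * vdd_v ** 2 * freq_hz


# Hypothetical, illustrative values only (not from any datasheet):
# a large block with little switching vs. a small block that toggles a lot.
big_quiet_block = dynamic_power(alpha=0.01, capacitance_f=10e-9, vdd_v=1.0, freq_hz=1e9)
small_busy_block = dynamic_power(alpha=0.30, capacitance_f=1e-9, vdd_v=1.0, freq_hz=1e9)

print(f"big, quiet block:  {big_quiet_block:.3f} W")
print(f"small, busy block: {small_busy_block:.3f} W")
```

With these made-up inputs the block with ten times the capacitance still draws less switching power, because its activity factor is thirty times lower. The model covers switching power only; leakage is a separate term and is not modeled here.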