by josmala 2086 days ago

Split the cycle time into two components: transistor delay and interconnect delay. With each shrink, interconnect delay per unit length increases by about as much as the length decreases, so total interconnect delay stays roughly constant, while transistor delay halves. Repeated halving eventually pushed transistor delay below interconnect delay. While transistor delay dominated, doubling transistor speed gave huge improvements; now it gives much less.

IPC also has diminishing returns: the easily available improvements were taken early, and everyone is now chasing what is left. The Pentium Pro brought out-of-order execution (OoO), widened decode from 2 to 3, and went from 2 to 4 micro-ops per cycle. Core 2 went from 1 to 2 FP pipelines. Everyone is hitting the same issues: widening costs O(n^2) power and die area for some structures, while a shrinking fraction of code benefits from the extra width. And that isn't all: widening also increases latency, which means either lower clocks or a higher branch misprediction penalty.
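To make the cycle-time split at the start of this comment concrete, here is a rough sketch of that toy model. It assumes, as a simplification, that interconnect delay stays exactly constant per shrink and transistor delay exactly halves. The starting values (8.0 and 1.0, in arbitrary units) and the function names are made up for illustration, not measured from any real process.

```python
def simulate_shrinks(transistor_delay=8.0, interconnect_delay=1.0, shrinks=6):
    """Return (generation, transistor, interconnect, cycle_time) per shrink.

    Toy model only: transistor delay halves each shrink, interconnect
    delay is held constant. Starting values are arbitrary assumptions.
    """
    rows = []
    for generation in range(shrinks + 1):
        cycle_time = transistor_delay + interconnect_delay
        rows.append((generation, transistor_delay, interconnect_delay, cycle_time))
        transistor_delay /= 2.0  # the shrink halves transistor delay only
    return rows


def per_shrink_speedups(rows):
    """Clock speedup of each generation relative to the previous one."""
    return [prev[3] / cur[3] for prev, cur in zip(rows, rows[1:])]


if __name__ == "__main__":
    rows = simulate_shrinks()
    for generation, transistor, interconnect, cycle_time in rows:
        print(f"gen {generation}: transistor={transistor:.3f} "
              f"interconnect={interconnect:.3f} cycle={cycle_time:.3f}")
    print("speedup per shrink:", [round(s, 3) for s in per_shrink_speedups(rows)])
```

In this sketch the first shrink gives close to a 2x faster clock, but once the transistor term drops below the fixed interconnect term, each further shrink gains less and less.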

Improving CPUs has become harder simply because all the easier things have been done, and improving one metric often makes another slightly worse.

edit: Just to add, it isn't impossible to improve; it has just become harder. Knowing this was part of the reason I wasn't too interested in upgrading until I saw actual benchmarks. I then found out that the one thing I cared about had gotten much better during that period.