by etep
3296 days ago
I had wondered why L1 caches are not growing while L2 and L3 capacities continue to grow. A significant limitation on L1 cache size is the fixed tradeoff between associativity and page size (i.e. the 4 KB pages the OS allocates to processes). Because a 4 KB page holds 64 cache lines (at 64 bytes per line), you can have at most 64 cache sets. With an 8-way associative cache this works out to 64 sets × 8 ways × 64 bytes = 32 KB. Using 128 sets would cause aliasing, but with 64 sets the cache index is built entirely from the low-order address bits that index into the page (i.e. the bits not used in the TLB lookup). Thus, the only ways to increase L1 capacity are to:
- totally abandon 4 KB pages in favor of (e.g.) 2 MB pages (not likely)
- increase cache associativity (likely imo)
- stop using a virtually indexed, physically tagged design (not likely imo)
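
To make the arithmetic concrete, here is a rough sketch of the constraint, assuming a 64-byte cache line and a virtually indexed, physically tagged L1 whose set index must fit entirely inside the page offset. The function name and parameters are made up for illustration and do not come from any vendor documentation.

```python
def max_vipt_capacity(page_size: int, line_size: int, ways: int) -> int:
    """Largest alias-free capacity in bytes for a VIPT cache.

    Assumption: the set index may only use page-offset bits, so the
    number of sets is capped at page_size // line_size.
    """
    max_sets = page_size // line_size      # 4096 // 64 = 64 sets
    return max_sets * line_size * ways     # equivalently page_size * ways


if __name__ == "__main__":
    # Vary associativity with 4 KB pages and 64-byte lines.
    for ways in (8, 12, 16):
        kb = max_vipt_capacity(4096, 64, ways) // 1024
        print(f"{ways}-way, 4 KB pages: {kb} KB")

    # Same 8-way cache, but with 2 MB pages instead.
    mb = max_vipt_capacity(2 * 1024 * 1024, 64, 8) // (1024 * 1024)
    print(f"8-way, 2 MB pages: {mb} MB")
```

Under these assumptions the limit is just page size times associativity, which is why the first two options in the list are the ones that move it.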
|
Design Principle 2: Smaller is faster. [1]
BTW, if you look at Agner Fog's latency tables [2], mov mem,r (load) went from 3 cycles in Haswell to 2 cycles in Skylake. So Intel has been concentrating on making L1 faster, which is nice.
By way of comparison, AMD increased their μop cache size in Ryzen, but only slightly: way size went from 6 μops to 8. This matches their increase in EUs.
[1] Patterson and Hennessy. Computer Organization and Design, 5th edition, p. 67.
[2] http://www.agner.org/optimize/instruction_tables.pdf