by Tuna-Fish 3302 days ago
The reason he gave is not the main reason L1s don't grow. The main reason is latency.

Increasing cache size grows latency in two ways: every doubling of the cache size adds one mux to the select path, and every doubling increases the wire delay from the most distant element by ~sqrt(2). Both of these add directly to the critical path, so spending more time on them means increasing the cache latency.
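To make the two effects concrete, here is a rough toy model of that scaling. The function name, the 32 KiB baseline, and the unit delays are all made-up illustration values, not figures for any real design; only the "one mux per doubling" and "~sqrt(2) wire delay per doubling" rules come from the comment above.

```python
import math

def added_delay(size_kib, base_kib=32, mux_delay=1.0, base_wire_delay=1.0):
    """Toy model of how cache size affects the two critical-path terms.

    All delays are in arbitrary units and are hypothetical.
    Returns (extra mux delay, wire delay to the most distant element).
    """
    doublings = math.log2(size_kib / base_kib)
    # One extra mux level on the select path per doubling of capacity.
    extra_mux = mux_delay * doublings
    # Wire delay to the farthest element grows by ~sqrt(2) per doubling.
    wire = base_wire_delay * math.sqrt(2) ** doublings
    return extra_mux, wire

for size in (32, 64, 128, 256):
    mux, wire = added_delay(size)
    print(f"{size:4d} KiB: +{mux:.0f} mux levels, wire delay x{wire:.2f}")
```

Both terms only ever grow with size, which is the whole point: they stack on top of each other on the same path.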

The size of a cache is always a tradeoff against the latency of the cache. If this were not true, there would only be a single cache level that was both large and fast. However, making something both large and fast is impossible, so instead we have stacked cache levels, starting with a very fast but small cache followed by progressively slower and larger ones.

1 comments

Hi, it is the main reason L1 hasn't grown.

By your reasoning, no cache should be able to grow, because then their latency would increase too much. But instead, all other CPU caches are growing basically with iso-latency. The reason this is possible is technology scaling. Anyway...

But yes, the L1 does have to be small and fast, but it doesn't have to be that small to be that fast. It has to be that small because of virtual indexing combined with the cost of adding ways breaking other design constraints (possibly a latency constraint, fine). But you could grow the L1 by adding sets and get your required latency.
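For anyone unfamiliar with the virtual indexing constraint mentioned here, a small sketch. It assumes a virtually indexed, physically tagged L1 with no extra alias handling, 4 KiB pages, and 64-byte lines; those parameters and the function names are my assumptions for illustration, not something stated in the thread. Under those assumptions the index and line-offset bits must fit inside the page offset, so the bytes per way cannot exceed the page size.

```python
PAGE_SIZE = 4096   # assumed 4 KiB pages
LINE_SIZE = 64     # assumed 64-byte cache lines

def num_sets(cache_size, ways, line_size=LINE_SIZE):
    """Number of sets for a given capacity and associativity."""
    return cache_size // (ways * line_size)

def max_vipt_size(ways, page_size=PAGE_SIZE):
    """Largest capacity whose index bits stay within the page offset."""
    return page_size * ways

def index_fits_in_page_offset(cache_size, ways, page_size=PAGE_SIZE):
    """True if each way is no larger than one page."""
    return cache_size // ways <= page_size

# 32 KiB, 8-way: 64 sets, each way is exactly one page.
print(num_sets(32 * 1024, 8), index_fits_in_page_offset(32 * 1024, 8))
# 64 KiB by doubling the sets at 8 ways: index spills past the page offset.
print(num_sets(64 * 1024, 8), index_fits_in_page_offset(64 * 1024, 8))
# 64 KiB by doubling the ways instead: fits, but now 16 ways to compare.
print(num_sets(64 * 1024, 16), index_fits_in_page_offset(64 * 1024, 16))
```

So with virtual indexing in place, the only way to grow within the constraint is more ways, which is where the cost of adding ways comes in.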

> By your reasoning, no cache should be able to grow, because then their latency would increase too much. But instead, all other CPU caches are growing basically with iso-latency. The reason this is possible is technology scaling. Anyway...

The problem is that technology scaling only gives you roughly enough to keep up with the speed of the CPU. Cache latencies measured in nanoseconds keep going down, but cache latencies measured in clock cycles are pretty stagnant at the same sizes. And when Intel added some more L3, they also relaxed its latency, and when they recently cut its latency a little, they did so by cutting its size.
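The nanoseconds versus cycles distinction is just a multiplication by clock frequency. A quick sketch, with made-up numbers chosen only to show the effect (they are not measurements of any real CPU):

```python
def latency_cycles(latency_ns, clock_ghz):
    """Convert an access latency in nanoseconds to clock cycles.

    cycles = ns * (cycles per ns), and a clock of N GHz is N cycles per ns.
    """
    return latency_ns * clock_ghz

# Hypothetical older part: slower cache, slower clock.
old = latency_cycles(latency_ns=2.0, clock_ghz=2.0)
# Hypothetical newer part: cache is twice as fast in ns, clock is twice as fast.
new = latency_cycles(latency_ns=1.0, clock_ghz=4.0)

print(old, new)  # same cycle count despite the halved latency in ns
```

If the cache speeds up only as much as the clock does, the cycle count the pipeline sees does not move.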

> But yes, the L1 does have to be small and fast, but it doesn't have to be that small to be that fast. It has to be that small because of virtual indexing combined with the cost of adding ways breaking other design constraints (possibly a latency constraint, fine). But you could grow the L1 by adding sets and get your required latency.

No, you couldn't. The added latency of increasing the size would be simply too much. I know for a fact that the L1 load latency is currently one of the most important critical paths in Intel CPUs -- any increase in L1 size would mean that you have to reduce clock speeds.