by Tuna-Fish
763 days ago
The path of loading data from L1 is one of the tightest, most timing-critical parts of a modern CPU. Every cycle of latency here has a clear, measurable impact on performance, and modern designs typically have a 4-5 cycle L1 load-to-use latency. Current AMD cores do really well against Intel ones, despite clocking lower and being weaker on most types of resources, simply because they have a 1-cycle advantage. If you had literally infinite cheap transistors available, it would not be a good idea to spend them on the L1 cache, because this would make the CPU slower.

> L1 cache avoids muxing as much as possible, which is why it takes up so much die space in the first place.

Every time you double the size of a cache, you need to add a single extra mux on the access path, simply to be able to select which half of the cache the result comes from. You also increase the distance that a signal needs to propagate, but I believe that for L1 the muxes dominate.
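To make the doubling argument concrete, here is a rough Python sketch of it as a toy model. It assumes the cache is grown only by repeated doubling and that each doubling adds exactly one 2:1 mux stage, as described above. The function names, the 32 KiB baseline, and the idea of a literal tree of 2:1 muxes are illustrative assumptions, not how any particular CPU is laid out.

```python
def extra_mux_levels(base_kib: int, target_kib: int) -> int:
    """Count the 2:1 mux stages added when a cache grows from base_kib
    to target_kib by repeated doubling (toy model: one stage per doubling)."""
    if base_kib <= 0 or target_kib < base_kib:
        raise ValueError("target must be at least as large as a positive base")
    ratio, remainder = divmod(target_kib, base_kib)
    if remainder or ratio & (ratio - 1):
        raise ValueError("target must be a power-of-two multiple of base")
    return ratio.bit_length() - 1  # log2(ratio)


def select(banks, index):
    """Pick banks[index] using a tree of 2:1 muxes.

    Each loop iteration is one mux stage: it halves the number of
    candidates using one bit of the index. Returns (value, stages).
    Assumes len(banks) is a power of two (hypothetical bank layout).
    """
    stages = 0
    while len(banks) > 1:
        bit = index & 1
        # Each 2:1 mux keeps one element of every adjacent pair.
        banks = [banks[2 * i + bit] for i in range(len(banks) // 2)]
        index >>= 1
        stages += 1
    return banks[0], stages


if __name__ == "__main__":
    for size in (32, 64, 128, 1024):
        print(size, "KiB:", extra_mux_levels(32, size), "extra mux stage(s)")
```

Under this model the number of stages grows with the logarithm of the capacity, so each stage is cheap on its own, but every one of them sits directly on the load-to-use path.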