| HN Mirror



	by ssivark 52 days ago
	How about having a large pool of unified memory and expanding the next layer (L3?) of cache to accommodate more of the CPU's low-latency RAM usage?

2 comments

marcosdumay 52 days ago

As a rule, increasing the size of cache increases its latency, and how much of it you can use is capped by the quality of your cache management algorithms and the latency of the level above it.

Since CPUs are highly optimized, both increasing the latency of the main memory and increasing the size of L3 will probably lead to larger L3 latency.
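To make the tradeoff concrete, here is a minimal sketch using the textbook average memory access time (AMAT) formula: hit time plus miss rate times miss penalty. All numbers are made up for illustration and are not measurements of any real CPU; the function name `amat` and its parameters are hypothetical.

```python
def amat(hit_time_ns: float, miss_rate: float, miss_penalty_ns: float) -> float:
    """Average memory access time for a single cache level, in nanoseconds."""
    return hit_time_ns + miss_rate * miss_penalty_ns


# Assumed, illustrative figures only.
MAIN_MEMORY_PENALTY_NS = 80.0

# A smaller L3: faster to hit, but it misses more often.
small_l3 = amat(hit_time_ns=10.0, miss_rate=0.20, miss_penalty_ns=MAIN_MEMORY_PENALTY_NS)

# A larger L3: fewer misses, but each hit is slower.
large_l3 = amat(hit_time_ns=20.0, miss_rate=0.10, miss_penalty_ns=MAIN_MEMORY_PENALTY_NS)

print(f"small L3: {small_l3:.1f} ns, large L3: {large_l3:.1f} ns")
```

With these assumed inputs, halving the miss rate does not pay for the doubled hit time, so the larger cache comes out slightly slower on average. Different numbers can flip the result, which is why the outcome depends on how well the management algorithm reduces misses and on the latency of the level above.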


trumpdong 52 days ago

We might even decide to put 32GB of high-latency cache on the system board and then 12GB of throughput-optimized main memory close to the GPU. ;)


marcosdumay 52 days ago

Did you mean 128GB (instead of 12GB)?

And yes, an L4 cache can be one way out of that problem. Another way is making the L3 cache lines wider and working the hell out of improving your management algorithm.

It's not a theoretically impossible problem. It's also not something you can solve automatically with a bit more money or some simple decisions. It's possible this is the best architecture available, but it's not certain by any means.


trumpdong 51 days ago

I mean 12GB, an amount that is typical in such a system today, which you can buy at any computer store.


saagarjha 51 days ago

Yeah, but unfortunately I hear trying to get more than that is quite hard.


marcosdumay 51 days ago

Oh, I entirely misunderstood your comment :)


Melatonic 51 days ago

I think that's basically what Cerebras is doing?
