by magicalhippo 851 days ago
We're already half-way into a heterogeneous future, with chiplets[1] and mixed cores[2][3] etc. Could we expand this to memory, having some soldered (on-chip?) high-speed memory, and then slots for additional DIMMs that are slower, yet still faster than the alternatives?

Or would the cost of the extra complexity of the memory controller likely not be worth it ever?

[1]: https://www.anandtech.com/show/13560/amd-unveils-chiplet-des...

[2]: https://www.intel.com/content/www/us/en/gaming/resources/how...

[3]: https://en.wikipedia.org/wiki/ARM_big.LITTLE

3 comments

> Could we expand this to memory, having some soldered (on-chip?) high-speed memory, and then slots for additional slower, yet faster than the alternatives, DIMMs?

Intel's already doing that with Xeon Max, which has both onboard HBM and an outboard DDR5 interface. It can be configured to run entirely from HBM with no DDR5 installed at all, to use the HBM as a huge cache in front of the DDR5, or to map the HBM and DDR5 into different memory regions and let software decide how to use each. I don't think there's been any indication of that approach filtering down to consumer architectures, though. Intel is talking about doing RAM-on-package there, but without any outboard memory interface alongside it.
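For the "different memory regions" mode, my understanding is that Linux exposes the HBM as extra NUMA nodes that have memory but no CPUs; treat that as an assumption rather than something stated above. Under that assumption, a rough sketch for spotting such nodes from sysfs could look like this (the function name is made up for illustration):

```python
from pathlib import Path


def memory_only_nodes(sysfs_root="/sys/devices/system/node"):
    """Return IDs of NUMA nodes whose cpulist is empty (no CPUs attached).

    Assumption: a separately mapped fast-memory tier shows up as such a
    CPU-less node. The same pattern also matches other memory-only nodes,
    so this is a hint, not proof that a node is HBM.
    """
    node_ids = []
    for node_dir in Path(sysfs_root).glob("node[0-9]*"):
        cpulist_file = node_dir / "cpulist"
        if not cpulist_file.is_file():
            continue
        # An empty cpulist means no CPU belongs to this node.
        if not cpulist_file.read_text().strip():
            node_ids.append(int(node_dir.name[len("node"):]))
    return sorted(node_ids)


if __name__ == "__main__":
    print("CPU-less NUMA nodes:", memory_only_nodes())
```

Software that wants the fast tier would then bind its hot allocations to one of those nodes, for example with `numactl --membind`.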

Obviously high-end consumer CPUs already have about 30MB of on-chip memory, with server CPUs reaching a solid 300MB. We just prefer to call it L2 and L3 cache. If we add more memory in a chiplet format I suspect mainstream CPUs would simply expose (or rather hide) it as L3 or L4 cache.

Most software isn't even NUMA aware, and would completely fail to take advantage of a tiered memory hierarchy if it were given the option. But if we make the fast memory a big cache and let the CPU worry about it, it's a "cheap" win.

Though there is the Xeon Phi, which has about 16GB of on-package memory that can be configured either as cache or as "scratchpad" memory. But of course that's not meant for general-purpose software.
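To put a rough number on the big-cache idea, here's a toy weighted-average model of access cost. The costs and hit rates are hypothetical, picked only to show the shape of the trade-off, and a miss is simplified to cost just the slow-tier access:

```python
def average_access_cost(hit_rate, fast_cost, slow_cost):
    """Average cost per access when hit_rate of accesses land in the fast tier."""
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * fast_cost + (1.0 - hit_rate) * slow_cost


if __name__ == "__main__":
    # Hypothetical relative costs: the fast tier is twice as cheap per access.
    FAST_COST, SLOW_COST = 50.0, 100.0
    for hit_rate in (0.0, 0.5, 0.9):
        cost = average_access_cost(hit_rate, FAST_COST, SLOW_COST)
        print(f"hit rate {hit_rate:.0%}: average cost {cost:.1f}")
```

The point is that software doesn't have to change at all to get the benefit. How big the win is depends entirely on the hit rate the hardware manages to achieve.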

> Obviously high-end consumer CPUs already have about 30MB of on-chip memory,

AMD 7950X3D, a desktop CPU, has 144 MB of L2+L3 cache memory on-chip.

To do this, I assume they're gonna need to stop die shrinks and drastically improve yields.

The reason to separate all the components is to ensure a high percentage of functional pieces.

Chiplets help with yield a lot.
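A back-of-the-envelope way to see why is the simple Poisson yield model, Y = exp(-D * A), where D is defect density and A is die area. This is a textbook approximation, not anything specific to a real process. The die sizes and defect density below are hypothetical, and the sketch assumes chiplets are tested before packaging so that only good dies get assembled:

```python
import math


def poisson_yield(die_area_mm2, defects_per_cm2):
    """Fraction of dies with zero defects under the Poisson yield model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-defects_per_cm2 * area_cm2)


def silicon_per_good_product_mm2(die_area_mm2, dies_per_product, defects_per_cm2):
    """Silicon area fabricated, on average, for each working product.

    Assumes every die is tested individually and bad dies are discarded
    before packaging (ignores packaging losses and interconnect overhead).
    """
    good_fraction = poisson_yield(die_area_mm2, defects_per_cm2)
    return dies_per_product * die_area_mm2 / good_fraction


if __name__ == "__main__":
    DEFECTS_PER_CM2 = 0.1  # hypothetical defect density
    monolithic = silicon_per_good_product_mm2(600.0, 1, DEFECTS_PER_CM2)
    chiplets = silicon_per_good_product_mm2(150.0, 4, DEFECTS_PER_CM2)
    print(f"one 600 mm^2 die:   {monolithic:.0f} mm^2 of silicon per good product")
    print(f"four 150 mm^2 dies: {chiplets:.0f} mm^2 of silicon per good product")
```

Same total area in both cases, but a defect only kills one small die instead of the whole thing. The model leaves out packaging cost and selling partially defective dies as lower bins, both of which matter in practice.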