| > The way to manufacture more efficient compute now is to do things like put DRAM closer to the chip and even closer integration between CPU and GPU. People have been hyping things like this for decades, but then it turns out that applications that need to frequently share data between a CPU and GPU faster than PCIe can handle are pretty uncommon. Meanwhile, putting them closer together has some significant real disadvantages, because then you're trying to deliver more power and dissipate more heat over a smaller area instead of putting more physical separation between the two largest loads in the machine. Notice that high-end PC GPUs are significantly faster than any of Apple's integrated GPUs, and that's why. > There are also latency and bandwidth benefits to how they set up their RAM, just from pure physics. Soldering RAM has a modest latency advantage over SODIMMs at the most extreme timings, and CAMM turns even that into basically nothing. > And chip manufacturing is moving towards chiplets where you have cores manufactured separately and then wired together at nanoscale level on top of a silicon interposer. You're describing a move to less integration. Those cores were originally on the same die, and the change has no real effect on modularity. The user doesn't even have to know that some Ryzen CPUs have a separate I/O die or more than one compute die; they all still fit into the same socket and are even interchangeable with the ones that have only a single die. |