> It was only in 2025, as memory prices began an unprecedented surge, that the memory makers started to build new fabs targeted at HBM, all slated to start producing chips in 2027 or 2028.
The RAM shortage is predicated on both the huge datacenter buildouts (many of which are already mired in delays, with a few even cancelled outright) and the massive memory purchase commitments various hyperscalers have made - hyperscalers who seem to be running short on cash lately...
History? This isn't the first RAM shortage. When one happens, producers build more fabs. The fabs come online, the availability of memory shoots up, and the shortage goes away, usually replaced by a glut.
If you want to argue that this is different from all previous RAM shortages, you can, but the burden of proof is on you to show the difference.
There is certainly economic pressure to create an exponential demand for tokens, but we've already seen a pullback from the costly "token maxing" companies were pushing last year.
It's also pretty unclear to what degree the RAM shortage is driven by inference (versus by training). We're rapidly approaching the point where frontier models are "good enough" for everyday use, and at some point we're going to hit diminishing returns on training new trillion-parameter models...