| HN Mirror

They use separate memory servers, networked memory adjacent adjacent compute with small amounts of fast local memory.

Waferscale severely limits bandwidth once you go beyond SRAM, because with far less chip perimeter per unit area there is less place to hook up IO.