by joefourier 102 days ago
That depends on what kind of ASIC you’re talking about. Cerebras can run models like GLM 4.7 with 355B parameters.

Cerebras just uses SRAM instead of DRAM. An ASIC would instead hardwire the neural network.
Surely it's more of a spectrum? From a CPU, to a TPU, to a chip that hardwires softmax attention but lets you store arbitrary weights, to one that hardwires the weights directly.
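
To make the two ends of that spectrum concrete, here is a toy software analogy in Python. It is only a sketch, not a description of any real chip: the names `linear` and `hardwired_linear` and the weight values are made up for illustration. The first function fixes the operation but takes the weights as data; the second bakes the weights into the logic itself.

```python
# Operation is fixed (a matrix-vector product), but the weights are data
# that can be swapped out, like a chip that stores arbitrary weights.
def linear(weights, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

# Hypothetical 2x2 weights baked directly into the computation,
# like a chip that hardwires one specific network.
def hardwired_linear(x):
    x0, x1 = x
    return [2.0 * x0 - 1.0 * x1, 0.5 * x0 + 3.0 * x1]

weights = [[2.0, -1.0], [0.5, 3.0]]
x = [1.0, 2.0]

# Same result either way; only the first can run a different model.
print(linear(weights, x))
print(hardwired_linear(x))
```

Changing the model means passing different `weights` to the first function, but rewriting the second one, which is roughly the trade-off between the last two points on the spectrum.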