| HN Mirror


	by cubefox 106 days ago
	ASICs only work for very small and heavily quantized models. Moreover, they are fixed function hardware, so whenever you have a new model, you have to throw the current chips away and design and buy new ones. That's like buying a new CPU every time a new OS version comes out.

3 comments

lumost 106 days ago

The latest strategies for etching weights into silicon seem like they can be generalized. We currently design GPU/TPU caching on the basis that the weights change frequently. If the weights do not change at all, or change very slowly, then there are other, perhaps more efficient, ways of laying out the memory on the chip, somewhere between permanently etching a model onto silicon and using GPUs designed for graphics computation.

intrasight 106 days ago

I'm assuming that they will do a silicon etching run once a year. Might be an interesting acquisition opportunity for Apple since that's the rhythm of their device release.

lumost 106 days ago

It's a good point; it would be a nice "upgrade story" to get the next-generation model. At a fixed cost of ~$1000 per model, it wouldn't be a bad deal relative to current API costs.
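
As a rough illustration of that comparison, here is a minimal break-even sketch. Only the ~$1000 figure comes from the comment; the API price of $10 per million tokens is a made-up placeholder, the function name is hypothetical, and power, hosting, and throughput are ignored.

```python
def break_even_tokens(chip_cost_usd: float, api_price_per_million_usd: float) -> float:
    """Tokens to process before a fixed-cost chip beats per-token API pricing."""
    if api_price_per_million_usd <= 0:
        raise ValueError("API price must be positive")
    # Cost per token on the API is price / 1e6, so break-even is cost / (price / 1e6).
    return chip_cost_usd / api_price_per_million_usd * 1_000_000


# ~$1000 per model is from the comment; $10 per million tokens is an assumption.
tokens = break_even_tokens(1000.0, 10.0)
print(f"{tokens:,.0f} tokens to break even")
```

Under those assumed numbers the chip pays for itself after about 100 million tokens; a cheaper API pushes the break-even point proportionally higher.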

cubefox 106 days ago

That would be something like an FPGA. FPGAs have been very unpopular so far due to their high cost, and they also only support a relatively small number of weights.

joefourier 106 days ago

That depends on what kind of ASIC you’re talking about. Cerebras can run models like GLM 4.7 with 355B parameters.

cubefox 106 days ago

Cerebras just uses SRAM instead of DRAM. An ASIC would instead hardwire the neural network.

joefourier 105 days ago

Surely it's more of a spectrum? From a CPU, to a TPU, to a chip that hardwires softmax attention but lets you store arbitrary weights, to one that hardwires the weights directly.

surfmike 106 days ago

Google’s training and running all their stuff on ASICs, and it seems to be working out well.

r_lee 106 days ago

They're TPUs: the same thing as GPUs, but specifically for tensor ops.

cubefox 106 days ago

TPUs are not ASICs if they can execute arbitrary models.

AdamN 106 days ago

Forgive my ignorance, but wouldn't a TPU be a kind of ASIC where the application is model inference? The TPU Wikipedia article also says it's an ASIC - we should update it if it's wrong.

cubefox 106 days ago

In the limit, even a CPU could be called an ASIC because certain algorithmic operations (ALU etc) are implemented in hardware. CPU/ASIC are really poles of a gradient, with a CPU implementing very little in hardware and most in software, while an ASIC has very little software and lots of hardware. A TPU is presumably in between. I would argue however that it is closer to a GPU than to a full-blown ASIC, because the weights are stored in memory only, making them software.
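
A loose software analogy for the two ends of that gradient, as a sketch only: it is not how any real chip is built, and the function names and the tiny 2x2 weight matrix are made up for illustration. In the first function the weights are data passed in at run time (the "weights in memory" case); in the second they are baked into the code itself (the "hardwired" case).

```python
def matvec(weights, x):
    """Generic layer: weights are ordinary data, so any model can be swapped in."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def hardwired_layer(x):
    """Fixed layer: the same 2x2 weights are written into the code as constants."""
    return [1 * x[0] + 2 * x[1], -1 * x[0] + 0 * x[1]]


# Hypothetical weights, chosen to match the constants in hardwired_layer.
weights = [[1, 2], [-1, 0]]
print(matvec(weights, [3, 4]))
print(hardwired_layer([3, 4]))
```

Both calls compute the same result, but changing the model means passing a different `weights` list in the first case and rewriting the function in the second, which is roughly the trade-off the thread is arguing about.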
