| HN Mirror


	by zozbot234 239 days ago
	Wrt. language models/transformers, the neural engine/NPU is still potentially useful for the prompt pre-processing (prefill) step, which is generally compute-bound. Token generation, by contrast, is limited by memory bandwidth, so GPU compute with neural/tensor accelerators is preferable there.
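	A rough way to see the compute-bound vs. bandwidth-bound split is a roofline-style estimate. The sketch below is only illustrative: it counts weight reads for a single matmul, ignores activation and KV-cache traffic, and the peak-throughput and bandwidth figures are made-up placeholders rather than specs of any real NPU or GPU. The function names are hypothetical as well.

```python
def arithmetic_intensity(n_tokens: int, bytes_per_weight: float = 2.0) -> float:
    """FLOPs performed per byte of weights read, for one matmul.

    An (n_tokens x d_in) @ (d_in x d_out) matmul costs about
    2 * n_tokens * d_in * d_out FLOPs and reads d_in * d_out * bytes_per_weight
    bytes of weights, so d_in * d_out cancels out.
    Simplification: activation traffic is ignored.
    """
    return 2.0 * n_tokens / bytes_per_weight


def bound(n_tokens: int, peak_flops: float, mem_bandwidth: float,
          bytes_per_weight: float = 2.0) -> str:
    """Classify a matmul as 'compute' or 'memory' bound on a given device."""
    ridge = peak_flops / mem_bandwidth  # FLOPs/byte needed to saturate compute
    ai = arithmetic_intensity(n_tokens, bytes_per_weight)
    return "compute" if ai >= ridge else "memory"


# Hypothetical accelerator (assumed numbers): 20 TFLOP/s, 100 GB/s.
PEAK = 20e12
BW = 100e9

print(bound(512, PEAK, BW))  # prompt processing: 512 tokens share each weight read
print(bound(1, PEAK, BW))    # token generation: every weight read serves one token
```

	With fp16 weights, processing a 512-token prompt reuses each weight 512 times, while generating one token at a time reuses it once, which is why the two phases stress different parts of the hardware.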

1 comment

I think I'd still rather have the hardware area put into tensor cores for the GPU instead of this unit that's only programmable with ONNX.