	by bee_rider 169 days ago
	There are also CPU extensions like AVX512-VNNI and AVX512-BF16. Maybe the idea of sending work out to a separate card that holds your model will eventually go away. Inference isn't too memory-bandwidth hungry, right?
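
	One rough way to sanity-check the bandwidth question is a back-of-the-envelope bound: if generating each token requires reading every weight from memory once, the decode rate cannot exceed memory bandwidth divided by model size. The sketch below works through that arithmetic. The parameter count, the bandwidth figures, and the function name are illustrative assumptions, not measurements of any particular CPU or card, and the bound ignores caches, batching, and compute limits.

```python
def max_tokens_per_second(n_params, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode rate, assuming every weight
    is read from memory exactly once per generated token."""
    model_bytes = n_params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes


# All numbers below are assumed, round figures chosen for illustration.
N_PARAMS = 7e9  # hypothetical 7-billion-parameter model

FORMATS = {"bf16": 2, "int8": 1}  # bytes per weight
DEVICES = {
    "cpu_dram": 100e9,          # assumed ~100 GB/s system memory
    "accelerator_hbm": 1000e9,  # assumed ~1 TB/s on-card memory
}

for fmt, nbytes in FORMATS.items():
    for device, bandwidth in DEVICES.items():
        rate = max_tokens_per_second(N_PARAMS, nbytes, bandwidth)
        print(f"{fmt:5s} {device:16s} <= {rate:6.1f} tokens/s")
```

	Under these assumptions the bound scales linearly with bandwidth and inversely with bytes per weight, so narrower formats such as BF16 or INT8 help with memory traffic as well as with arithmetic throughput.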