| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by alecco 668 days ago
	If Cerebras keeps improving it will be a decent contender to Nvidia. Nvidia VRAM-SRAM is a bottleneck. For just inference, it needs to download a model at least once per token (divided by batch size). The bottleneck is not Tensor Cores but memory transfers. They say it themselves. Cerebras fixes that (at a cost of software complexity and narrower target solution).