	by cgdl 336 days ago
	I'd say LLM inference requires both memory capacity and memory bandwidth. Cerebras provides the bandwidth with on-chip SRAM, but not the capacity (an entire wafer has only 44 GB of SRAM).
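
	To put rough numbers on the capacity vs. bandwidth point, here is a back-of-envelope sketch. Only the 44 GB per-wafer figure comes from the comment; the model size (70B parameters), the 2 bytes per parameter (FP16/BF16), and the bandwidth value are illustrative assumptions, and the function names are made up for this example. It counts weights only and ignores KV cache and activations, so real capacity needs would be higher.

```python
import math

WAFER_SRAM_GB = 44  # per-wafer SRAM capacity quoted in the comment


def weight_footprint_gb(n_params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


def wafers_needed(n_params: float, bytes_per_param: float,
                  sram_gb: float = WAFER_SRAM_GB) -> int:
    """Capacity side: wafers required just to fit the weights in SRAM."""
    return math.ceil(weight_footprint_gb(n_params, bytes_per_param) / sram_gb)


def max_decode_tokens_per_s(n_params: float, bytes_per_param: float,
                            bandwidth_gb_per_s: float) -> float:
    """Bandwidth side: upper bound on single-stream decode speed, assuming
    every weight is read once per generated token (dense model, batch 1)."""
    return bandwidth_gb_per_s / weight_footprint_gb(n_params, bytes_per_param)


if __name__ == "__main__":
    # Assumed example: 70B parameters at 2 bytes each.
    print(weight_footprint_gb(70e9, 2), "GB of weights")
    print(wafers_needed(70e9, 2), "wafers to hold them")
    # Hypothetical 1400 GB/s memory system, for comparison only.
    print(max_decode_tokens_per_s(70e9, 2, 1400), "tokens/s upper bound")
```

	Under those assumptions the weights come to 140 GB, which is more than three times what one wafer holds, so capacity becomes the limit well before bandwidth does.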