HN Mirror

	by exabrial 117 days ago
	I feel like we need an entirely new type of silicon for LLMs: something focused entirely on bandwidth and storage, probably at the expense of raw computation power.

1 comment

Something like this? (Llama 3.1-8B etched into custom silicon, delivering 16,000 tok/s without using much PCIe bandwidth):

Wowsa, that’s amazing! Exactly what I was imagining. Doing that with 2,500 watts is incredible.