| HN Mirror



	by rodoxcasta 1055 days ago
	For inference, at least locally, the bottleneck is usually memory bandwidth (and capacity, of course). I hope the AI hype leads to more memory and more memory bandwidth, because both have been lagging behind the growth in compute power for something like 15 years already.
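	A rough back-of-the-envelope sketch of why bandwidth tends to be the limit: if generating each token requires reading every weight from memory once, then tokens per second cannot exceed bandwidth divided by model size. The function name and the numbers below (7B parameters, 4-bit weights, 100 GB/s) are hypothetical, chosen only for illustration, and the estimate ignores caches, KV-cache traffic, and batching.

```python
def max_tokens_per_second(params_billion, bytes_per_param, bandwidth_gb_s):
    """Upper bound on decode speed, assuming every weight is read once per token."""
    # 1e9 params * bytes per param / 1e9 bytes per GB = model size in GB
    model_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_gb


# Hypothetical example: 7B parameters, 4-bit quantization (0.5 bytes/param),
# 100 GB/s of memory bandwidth.
print(max_tokens_per_second(7, 0.5, 100))  # roughly 28.6 tokens/s at best
```

	Under these assumptions, doubling the bandwidth doubles the ceiling, while extra compute does not move it at all.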

1 comment

sbrother 1055 days ago

Oh, 100%. But you can do some pretty amazing things with fine-tuning LLMs too, and that is very compute intensive. Not to mention it's ridiculously hard to even get access to a cloud GPU instance nowadays.
