| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by adastra22 189 days ago
	Again, memory bandwidth is pretty much all that matters here. During inference or training the CUDA cores of retail GPUs are like 15% utilized.

1 comments

Not for prompt processing. Current Macs are really not great at long contexts