| HN Mirror



	by danaris 510 days ago
	This assumes no (or very small) diminishing returns effect. I don't pretend to know much about the minutiae of LLM training, but it wouldn't surprise me at all if throwing massively more GPUs at this particular training paradigm only produces marginal increases in output quality.
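	To make the diminishing-returns worry concrete, here is a toy sketch, not a claim about any real model. It assumes loss follows a power law in compute; the function names, the constant `a`, and the exponent `alpha = 0.05` are all hypothetical values picked for illustration, not measured ones.

```python
# Toy model only: assumes loss = a * compute ** -alpha.
# The constants a and alpha are hypothetical, chosen for illustration.

def loss(compute, a=1.0, alpha=0.05):
    """Loss under the assumed power law, for a given compute budget."""
    return a * compute ** -alpha


def relative_gain(factor, alpha=0.05):
    """Fractional loss reduction from multiplying compute by `factor`."""
    return 1.0 - factor ** -alpha


if __name__ == "__main__":
    for factor in (2, 10, 100):
        print(f"{factor:>4}x compute -> {relative_gain(factor):.1%} lower loss")
```

	Under these assumed constants each further 10x of compute buys a smaller absolute improvement than the previous one. Whether real training runs behave this way depends on the actual exponent, which this sketch does not estimate.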

1 comment

tomrod 510 days ago

I believe the margin left to expand is CoT (chain-of-thought), where token counts can grow dramatically. If putting more compute towards it has value, there may still be returns to capture on that margin.
