HN Mirror

Hacker News


	by jph00 303 days ago
	That's not widely true. E.g., the GPT-4 tech report pointed out that nearly all of their experiments were done on models 1000x smaller than the final model.

1 comment

Fair point, though I’d argue there’s an inherent selection bias here toward improvements that fit a scaling-law curve in the small-model regime.