Hacker News new | ask | show | jobs
by anon373839 459 days ago
Yeah. Scaling up pretraining and huge models appears to be done. But I think we're still advancing the frontier in the other direction -- i.e., how much capability and knowledge can we cram into smaller and smaller models?