Hacker News
by gcr 636 days ago
Another corollary is that AI companies don’t train one model at a time. A typical engineer will have maybe 5-10 models training at once. Large hyperparameter grid searches might run hundreds or thousands. Most of these turn out to be duds. Only one model gets released, and that one’s energy efficiency is what’s reported, so the energy spent on all the discarded runs never shows up in that number.
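
To put rough arithmetic on that, here is a minimal sketch. The function names and the per-run energy figures are made up for illustration (not measurements from any real lab), and it assumes you could actually log the energy of every run, including the duds.

```python
def total_training_energy(run_energies_kwh):
    """Sum the energy of every training run, duds included."""
    return sum(run_energies_kwh)


def hidden_overhead_factor(run_energies_kwh, released_index):
    """How many times larger the real total is than the one reported run."""
    released = run_energies_kwh[released_index]
    return total_training_energy(run_energies_kwh) / released


# Hypothetical: one engineer with 8 concurrent runs of 100 kWh each,
# of which only run 0 is ever released and reported.
runs = [100] * 8
print(total_training_energy(runs))      # energy actually spent
print(hidden_overhead_factor(runs, 0))  # multiple of the reported figure
```

With those invented numbers the real cost is 8x the reported one. A grid search with hundreds of runs would push the multiple far higher, though runs that get killed early would pull it back down.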