Hacker News
by RandomBK 1177 days ago
Plus there are bound to be false starts, reverts, crashes, etc. that push the actual reproduction cost above the estimate. Most training cost estimates take an extremely rosy best-case view, assuming everything goes smoothly on the first try and no GPU cycles are wasted.