by convolvatron 2 hours ago
or, we could just wait a hot second, get GPUs and associated hardware over the 30% utilization mark, develop a fault tolerance strategy that recovers more useful work, and spend a bit more time researching how models actually converge. a 50% savings on training time would mean even more energy savings because of the add-on effects of cooling.

this spending of billions just to get a 4-month lead, without even trying to invest in getting this stuff to run properly, is wasteful to the point of insanity. I don't think it's at all productive to chide people for not wanting to dump their resources into a black hole.

it seems pretty clear that the investors and the AI companies _like_ to throw around big GW numbers. it gives them a moat, and it fuels the bubble.