by convolvatron
2 hours ago
or, we could just wait a hot second, get GPUs and the associated hardware over the 30% utilization mark, develop a fault tolerance strategy that recovers more useful work, and spend a bit more time researching how models actually converge. a 50% savings on training time would mean even more energy saved, because of the add-on effects of cooling. spending billions just to get a 4-month lead, without even trying to invest in getting this stuff to run properly, is wasteful to the point of insanity. I don't think it's at all productive to chide people for not wanting to dump their resources into a black hole. it seems pretty clear that the investors and the AI companies _like_ to throw around big GW numbers. it gives them a moat, and it fuels the bubble.