by clauderoux 3 days ago
I have been a pretty consistent user of AI since 2022 (InstructGPT), so I don't have a bad opinion about the topic. However, I think the real problem has now become pretty obvious. We are hitting a reality wall, where we simply don't have enough resources to feed the AI industry. We don't generate enough electrical power, and we don't produce enough GPUs or TPUs. For the first time in computer science, the real issue is the finitude of the physical world. Unless we start mining asteroids, we are already facing a shortage of raw materials and industrial output. In my opinion, the only way to go is small models running on regular hardware.
2 comments

Aren't small local models worse efficiency-wise? It means that every person must have a machine powerful enough to run a small model, and we are very, very far away from that.

The best solution, from an efficiency point of view, is to run smaller models in datacenters, which would then require far fewer datacenters.

There's an efficiency sweet spot where hardware that people have anyway gets a higher percentage of load.

MacBooks have a lot of memory and a lot of FLOPs. They mostly sit unused all day. Yes, the excess energy use will be higher than that of a GPU in a datacenter doing the same work, but you have to generate an absurd number of tokens before the datacenter's dollar-efficiency catches up with the MacBook.
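
To put the break-even in concrete terms, here is a minimal back-of-the-envelope sketch. It assumes the laptop is already paid for, so the only thing the datacenter GPU has going for it is lower energy per token, and that saving has to repay the GPU's purchase price. The function name, the parameters, and every number in the example call are made-up placeholders, not measurements.

```python
J_PER_KWH = 3.6e6  # joules in one kilowatt-hour


def break_even_tokens(hardware_cost_usd, local_j_per_token,
                      dc_j_per_token, usd_per_kwh):
    """Tokens needed before the datacenter GPU's energy savings
    repay its purchase cost (hypothetical model, energy cost only)."""
    saved_j = local_j_per_token - dc_j_per_token
    if saved_j <= 0:
        raise ValueError("datacenter must use less energy per token")
    # Convert the per-token energy saving into dollars.
    saved_usd_per_token = saved_j / J_PER_KWH * usd_per_kwh
    return hardware_cost_usd / saved_usd_per_token


if __name__ == "__main__":
    # All inputs below are illustrative placeholders.
    tokens = break_even_tokens(
        hardware_cost_usd=10_000,   # assumed GPU purchase price
        local_j_per_token=2.0,      # assumed laptop energy per token
        dc_j_per_token=0.5,         # assumed datacenter energy per token
        usd_per_kwh=0.15,           # assumed electricity price
    )
    print(f"break-even after about {tokens:.2e} tokens")
```

Plug in your own numbers; the point is only that a small per-token energy saving divided into a large hardware cost gives a very large token count.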

You need to have a 3k dollar machine available, though. I think you are overestimating how many people have access to one.
The future is clearly that: model inference running on consumer hardware, not in a networked datacenter. We are getting there. Local models keep getting better, and it's only a matter of time. Of course, this would be bad news for big AI companies (and whoever invested in them).