by nickff 4 days ago
If you’re happy to use the current one forever, then yes. I was amending my comment above to address this when you posted yours.

I think for many practical purposes the frontier open weight models are almost universally good enough. There may be greater and greater frontiers, but at a certain point it becomes like IQ. Having a 150 IQ doesn’t mean you’ll be more successful at any particular task than someone with a 125 IQ. Indeed, there’s a diminishing return on intelligence for many utility functions, where being more intelligent yields the same or worse ultimate outcomes. It might very well be that the person with a 150 IQ could understand some extraordinarily complex and esoteric concepts faster, but that doesn’t mean the 125 IQ person can’t get there with more effort; and sometimes that extra time spent yields better outcomes overall.

I suspect AI will be somewhat similar: even if the linear scaling laws continue to hold, the practical utility of a model flattens for almost all conceivable use cases.

In some ways I feel this has already begun to happen. The marginal utility of opus class models and fable has, in my perception, begun to flatten. While I can tell the differences, they aren’t earth shattering. I could continue to use the present models for the rest of my life and be ludicrously more productive simply by adapting within their constraints through ever more sophisticated applications.

What holds back the open weights, IMO, is hardware scaling and industrial production. The enormous transfer of wealth in debt and equity markets to semiconductor and adjacent companies, the corresponding capital investments, and the eventual bubble pop leading to overcapacity and market flooding, together with advances in technology, math, techniques, and efficiencies, will make very large open weight models more directly attainable. This will also lead to chimera models that combine very large models MoE-style to get very close to the 1-2T parameter dense models, at which point I suspect utility for almost all uses is nearly fully saturated.

There will be areas where more capable models are needed, but they will be frontier models on frontier problems. This, IMO, is inevitable without some criminalization of weights (see the attempts to criminalize encryption algorithms in the 20th century and all the wonderful T-shirts that emerged). It’ll be harder to print a trillion parameter model on a shirt, but I’m sure someone will try, just as governments will try to keep us in our boxes slaving for food coupons and basic rights like health care.