by zer00eyz
244 days ago
If we make some massive physics breakthrough tomorrow, is an LLM going to be able to fully integrate that into its current data set? Or will we need to produce a host of documents and (re)train a new one in order for the concept to be deeply integrated? This distinction is subtle but lost on many who think that our current path will get us to AGI... That isn't to say we haven't created a meaningful tool, but the sooner we get candid and realistic about what it is and how it works, the sooner we can get down to the business of building practical applications with it. (And, as an aside, scaling it, something we aren't doing well with now.)
The reason the models don't learn continuously is that it's currently prohibitively expensive. Imagine OpenAI retraining a model each time one of its 800m users sends a message. That'd make it instantly aware of every new development in the world or in your life without any context engineering. There's a research gap here too, but that'll be fixed with time and money.
But it's not a fundamental limitation of transformers, as you make it out to be. To me it's just that things take time. The exact same architecture will be continuously learning in 2-3 years, and all the "This is the wrong path" people will need to shift goalposts. Note that I didn't argue for AGI, just that this isn't a fundamental limitation.