Hacker News
by rrsp 955 days ago
We’re 5+ years on from the transformer and we’re still using it for the most cutting-edge LLMs. I don’t see what difference another 5 years is going to make unless someone invents something new that can surpass the transformer. Given the amount of money and resources that have been put into AI since 2017, and the lack of innovation since (in terms of fundamental architecture, not things like LoRA and RoPE), I’d say the chances are way, way lower than 50%.

Who says you need anything other than the transformer? We've clearly not squeezed all the capability out of it.
I don’t think it’s so clear. The transformer has been available for 6 years. If it were possible to train one to achieve AGI, then what has stopped anyone from doing it that won’t still be stopping them in 5 years’ time, given that there are potentially trillions on the table for anyone who does?
I don't understand what you're saying. People have been training up transformers with the goal of "achieving AGI". Transformers have been getting better as they've been trained up. Nobody has stopped doing this.
But they haven't achieved AGI, not even close. An LLM can't distinguish between truth and nonsense. It is essentially outputting nonsense all the time, nonsense that has been massaged by training to approximate truth through the proxy of likely-next-word.
What I’m saying is if it is possible to train transformers to achieve AGI, then why hasn’t it happened yet? What’s the limitation that will be overcome in the next 5 years?
Because training takes time (months), money and hardware. It's not like this is some instantaneous process.

Nobody knows the "magic number" of model size and data needed to reach "AGI", so people keep training increasingly large models.

Bigger models are in the process of being trained. They will continue to be until they no longer get better.