Hacker News
by goethes_kind 591 days ago
With all the capital behind these LLM companies, I would be surprised if we don't see any architectural improvements that lead to better reasoning. Training bigger and bigger models is clearly not sustainable, so I'm sure all of them are working in this direction. Correct me if I'm wrong, but this is the first time that AI has so much financial capital behind it.

You could have poured billions into developing space flight in the 1800s, but it doesn't mean you'd get a working spaceship once the funds ran out.

I suspect we're in a similar situation with AGI.

We got this far with transformers plus huge amounts of compute and data. If we can find an architecture that extracts even a slightly higher order of reasoning, it will have a cascading effect once we apply the same level of compute and data. I see lots of potential for progressive improvement in this direction. The problem is that it's quite expensive to develop and test new architectures, but that's where the financial capital comes in.
You can think of each of those as a bottleneck. The architecture (LLMs, transformers) was once the bottleneck, as was the amount of compute. From what I know, the new bottleneck is the amount of quality data. There was actually a breakthrough there too, because GPTs don't need labeled data for pretraining: they learn by predicting the next token in raw text.
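To make the "no supervised training" point concrete, here is a rough sketch of how raw text can supply its own training targets. This is only an illustration, not how any real GPT pipeline is written: the function name `next_token_pairs`, the whitespace tokenization, and the `context_size` parameter are all my own assumptions (real models use subword tokenizers and far longer contexts).

```python
def next_token_pairs(text, context_size=3):
    """Build (context, target) pairs from raw, unlabeled text.

    Hypothetical helper for illustration: each target is simply the
    next word in the text, so no human-written labels are needed.
    """
    tokens = text.split()  # assumption: naive whitespace tokenization
    pairs = []
    for i in range(1, len(tokens)):
        # Use up to `context_size` preceding tokens as the input.
        context = tokens[max(0, i - context_size):i]
        pairs.append((context, tokens[i]))
    return pairs


pairs = next_token_pairs("the cat sat on the mat")
for context, target in pairs:
    print(context, "->", target)
```

Every sentence yields training examples for free, which is why the quantity of good text, rather than the cost of labeling it, becomes the limit.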
The issue with AI is that most human tampering with the algorithms makes the end result worse, not better. It is really hard to get even that tiny improvement.