Hacker News
by goethes_kind 584 days ago
We got this far with transformers and huge amounts of compute and data. If we can get an architecture that is able to extract a slightly higher order of reasoning, it will have a cascading effect when we apply the same level of compute and data. I see lots of potential for progressive improvement in this direction. Problem is, it's quite expensive to develop and test new architectures, but that's where the financial capital comes in.
2 comments

You can think of each of those as a bottleneck. The architecture (LLMs, transformers) was once the bottleneck, as was the amount of compute. From what I know, the new bottleneck is the amount of quality data. Actually, there was a breakthrough there too, because GPTs are pretrained on unlabeled text and don't need supervised training.
The issue with AI is that most human tampering with the algorithms makes the end result worse, not better. It is really hard to get that tiny improvement.