Hacker News
by Ldorigo 338 days ago
Data might be the limiting factor for current transformer architectures, but there's no reason to believe it's a general limiting factor for every language model (e.g. human brains are "trained" on orders of magnitude less data and still generally perform better than any model available today).
1 comment

That depends on whether current learning models can really generalise, or whether they can only interpolate within their training set.