by gmt2027
941 days ago
We have an algorithm and computational hardware that can tune a universal function approximator to fit any dataset, with intelligence emerging as it discovers abstractions, patterns, features and hierarchies. So far, we have not found hard limits that cannot be overcome by scaling the number of model parameters, increasing the size and quality of the training data or, very infrequently, adopting a new architecture.

The number of model parameters required to achieve a defined level of intelligence is a function of the architecture and the training data. The important question is: what is N, the number of model parameters at which we cross an intelligence threshold and it becomes theoretically possible to solve mathematics problems at a research level, given an optimal architecture that we may not yet have discovered? Our understanding does not extend to the level where we can predict N, but I doubt that anyone still believes it is infinite after seeing what GPT4 can do. The claim here is essentially a discovery that N may be much closer to where we are with today's largest models. Researchers at the absolute frontier are more likely to be able to gauge how close they are to a breakthrough of that magnitude from how quickly they are blowing past less impressive milestones like grade school math.

My intuition is that we are in a suboptimal part of the search space and that it is theoretically possible to achieve GPT4-level intelligence with a model that is orders of magnitude smaller. This could happen when we figure out how to separate the reasoning from the factual knowledge encoded in the model.