by CuriouslyC 742 days ago
Yes and no. We don't need an insane amount of data to make these models accurate: if you have a small data set that includes the benchmark questions, they'll be "quite accurate" under examination.

The problem is not the amount of data, it's the quality of the data, full stop. Beyond that, there's something called the "No Free Lunch Theorem" that says that a fixed parameter model can't be good at everything, so trying to make a model smarter at one thing is going to make it dumber at another thing.

We'd be much better off training smaller models for specific domains and training an agent that can use tools, DeepMind-style.

1 comment

> The problem is not the amount of data, it's the quality of the data, full stop. Beyond that, there's something called the "No Free Lunch Theorem" that says that a fixed parameter model can't be good at everything, so trying to make a model smarter at one thing is going to make it dumber at another thing.

My understanding is NFL only applies if the target function is chosen from a uniform distribution of all possible functions — i.e. the "everything" that NFL says you can't predict is more like "given this sequence from a PRNG (but we're not telling you which PRNG), infer the seed and the function" and less like "learn all the things a human could learn if only they had the time".
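
To make that concrete, here's a toy sketch (my own setup, purely illustrative: the 8-point domain, the train/test split, and the three learners are all assumptions I picked for the example). It enumerates every boolean function on 8 inputs, which is the "uniform over all possible functions" case, and compares off-training-set accuracy with a structured subset of targets.

```python
from itertools import product

N_POINTS = 8             # all 3-bit inputs, indexed 0..7
TRAIN = [0, 1, 2, 3, 4]  # points the learner gets to see
TEST = [5, 6, 7]         # held-out points used for scoring

# Each learner takes the training labels and returns a predict(x) function.
def majority(train_labels):
    guess = int(2 * sum(train_labels) > len(train_labels))
    return lambda x: guess

def always_zero(train_labels):
    return lambda x: 0

def anti_majority(train_labels):
    guess = 1 - int(2 * sum(train_labels) > len(train_labels))
    return lambda x: guess

def off_training_accuracy(learner, targets):
    """Accuracy on held-out points, averaged over every target in `targets`."""
    correct = total = 0
    for f in targets:
        predict = learner([f[i] for i in TRAIN])
        for i in TEST:
            correct += int(predict(i) == f[i])
            total += 1
    return correct / total

# Every possible truth table on 8 points: 2**8 = 256 target functions.
all_functions = list(product([0, 1], repeat=N_POINTS))
# A structured subset: only the two constant functions.
constant_functions = [f for f in all_functions if len(set(f)) == 1]

for learner in (majority, always_zero, anti_majority):
    print(
        learner.__name__,
        off_training_accuracy(learner, all_functions),
        off_training_accuracy(learner, constant_functions),
    )
```

Averaged over all 256 targets, every learner lands on exactly 0.5, including the deliberately perverse one. Restrict the targets to something with structure and the learners separate immediately. That's the sense in which NFL is about the uniform-over-everything case rather than the kinds of problems people actually care about.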