by ulnarkressty 982 days ago
Most of the improvements apparently come from training larger models with more data, which is part of the problem mentioned in the article: the probability that the model just memorizes the answers to the tests is greatly increased.

AI is getting subjectively better, and we need better tests to figure out if this improvement is objectively significant or not.

4 comments

> Most of the improvements apparently come from training larger models with more data.

OpenAI is reportedly losing 4 cents per query. With a thousandfold increase in model size, and assuming cost scales linearly, that's a problem. Training time is going to go up too. Moore's law isn't going to help anymore. Algorithmic improvements may help...if any significant ones can be found.
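For a rough sense of scale, here is the arithmetic as a minimal sketch. It treats the reported 4-cent loss as a stand-in for per-query cost (an assumption; the real cost is not given here) and applies the linear-scaling assumption from above. The function name is made up for illustration.

```python
def projected_cost_per_query(base_cost_usd: float, size_multiplier: float) -> float:
    """Projected per-query cost, assuming cost grows linearly with model size."""
    return base_cost_usd * size_multiplier


# Assumption: the reported 4-cent loss is used as a proxy for per-query cost.
base_cost = 0.04
# Hypothetical thousandfold increase in model size.
scaled_cost = projected_cost_per_query(base_cost, 1000)

print(f"Projected cost per query: ${scaled_cost:.2f}")
```

Under those assumptions a query goes from cents to tens of dollars, which is why linear scaling alone looks untenable.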

That’s backwards.

Training a model on more data improves generalization not memorization.

To store more information in the same number of parameters requires the commonality between examples to be encoded.

In contrast, training on less data, especially if it is repeated, lets the network learn to give good answers for that limited set without generalizing, i.e. memorizing.
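A toy sketch of that effect, not a claim about how large models behave: a cubic polynomial (4 parameters) is fit to noisy samples of a straight line. Everything here is an assumption chosen for illustration, including the target function, the deterministic alternating "noise" of plus or minus 0.5, and the helper names.

```python
def fit_poly(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (pure Python)."""
    n = degree + 1
    # Normal equations: (A^T A) c = A^T y, with A[i][j] = xs[i] ** j.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    m = [row + [rhs] for row, rhs in zip(ata, aty)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = s / m[i][i]
    return coeffs


def predict(coeffs, x):
    return sum(c * x ** k for k, c in enumerate(coeffs))


def true_fn(x):
    # Assumed underlying pattern the model should recover.
    return 2.0 * x + 1.0


def make_data(n):
    # Evenly spaced inputs on [-1, 1]; "noise" alternates +0.5 / -0.5.
    xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
    ys = [true_fn(x) + (0.5 if i % 2 == 0 else -0.5) for i, x in enumerate(xs)]
    return xs, ys


def mse(coeffs, xs, ys):
    return sum((predict(coeffs, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def experiment(n_train, degree=3):
    xs, ys = make_data(n_train)
    coeffs = fit_poly(xs, ys, degree)
    # Test against the noise-free pattern on a separate grid.
    test_xs = [-1.0 + 2.0 * i / 100 for i in range(101)]
    test_ys = [true_fn(x) for x in test_xs]
    return mse(coeffs, xs, ys), mse(coeffs, test_xs, test_ys)


few_train, few_test = experiment(4)      # as many points as parameters
many_train, many_test = experiment(200)  # far more points than parameters

print(f"4 points:   train MSE={few_train:.2e}, test MSE={few_test:.2e}")
print(f"200 points: train MSE={many_train:.2e}, test MSE={many_test:.2e}")
```

With 4 points the cubic passes through every noisy sample, so training error is near zero while error against the true line is large. With 200 points it can no longer fit the noise, so training error goes up and error against the true line drops sharply.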

——

It’s the same as with people. The more variations people see of something, the more likely they intuit the underlying pattern.

The fewer examples, the more likely they just pattern match.

> It’s the same as with people. The more variations people see of something, the more likely they intuit the underlying pattern.

> The fewer examples, the more likely they just pattern match.

A kid who uses a calculator and just fills in the answer to every question will see a lot more examples than a kid that learned by starting from simple concepts and understanding each step. But the kid who focused on learning concepts and saw way fewer problems will obviously have a better understanding here.

So no, you are clearly wrong here, humans don't learn that way at all. These models learn that way, you are right on that, but humans don't.

I have no idea where your calculator came from.

In neither case did I introduce one.

And since the calculator itself already has a general understanding, it would seem completely counterproductive to start training a computer or child by first giving them a machine that has already solved the problem.

Also, for what it’s worth, I am speaking from many years experience not just training models but creating the algorithms that train them.

Replace "uses calculator" with "looks through solved problems", same thing. Not sure what you don't understand. Humans don't build understanding by seeing a lot of solved examples.

To make a human understand we need to explain how things work to them. You don't just show examples. A human who is just shown a lot of examples won't understand much at all, even if he tries to replicate them.

> Also, for what it’s worth, I am speaking from many years experience not just training models but creating the algorithms that train them.

What does this have to do with how humans learn?

Humans learn vast amounts of information from examples.

They learn their first words, how to walk, what a cat looks like from many perspectives, how to parse a visual scene, how to parse the spoken word, interpret facial expressions and body language, how different objects move, how different creatures behave, different materials feel, what things cause pain, what things taste like and how they make them feel, how to get what they want, how to climb, how not to fall, all by trial & example. On and on.

And yes, as we get older we get better and better at learning 2nd hand from others verbally, and when people have the time to show us something, or with tools other people already invented.

Like how a post-trained model picks up on something when we explain it via a prompt.

But that is not the kind of training being done by models at this stage. And yet they are learning concepts (pre-prompt) that, as you point out, you & I had to have explained to us.

> Like how a model picks up on when we explain something to it after it has been trained.

Models don't learn by you telling them something, the model doesn't update itself. A human updates their model when you explain how something works to them, that is the main way we teach humans. Models don't update themselves when we explain how something works to them, that isn't how we train these models, so the model isn't learning, it's just evaluating. It would be great if we could train models that way, but we can't.

> Humans learn vast amounts of information from examples.

Yes, but to understand things in school those examples come with an explanation of what happens. That explanation is critical.

For example, a human can learn to perform legal chess moves in minutes. You tell them the rules each piece has to follow and then they will make legal moves in almost every case. You don't do it by showing them millions of chess boards and moves, all you have to do is explain the rules and the human then knows how to play chess. We can't teach AI models that way, which makes human learning and machine learning fundamentally different still.

And you can see how teaching rules creates a more robust understanding than just showing millions of examples.

I pretty much want the LLM to be great at memorizing things. That's what I'm not great at.

If it had perfect recall I would be so thrilled.

And just because it's memorized the data--as all intelligences would need to do to spit data out--doesn't mean it can't still do useful operations on the data, or explain it in different words, or whatever a human might do with it.

Do we? I use gpt-4 daily and it matters not to me what the source of the "intelligence" is. It's subjective what "intelligence" even means. It's subjective how the brain works. Almost by definition AI is "things that can't be objectively measured".