| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by alister 531 days ago

> short stories generated by GPT-3.5 and GPT-4 to train LMs that are smaller

The loop of development is fascinating:

Millions of humans write literature, Wikipedia, etc.

Large language models are trained on that body of work.

Now large language models generate training data for small language models.

What's the next iteration? A talking Buzz Lightyear toy with one of those small language models that'll teach (human) infants to talk?

3 comments

ocean_moist 531 days ago

This is actually a common pattern called "model distilling".[0]

[0] https://platform.openai.com/docs/guides/distillation

link

nickpsecurity 531 days ago

I thought that, too. It wasn’t really true, though.

Some papers pointed out that the models start failing after being trained with too much synthetic data. They also need tons of random, Internet data in the first place. Humans don’t have those failure modes. The AI’s also got smarter the more data we produced.

So, there’s some critical differences between what we’re doing and what they’re doing that keep it from being a neat flow like that. What many humans do in training other humans fits that, though.

link

visarga 531 days ago

> A talking Buzz Lightyear toy with one of those small language models that'll teach (human) infants to talk?

Great idea. I was thinking more like a plushie toy with sensors, it would react to touch, sight and speech. I would run the models locally from a computer, keep the toy just lightweight I/O.

link