by visarga 2138 days ago
> Most credible machine learning systems work well on unseen data, which by definition isn't memorizing.

Sorry, but no. ML models don't generalise well outside the training data, but they can interpolate inside it. This question becomes very interesting in the case of GPT-3, which has had a huge corpus of text to train on, so it has probably seen 'everything'. GPT-3 is still memorising, but it's also learning to manipulate data, the way software algorithms do.

1 comment

> ML models don't generalise well outside the training data, but they can interpolate inside.

I'm unsure if you just misstated this or don't know, but this is wrong.

ML models don't generalise well to data outside the distribution of their training data. But that's an entirely different thing, and it doesn't at all mean they are memorising data.

Imagine a model trained on the US unemployment rate up to 2020 being hit with the COVID-era rate. It wouldn't know what to do, but that doesn't mean it wouldn't work fine on a rate of 5.342%, even if it had never seen that exact rate before.

This is a simplified example, but the same point applies to everything.
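To make the unemployment-rate example concrete, here's a rough toy sketch. Everything in it is made up for illustration: the target() relationship, the 3.0 to 10.0 training range, the 14.7 out-of-range value and the choice of a 2-nearest-neighbour regressor are all assumptions, not anything from a real system. It just contrasts a pure lookup table (actual memorisation) with a simple model on a never-seen in-range value and on an out-of-range one.

```python
# Toy illustration only; the target function and all names are hypothetical.

def target(rate):
    """Made-up nonlinear relationship the model is supposed to learn."""
    return 0.5 * rate ** 2

# Training inputs: rates from 3.0 to 10.0 in steps of 0.1.
train_x = [round(3.0 + 0.1 * i, 1) for i in range(71)]
train_y = [target(x) for x in train_x]

# Pure memorisation: an exact lookup table of the training pairs.
lookup = dict(zip(train_x, train_y))

def memorised(rate):
    """Return the stored answer, or None if this exact input was never seen."""
    return lookup.get(rate)

def knn_predict(rate, k=2):
    """Average the targets of the k training points closest to `rate`."""
    pairs = sorted(zip(train_x, train_y), key=lambda p: abs(p[0] - rate))
    return sum(y for _, y in pairs[:k]) / k

def abs_error(rate):
    """Absolute error of the kNN prediction against the toy target."""
    return abs(knn_predict(rate) - target(rate))

in_dist = 5.342   # never seen, but inside the training range
out_dist = 14.7   # far outside anything in the training range

print("lookup, in-distribution:", memorised(in_dist))
print("kNN error, in-distribution:", abs_error(in_dist))
print("kNN error, out-of-distribution:", abs_error(out_dist))
```

The lookup table has no answer for 5.342, while the model's error there is small. The model's error at 14.7 is large, but that is a failure on out-of-distribution input, not a sign that it was memorising.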

GPT-3's text generation does pull from memorised training data. There's a lot of stuff going on there, and amongst other things there has never really been a system that does textual generation well. It's also hugely overparameterised, so there's a lot of potential for overfitting. I don't think it's a good example of a "good" AI system - it's very interesting and full of potential, but there are lots of issues.