Hacker News
by adsfgiodsnrio 1101 days ago
"Supervised" does not mean the models need babysitting; it refers to the fundamental way the systems learn. Our most successful machine learning models all require some answers to be provided to them in order to infer the rules. Without being given explicit feedback they can't learn anything at all.

Humans also do best with supervised learning. This is why we have schools. But humans are capable of unsupervised learning and use it all the time. A human can learn patterns even in completely unstructured information. A human is also able to create their own feedback by testing their beliefs against the world.


Oh, sorry, I’m not that familiar with the terminology. (I still feel like my argument is valid even though I’m not an expert, because I heard all this from people who know a lot more about it than I do.) One problem with that kind of feedback is that it incentivizes the AI to make us think it solved the problem when it didn’t, for example by hallucinating convincing information. That means it specifically learns how to lie to us, so it doesn’t really help.

Also, I guess giving feedback is sort of like babysitting, but I did interpret it the wrong way.

> One problem with that kind of feedback is that it incentivizes the AI to make us think it solved the problem when it didn’t,

Supervised learning is: "here's the task" … "here's the expected solution" *adjusts model parameters to bring it closer to the expected solution*.
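To make that loop concrete, here is a minimal sketch in plain Python. The toy data (points on y = 2x + 1), the linear model, the learning rate, and the step count are all assumptions picked for illustration, not anything from the thread. The "task" is x, the "expected solution" is y, and each step nudges the parameters to shrink the gap between the model's answer and the expected one.

```python
# Toy supervised learning: fit y = w * x + b from labeled (task, answer) pairs.
# Data, learning rate, and step count are illustrative assumptions.

def train(examples, lr=0.05, steps=2000):
    w, b = 0.0, 0.0  # model parameters, starting with no knowledge
    n = len(examples)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in examples:
            # "here's the task" (x) ... "here's the expected solution" (y)
            error = (w * x + b) - y
            # gradient of the mean squared error with respect to w and b
            grad_w += 2 * error * x / n
            grad_b += 2 * error / n
        # adjust parameters to bring predictions closer to the expected answers
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Labeled examples generated from y = 2x + 1 (hypothetical data)
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(examples)
print(f"learned w={w:.2f}, b={b:.2f}")
```

Note that the model is only ever pulled toward the provided answers. Nothing in this loop rewards it for convincing anyone of anything.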

What you're describing is specification gaming, which only occurs in a different kind of AI system: https://vkrakovna.wordpress.com/2018/04/02/specification-gam... In theory, it could occur with feedback-based fine-tuning, but I doubt it'd result in anything impressive happening.

Oh, that seems less problematic (though not completely free of problems), but also less powerful, because it can’t really exceed human performance.