| HN Mirror



	by adsfgiodsnrio 1147 days ago
	"Supervised" does not mean the models need babysitting; it refers to the fundamental way the systems learn. Our most successful machine learning models all require some answers to be provided to them in order to infer the rules. Without being given explicit feedback they can't learn anything at all. Humans also do best with supervised learning. This is why we have schools. But humans are capable of unsupervised learning and use it all the time. A human can learn patterns even in completely unstructured information. A human is also able to create their own feedback by testing their beliefs against the world.


circuit10 1147 days ago

Oh, sorry, I’m not that familiar with the terminology (I still feel like my argument is valid despite me not being an expert, though, because I heard all this from people who know a lot more about it than I do). One problem with that kind of feedback is that it incentivizes the AI to make us think it solved the problem when it didn’t, for example by hallucinating convincing information. That means it specifically learns how to lie to us, so it doesn’t really help.

Also, I guess giving feedback is sort of like babysitting, but I did interpret it the wrong way.


wizzwizz4 1147 days ago

> One problem with that kind of feedback is that it incentivizes the AI to make us think it solved the problem when it didn’t,

Supervised learning is: "here's the task" … "here's the expected solution" *adjusts model parameters to bring it closer to the expected solution*.
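To make that loop concrete, here's a minimal sketch in plain Python. Everything in it is made up for illustration (the `train_linear` name, the toy data, the learning rate); it's just a one-variable linear model fit by gradient descent, not how any particular system is trained. The "task" is an input `x`, the "expected solution" is the label `y`, and each step nudges the parameters toward the label.

```python
def train_linear(examples, lr=0.05, epochs=1000):
    """Fit y ~ w * x + b by stochastic gradient descent on squared error.

    `examples` is a list of (task, expected_solution) pairs; the names and
    hyperparameters here are illustrative assumptions.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in examples:
            pred = w * x + b   # the model's current answer to the task
            err = pred - y     # how far it is from the expected solution
            w -= lr * err * x  # adjust parameters to shrink that gap
            b -= lr * err
    return w, b


# Toy labelled data generated from y = 2x + 1 (hypothetical).
examples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train_linear(examples)
print(f"w={w:.3f}, b={b:.3f}")
```

Note that the only signal the model ever gets is the distance to the provided answer, so there's nothing here it could "game": matching the labels is the whole objective.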

What you're describing is specification gaming (sometimes called reward hacking), which only occurs in a different kind of AI system: https://vkrakovna.wordpress.com/2018/04/02/specification-gam... In theory, it could occur with feedback-based fine-tuning, but I doubt it'd result in anything impressive happening.


circuit10 1147 days ago

Oh, that seems less problematic (though not completely free of problems), but also less powerful, because it can’t really exceed human performance.
