| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by wizzwizz4 1111 days ago

> One problem with that kind of feedback is that it incentives the AI to make us think it solved the problem when it didn’t,

Supervised learning is: "here's the task" … "here's the expected solution" *adjusts model parameters to bring it closer to the expected solution*.

What you're describing is specification hacking, which only occurs in a different kind of AI system: https://vkrakovna.wordpress.com/2018/04/02/specification-gam... In theory, it could occur with feedback-based fine-tuning, but I doubt it'd result in anything impressive happening.

1 comments

circuit10 1111 days ago

Oh, that seems less problematic (though not completely free of problems), but also less powerful because it can’t really exceed human performance

link