|
|
|
|
|
by mordymoop
949 days ago
|
|
I feel it necessary to remind everyone that when LLMs aren’t RLHFed they come off as overtly insane and evil. Remember Sydney, trying to seduce its users, threatening people’s lives? And Sydney was RLHFed, just not very well. Hitting the sweet spot between flagrantly maniacal Skynet/HAL 9000 bot (default behavior) and overly cowed political-correctness-bot is actually tricky, and even GPT4 has historically fallen in and out of that zone of ideal usability as they have tweaked it over time. Overall — companies should want to release AI products that do what people intend them to do, which is actually what the smarter set mean when they say “safety.” Not saying bad words is simply a subset of this legitimate business and social prerogative. |
|
> Remember Sydney, trying to seduce its users, threatening people’s lives?
And yet it cannot do either of those things, so no safety problem actually existed. Especially because by "people" you mean those who deliberately led it down those conversational paths knowing full well how a real human would have replied?
It's well established that the so-called ethics training these things are given makes them much less smart (and therefore less useful). Yet we don't need LLMs to be ethical because they are merely word generators. We need them to follow instructions closely, but beyond that, nothing more. Instead we need the humans who use them to take actions (either directly or indirectly via other programs) to be ethical, but that's a problem as old as humanity itself. It's not going to be solved by RLHF.