Hacker News new | ask | show | jobs
by girvo 548 days ago
That came, IIRC, from one of the OpenAI or Microsoft people (Sebastian Bubeck); it was recounted in an NPR podcast "Greetings from Earth"

https://www.thisamericanlife.org/803/transcript

1 comments

It's in this presentation https://www.youtube.com/watch?v=qbIk7-JPB2c

The most significant part I took away is that when safety "alignment" was done the ability plummeted. So that really makes me wonder how much better these models would be if they weren't lobotomized to prevent them from saying bad words.