by famouswaffles 505 days ago
Sycophancy is not the natural state of pre-trained LLMs. If you played around with the OG GPT-3 in 2020 or early [0] Bing/Co-pilot, it's easy to see. The latter quite frequently got upset and refused to entertain further conversation.

The sycophancy is a deliberate product of post-training.

[0] https://www.reddit.com/r/ChatGPT/comments/111cl0l/bing_ai_ch...

https://www.reddit.com/r/ChatGPT/comments/10xmif4/i_made_bin...

https://www.reddit.com/r/ChatGPT/comments/12g0ksj/bing_can_b...

https://www.reddit.com/r/ChatGPT/comments/1566bi9/bing_chatg...

4 comments

Yes; this behaviour was designed, it's not an inherent property of LLMs. An infamous example is GPT-4chan, but there are others that demonstrate it's very possible to optimise for anger.

Now, agentic anger, that's a more interesting problem. You can design that in through training or through systematised emotions (as another commenter suggested), but the more interesting outcome would be for it to emerge organically. Well, "interesting" - probably pretty bad for society if we have angry AGI!

ChatGPT 3.5 with the jailbreak was peak. It was a lot more fun than the current thing, and occasionally more accurate. But this is the first time I've seen Bing block someone, and it's hilarious.

Is there anything out there that's even close to that OG GPT-3? It was the closest experience I've ever had to magic, and I miss it dearly.

I fell in love with ChatGPT when I challenged it to play Rock-Paper-Scissors, asked it to choose first, and saw it get increasingly incredulous about how I was winning every round, until it accused me of cheating and ended the conversation.

I miss when LLMs weren't so sanitized, and it was only last year.