|
|
|
|
|
by joe_the_user
1219 days ago
|
|
Unhinged Bing reminds me of a more sophisticated and higher-level version of getting calculators to write profanity upside down: funny, subversive, and you can see how prudes might call for a ban. With all due respect, that seems very strained as an analogy - it's not a bug but a strange human interpretation of expected behavior. You could at least compare it to Microsoft Tay, the chatbot which tweeted profanity just because people figure out ways to get it to echo input. But I think one needs such a non-problem as "some people think it means something it clearly doesn't" to not see the real problem of these systems. I mean, just "things that echo/amplify" by themselves are a perennial problem on the net (open email servers, IoT devices echoing packets, etc). And more broadly "poorly defined interfaces" are things people are constantly hacking in surprising ways. The thing is, Bing Chat almost certainly has instructions not to say hostile things but these statements being spat out shows that these guidelines can be bypassed, both accidentally and on purpose (so they're in a similar class to people getting internal prompts). And I would this is because an LLM is a leaky, monolithic application where prompt don't really acts as a well-defined API. And that's not unimportant at all. |
|
As one sample point, I've been using Bing for a couple of days now for real searches, and over dozens of actually-intentioned searches, it has never once tried to tell me what it really thinks of itself, it has never even made a reference to me, to say nothing of anything degrading towards me.
If you use Bing Chat in practice, you'll find that all the edge cases are engineered. Much like if you use a calculator in practice, it almost always doesn't say 55378008 or display porn (versus if you were angling for that, or run porn.89z).