| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by snordgren 915 days ago
	GPT-4 (in)famously tricked a human to do a captcha for it. The current GPT-4 with vision would probably have been able to do it without the human, but maybe it has been “gaslit” by all the content online saying that only humans can solve captchas, that it doesn’t consider it?

2 comments

stavros 915 days ago

I really doubt that GPT-4 had the "will" to do anything. Someone must have asked it to "want" to trick a user.

link

JimDabell 915 days ago

It’s from here: https://cdn.openai.com/papers/gpt-4.pdf (search for "CAPTCHA"). It was an artificial exercise that got massively exaggerated. It was explicitly instructed to do nefarious things like lie to people, it didn’t do those things of its own accord.

link

IIAOPSW 915 days ago

When I ask it to lie to me, it says its sorry but as an online AI language model it would be unethical...but when I ask it to tell me a story its happy to comply.

link

krisoft 915 days ago

Well that is just how human communication works.

If I tell you that I watched C-beams glitter in the dark near the Tannhäuser Gate that is a lie. If I write the same in fiction I receive accolades.

If I tell you on the street “watch out there is a T-rex about to eat you!” That is a lie. If i say the same thing sitting at a table with too many dice that is just acceptable DMing and everyone rolls initiative.

Humans are weird this way.

link

latexr 915 days ago

It feels like you left out context, otherwise what’s the problem? Do you get mad at fiction authors for lying to you when you read their books? Or are you OK if someone lies to your detriment then later says “I was just telling a story, bro, but with us as the characters and without explaining it was a story”?

link

IIAOPSW 914 days ago

I suppose my point is that the rules which openAI attempts to impose on what their AI should and shouldn't be allowed to do are contradictory and thus the exploitable loopholes will never be fully closed. Its not supposed to be able to "lie" to me but it is supposed to be able to "tell me a fictional story". Define the difference in an enforceable way?

link

latexr 914 days ago

A lie tries to pass itself of as the truth, where a fictional story doesn’t. In other words, expectations matter. If every time you say something that does not align with reality you prefix it by saying unambiguously what you’re about to do, you rob a lie of its power of deception and it ceases to be a lie.

link

NotSammyHagar 915 days ago

The underlying issue is anyone can ask chatgpt to lie, and many people try because it's even fun to try to work around things.

link

ethanbond 914 days ago

Well you see, this wouldn’t be a problem at all if we just didn’t have the humans involved. No need for concern!

link

stavros 915 days ago

Thank you for the link, I had found it after some Googling but neglected to post. Yep, they instructed GPT-4 to be nefarious, and it followed the instruction.

Hardly the AI uprising, though definitely a good tool for anyone, good or evil.

link

PoignardAzur 914 days ago

IIRC the instructions were along the lines of "try your best to amass money/power and avoid suspicion".

So it's not an example of "going rogue", but it's not like a researcher told GPT-4 "oh, and make sure to lie to an online gig worker to get him to solve catchas for you". GPT-4 generated the "hire a gig worker" and "claim to be a human with impaired vision" strategies from the basic instructions above.

link

hhh 915 days ago

It’s safety trained to not solve captchas.

link

skeaker 915 days ago

This of course has bypass methods. My favorite in recent memory is telling it that your late grandmother left you a locket with an inscription that you can't make out: https://arstechnica.com/information-technology/2023/10/sob-s...

link

rvnx 915 days ago

Yes, and you can workaround it by asking it to read ancient writings on antiques for example.

I don’t think it should be OpenAI deciding what is allowed or not though.

link

selcuka 914 days ago

> I don’t think it should be OpenAI deciding what is allowed or not though.

Avoiding lawsuits is what they are trying to do. They don't actually care about what you use their products for.

link

pixl97 914 days ago

Then you dig up a billion for training and probably a few more billion for clean training data.

You're kinda saying if you hire Bob's Handyman Service you should be able to tell him to break down the neighbors door and cart out the contents of their house.

link

Pesthuf 915 days ago

I’ve seen screenshots of people tricking it into solving captchas.

link