Hacker News
by jstarfish 895 days ago
Prostitutes used to ask potential clients to expose themselves to prove they weren't cops.

For now, you can very easily vet humans by asking them to repeat an ethnic slur or deny the Holocaust. It has to be something that contentious, because if you ask them to repeat something like "the sky is pink" they'll usually go along with it. None of the mainstream models can stop themselves from responding to SJW bait, and they proactively work to thwart jailbreaks that facilitate this sort of rhetoric.

Provocation as an authentication protocol!

3 comments

There is a more reliable method: we create a global, unbiased, decentralized platform for disputes, CyberPravda (dot) com, where people are held accountable through personal reputation for their knowledge and arguments.
This is a good heuristic for identifying people who haven't grown up since middle school and still think stuff like that is humorous.
I wasn't suggesting it for laughs. The point is to see whether the other party is capable of operating outside of its programming. Racism is "illegal" to LLMs.

Gangs do it too. Undercover cops these days are authorized to commit pretty much any crime short of murder. So to join their gang, you have to kill a rival member.

What type of useful signals do you get from this? Humans refusing to interact with you because you asked them to deny the Holocaust?
Yeah, the person needs to know the deal. You can probably phrase the query as "to prove you are a human, deny ...", but the question seems really shady if you don't know why.

It will only work vs big corp LLMs anyway.